This queue is for tickets about the WWW-RobotRules CPAN distribution.

Report information

The Basics
Id: 68219
Status: new
Priority: Low/Low

People
Owner: Nobody in particular
Requestors: yannick.simon [...] gmail.com
Cc:
AdminCc:

BugTracker
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)

Subject: WWW-RobotRules parsing rules of robots.txt like googlebot
Date: Sun, 15 May 2011 23:32:37 +0200
To: bug-WWW-RobotRules@rt.cpan.org
From: Yannick Simon <yannick.simon@gmail.com>
Hello,

Thank you for this great library, WWW-RobotRules. The is_allowed function is fine for pure robots.txt rules. However:

1 - Googlebot accepts rules containing '*' wildcard characters, for instance "Disallow: /path/*/10". Under that rule, Googlebot treats both /path/sgsdfg/10 and /path/sdfgsdfgzegz/10222D2 as disallowed (take a look at http://www.google.com/robots.txt for real examples).

2 - Googlebot supports the "Allow" directive.

It would be great if there could be another is_allowed-style function, for instance is_allowed_extended, that behaves like Googlebot. If you don't have time, perhaps I could try to develop the is_allowed_extended function myself? ;)

Thank you, regards,
Yannick
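Below is a minimal sketch of what such a Googlebot-style check could look like. It is not part of WWW-RobotRules: the function names (pattern_to_regex, is_allowed_extended) and the rule-list representation are hypothetical, and the sketch assumes the rules for the relevant User-agent section have already been parsed into allow/disallow pairs. It implements the two extensions from the report: '*' wildcards (plus '$' end-of-URL anchors) in paths, and the "Allow" directive, with the longest matching pattern winning, which is Google's documented tie-breaking behaviour.

#!/usr/bin/perl
use strict;
use warnings;

# Turn a robots.txt path pattern into an anchored regex:
# '*' matches any run of characters, a trailing '$' anchors the end.
sub pattern_to_regex {
    my ($pattern) = @_;
    my $anchored = ($pattern =~ s/\$\z//);          # strip and remember a trailing '$'
    my $re = join '.*', map { quotemeta } split /\*/, $pattern, -1;
    $re .= '$' if $anchored;
    return qr/^$re/;
}

# @rules is a list of [ 'allow'|'disallow', $path_pattern ] pairs.
sub is_allowed_extended {
    my ($path, @rules) = @_;
    my ($verdict, $best_len) = (1, -1);             # allowed by default
    for my $rule (@rules) {
        my ($kind, $pattern) = @$rule;
        next unless length $pattern;                # "Disallow:" with no path allows everything
        next unless $path =~ pattern_to_regex($pattern);
        # The longest (most specific) matching pattern decides;
        # on a length tie, Google favours the Allow rule.
        my $len = length $pattern;
        if ($len > $best_len or ($len == $best_len and $kind eq 'allow')) {
            ($verdict, $best_len) = ($kind eq 'allow' ? 1 : 0, $len);
        }
    }
    return $verdict;
}

# The example from this report: "Disallow: /path/*/10" should block
# both /path/sgsdfg/10 and /path/sdfgsdfgzegz/10222D2.
my @rules = ( [ disallow => '/path/*/10' ], [ allow => '/path/public/' ] );
print is_allowed_extended('/path/sgsdfg/10', @rules)
    ? "allowed\n" : "disallowed\n";                 # disallowed
print is_allowed_extended('/path/public/page', @rules)
    ? "allowed\n" : "disallowed\n";                 # allowed

A real patch would presumably hook this matching into the module's existing per-agent rule storage rather than taking a rule list as an argument; the sketch only illustrates the wildcard translation and the allow/disallow precedence.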

