This queue is for tickets about the WWW-RobotRules CPAN distribution.

Report information
The Basics
Id: 99387
Status: new
Priority: 0/
Queue: WWW-RobotRules

People
Owner: Nobody in particular
Requestors: lindahl [...] pbm.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 6.02
Fixed in: (no value)



Subject: Additional googlebot incompatibility
blekko got flamed by webmasters until we parsed robots.txt like google does. There's already a bug 68219 (https://rt.cpan.org/Public/Bug/Display.html?id=68219) about * and Allow. The additional things we were flamed about are:

1) blank lines should be ignored. Webmasters frequently have stuff like

User-agent: googlebot

Disallow: /

And expect the disallow to be applied to googlebot and not *. Same for

User-agent: googlebot
# a comment
Disallow: /

2) Trailing $

Disallow: .mp3$

should in fact disallow /foo.mp3

I would be happy to donate our testsuite. I don't think anyone should be using a non-googlebot-compatible robots.txt parser these days. But if you want to keep a useless but standard-compliant mode around, it's easy enough to divide the tests up into the ones that obey the standard and the ones that obey the reality.
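
To make the two requests above concrete, here is a minimal Python sketch of the lenient behaviour being asked for. It is not the WWW::RobotRules code and not blekko's testsuite; the function names (parse_robots, pattern_to_regex, is_allowed) are hypothetical. It assumes that a pattern not starting with "/" or "*" is treated as if it had a leading "*" (which is how ".mp3$" can block /foo.mp3), that user-agent tokens match exactly, and that the longest matching pattern wins with Allow winning ties.

```python
import re


def parse_robots(text):
    """Map each lowercase user-agent token to a list of (kind, pattern) rules."""
    groups = {}
    current_agents = []  # agents of the group currently being built
    seen_rule = False    # True once the current group has an Allow/Disallow line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments
        if not line or ":" not in line:
            continue  # blank and comment-only lines do NOT end a group
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-agent after rules starts a new group
                current_agents = []
                seen_rule = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            if value:  # an empty value restricts nothing
                for agent in current_agents:
                    groups[agent].append((field, value))
    return groups


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern with '*' and trailing '$' to a regex."""
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    if not pattern.startswith(("/", "*")):
        pattern = "*" + pattern  # assumption: behave like an implicit leading '*'
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def is_allowed(groups, agent, path):
    """Assumed precedence: longest matching pattern wins, Allow wins ties."""
    rules = groups.get(agent.lower())
    if rules is None:
        rules = groups.get("*", [])  # fall back to the wildcard group
    best_kind, best_len = "allow", -1
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            if len(pattern) > best_len or (len(pattern) == best_len and kind == "allow"):
                best_kind, best_len = kind, len(pattern)
    return best_kind == "allow"


blank_line_case = "User-agent: googlebot\n\nDisallow: /\n"
comment_case = "User-agent: googlebot\n# a comment\nDisallow: /\n"
dollar_case = "User-agent: *\nDisallow: .mp3$\n"

print(is_allowed(parse_robots(blank_line_case), "googlebot", "/page"))
print(is_allowed(parse_robots(comment_case), "googlebot", "/page"))
print(is_allowed(parse_robots(dollar_case), "anybot", "/foo.mp3"))
```

The key design choice is that a group ends only when a new User-agent line follows at least one rule line, so blank lines and comments between User-agent and Disallow no longer detach the rule from its agent.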
Subject: Re: [rt.cpan.org #99387] AutoReply: Additional googlebot incompatibility
Date: Wed, 8 Oct 2014 14:02:09 -0700
To: Bugs in WWW-RobotRules via RT <bug-WWW-RobotRules [...] rt.cpan.org>
From: lindahl [...] pbm.com

Message body is not shown because it is too large.


