This queue is for tickets about the WWW-RobotRules CPAN distribution.

Report information

The Basics
Id: 68219
Status: new
Priority: Low/Low

People
Owner: Nobody in particular
Requestors: yannick.simon [...] gmail.com
Cc:
AdminCc:

BugTracker
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: WWW-RobotRules parsing rules of robots.txt like googlebot
Date: Sun, 15 May 2011 23:32:37 +0200
To: bug-WWW-RobotRules@rt.cpan.org
From: Yannick Simon <yannick.simon@gmail.com>

Hello,

Thank you for this great library, WWW-RobotRules.

The is_allowed function is fine for pure robots.txt rules. However:

1 - Googlebot accepts rules containing * characters, for instance:

Disallow: /path/*/10

With that rule, for Googlebot:
/path/sgsdfg/10 is disallowed
/path/sdfgsdfgzegz/10222D2 is disallowed
(take a look at http://www.google.com/robots.txt)

2 - Googlebot accepts the "Allow" directive.

It would be great if there were another "is_allowed" function, for instance is_allowed_extended, that behaves as Googlebot does.

If you don't have time, perhaps I could try to develop the "is_allowed_extended" function? ;)

Thank you,
Regards,
Yannick
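
The two Googlebot behaviours requested above (wildcard patterns and the "Allow" directive) could be sketched roughly as follows. This is only an illustration written in Python, not Perl, and it is not part of the WWW-RobotRules API. The name is_allowed_extended comes from the request; the helper pattern_to_regex and the (directive, pattern) tuple format are hypothetical. The precedence rule used here (longest matching pattern wins, "Allow" wins a tie) and the trailing "$" anchor reflect Google's published robots.txt handling as far as I understand it, so treat both as assumptions.

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex (hypothetical helper).

    '*' matches any run of characters, a trailing '$' anchors the end of
    the path, and everything else is literal. Without '$' the pattern is
    treated as a prefix, which is why /path/*/10 also matches
    /path/x/10222D2.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed_extended(rules, path):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (directive, pattern) tuples for one user agent,
    with directive being 'Allow' or 'Disallow' (assumed format).
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            # Assumed precedence: longest pattern wins, Allow wins a tie.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    rules = [("Disallow", "/path/*/10")]
    for p in ("/path/sgsdfg/10", "/path/sdfgsdfgzegz/10222D2", "/path/other"):
        print(p, is_allowed_extended(rules, p))
```

With the single rule from the request, the first two paths come out as disallowed and /path/other as allowed. A Perl version inside WWW-RobotRules would need the same two pieces: a pattern-to-regex translation and a precedence rule for combining Allow and Disallow matches.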

