Subject: WWW::RobotRules/LWP::RobotUA Does Not Respect Crawl-delay:
Hi. This is imacat from Taiwan. I was trying LWP::RobotUA, and found that WWW::RobotRules does not respect Crawl-delay:. The test script (an exact copy of the one in WWW::RobotRules's POD) is:

==========
#! /usr/bin/perl -w
use WWW::RobotRules;
my $rules = WWW::RobotRules->new('MOMspider/1.0');

use LWP::Simple qw(get);
my $url = "";
my $robots_txt = get $url;
$rules->parse($url, $robots_txt) if defined $robots_txt;
==========

The result I got is:

==========
imacat@rinse ~/tmp % ./test.pl
RobotRules < >: Unexpected line: Crawl-delay: 10
RobotRules < >: Unexpected line: Crawl-delay: 2
RobotRules < >: Unexpected line: Crawl-delay: 2
imacat@rinse ~/tmp %
==========

Crawl-delay: is a widely used directive, obeyed by Yahoo, MSN, and many other robots. A robot built on LWP::RobotUA that emits this warning on every request is unusable in practice, which makes LWP::RobotUA much less useful than it should be. Besides, when a site specifies Crawl-delay:, LWP::RobotUA should respect it instead of its own $ua->delay() setting. Could you look into this and fix it soon? Thank you.
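For reference, the directive appears inside a User-agent: section of robots.txt, like this (the values here are made up for illustration):

==========
User-agent: *
Crawl-delay: 10
Disallow: /cgi-bin/
==========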
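Until this is fixed, a minimal workaround sketch: fetch robots.txt separately, scan it for a Crawl-delay: line, and feed the value to $ua->delay(). Note that Crawl-delay: is given in seconds while $ua->delay() takes minutes. The URL and e-mail address below are placeholders, not real ones:

==========
#! /usr/bin/perl -w
use strict;
use LWP::RobotUA;
use LWP::Simple qw(get);

my $ua = LWP::RobotUA->new('MOMspider/1.0', 'me@example.com');

# Fetch robots.txt ourselves and look for a Crawl-delay: line.
# This naive scan takes the first match and ignores which
# User-agent: section it belongs to.
my $robots_txt = get 'http://some.place/robots.txt';
if (defined $robots_txt and $robots_txt =~ /^Crawl-delay:\s*(\d+)/mi) {
    # Crawl-delay: is in seconds; $ua->delay() wants minutes.
    $ua->delay($1 / 60);
}
==========

This leaves WWW::RobotRules untouched and only adjusts the request interval, so the "Unexpected line" warnings still appear; a real fix would teach WWW::RobotRules to parse the line itself.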