This queue is for tickets about the XML-RSS-Feed CPAN distribution.

Report information
The Basics
Id: 50467
Status: resolved
Worked: 10 min
Priority: 0
Queue: XML-RSS-Feed

People
Owner: jbisbee [...] cpan.org
Requestors: sven.knispel [...] pobox.com
Cc: Dan [...] DWright.Org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 2.4



Subject: Misbehaviour in XML::RSS::Feed, mixup in Headline id/guid
Date: Wed, 14 Oct 2009 01:18:28 +0200
From: Sven Knispel <sven.knispel [...] pobox.com>
Dear Jeff,

after having spent the last two nights trying to find out about a difference in behavior of XML::RSS::Feed on my PC and on a friend's, I finally had a breakthrough.

To sum it up: on version 2.212 everything is fine, on version 2.32 not anymore. Let me elaborate a little on "everything", with the adapted example from the POD:

    use XML::RSS::Feed;
    use LWP::Simple qw(get);

    my $feed = XML::RSS::Feed->new(
        url    => "http://feeds.wired.com/wired/index",
        name   => "Wired",
        delay  => 10,
        debug  => 1,
        tmpdir => ".",
    );

    while (1) {
        $feed->parse(get($feed->url));
        print $_->headline . "\n" for $feed->late_breaking_news;
        sleep($feed->delay);
    }

OK, the expected behavior (with 2.212):
- first run: it fetches whatever is in the feed (30 items), and keeps going in the loop with no new items.
- second run: after having retrieved the cached items there is no breaking news, so it goes on telling "no headlines found".

And now the problem (with 2.32):
- first run: it fetches whatever is in the feed (30 items), and keeps going in the loop with no new items.
- second run: after having retrieved the cached items it still sees another 30 breaking news items and shows them again.

At every run the number of initialized headlines from the cache increases by 30.

After a few hours and lots of coffee I broke the problem down to the Headlines. In the newer 2.32 version of headlines there is the concept of guid that didn't exist in the older version. I found that the "faulty" code is in Headlines.pm in "sub id", on "return $self->guid || $self->url;". For whatever reason $self->guid is not set prior to caching, or not read back from the cache (at least that is my assumption). Anyway, always returning the URL solves the misbehavior.

And finally, without modifying the code, doing a "$feed->init_headlines_seen;" in the calling program also does the trick, as it obviously replaces the logic for setting/getting the Headline id.

The program working for me is:

    use XML::RSS::Feed;
    use LWP::Simple qw(get);

    my $feed = XML::RSS::Feed->new(
        url            => "http://feeds.wired.com/wired/index",
        name           => "Wired",
        delay          => 10,
        debug          => 1,
        headline_as_id => 1,    # <-- avoids getting "real" headline id
        tmpdir         => ".",
    );

    while (1) {
        $feed->parse(get($feed->url));
        print $_->headline . "\n" for $feed->late_breaking_news;
        sleep($feed->delay);
    }

Now I suspect "sub _build_dump_structure" to be a candidate for storing the guid together with the url to solve the problem, but I lack background on RSS, so please excuse me if I am completely wrong (it would be nice to read your opinion on this whole thing ;-) ).

Brgds
Sven
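The failure mode described in this report can be reproduced in isolation. The following is only a minimal sketch, not the module's actual code: it uses plain hashes instead of XML::RSS::Feed objects, and the names headline_id and dump_without_guid as well as the example.com data are hypothetical. It assumes, as the report suspects, that the id is "guid or url" and that the cache does not store the guid.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the headline id: prefer the guid, fall back to the URL.
sub headline_id {
    my ($item) = @_;
    return $item->{guid} || $item->{url};
}

# Hypothetical cache dump that drops the guid (the suspected cause of the problem).
sub dump_without_guid {
    my (@items) = @_;
    return [ map { +{ headline => $_->{headline}, url => $_->{url} } } @items ];
}

# Made-up feed items; both carry a guid that differs from the URL.
my @feed_items = (
    { headline => 'First story',  url => 'http://example.com/1', guid => 'tag:example.com,2009:1' },
    { headline => 'Second story', url => 'http://example.com/2', guid => 'tag:example.com,2009:2' },
);

# First run: the items are written to the cache without their guid.
my $cache = dump_without_guid(@feed_items);

# Second run: the "seen" ids are rebuilt from the cache, so they are URLs ...
my %seen;
$seen{ headline_id($_) } = 1 for @$cache;

# ... while freshly parsed items identify themselves by guid, so nothing matches.
my @late_breaking = grep { !$seen{ headline_id($_) } } @feed_items;

printf "%d of %d items look new\n", scalar @late_breaking, scalar @feed_items;
```

Under these assumptions every item is reported as new again on the second run, which matches the "another 30 breaking news items" symptom above.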
In-Reply-To: <4AD50AC4.3050902 [...] pobox.com>
Hello,

I'm seeing a very similar issue when trying to use XML::RSS::Feed. The problem I'm seeing is that whenever you load headlines from the cache, they no longer contain the guid field. This can be fixed with a one-line change inside _build_dump_structure():

    push @{ $cached->{items} }, {
        headline    => $headline->headline,
        url         => $headline->url,
        description => $headline->description,
        first_seen  => $headline->first_seen_hires,
        guid        => $headline->guid,    # <-- the added line
    };
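Why storing the guid helps can be checked with a small round trip. Again this is only a sketch with plain hashes, not the module's real serialization: headline_id, dump_items and the example.com data are hypothetical names and values chosen for illustration, and the id is assumed to be "guid or url" as quoted earlier in this ticket.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the headline id: prefer the guid, fall back to the URL.
sub headline_id {
    my ($item) = @_;
    return $item->{guid} || $item->{url};
}

# Hypothetical cache dump that keeps the guid next to the URL.
sub dump_items {
    my (@items) = @_;
    return [
        map { +{ headline => $_->{headline}, url => $_->{url}, guid => $_->{guid} } }
        @items
    ];
}

# Made-up feed items: one with a guid, one without (falls back to the URL).
my @feed_items = (
    { headline => 'First story',  url => 'http://example.com/1', guid => 'tag:example.com,2009:1' },
    { headline => 'Second story', url => 'http://example.com/2' },
);

# First run: write the cache, guid included.
my $cache = dump_items(@feed_items);

# Second run: ids rebuilt from the cache now agree with ids of freshly parsed items.
my %seen;
$seen{ headline_id($_) } = 1 for @$cache;

my @late_breaking = grep { !$seen{ headline_id($_) } } @feed_items;

printf "%d of %d items look new\n", scalar @late_breaking, scalar @feed_items;
```

With the guid preserved, cached and freshly parsed items compute the same id, so no item is reported as new on the second run.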
In-Reply-To: <4AD50AC4.3050902 [...] pobox.com>
Thanks for taking the time to report these issues.

Fixed
* add =encoding utf-8 to pod to fix RT issue #78918
* add guid to serialization so we can properly restore it to fix RT issue #50467
* Fix blatantly broken test
* Suppress warnings on deprecated methods during tests
* Fix pod coverage issues with Feed::Factory
* I wrote this code so long ago it makes me throw up in my mouth just a little bit :P

--
Jeff Bisbee / jbisbee@cpan.org

