
This queue is for tickets about the XML-RSS CPAN distribution.

Report information
The Basics
Id: 21740
Status: resolved
Worked: 40 min
Priority: 0/
Queue: XML-RSS

People
Owner: SHLOMIF [...] cpan.org
Requestors: nutlet [...] karelia.ru
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.10
Fixed in: (no value)



Subject: wrong handling enclosure subelement of item
According to the RSS 2.0 specification, 'enclosure' (a subelement of 'item') is an empty XML element with a few attributes. For example:

<enclosure url="http://www.scripting.com/mp3s/weatherReportSuite.mp3" length="12216320" type="audio/mpeg" />

XML::RSS loses all attributes of this element. Here is a quick patch to fix this:

*** RSS_original.pm 2006-03-12 02:47:19.000000000 +0300
--- RSS.pm 2006-09-27 12:29:41.000000000 +0400
*************** sub handle_start {
*** 1505,1510 ****
--- 1505,1515 ----
          push(@{$self->{'items'}->[$self->{num_items}-1]->{'taxo'}},$attribs{'resource'});
          $self->{'modules'}->{'http://purl.org/rss/1.0/modules/taxonomy/'} = 'taxo';
  
+     # beginning of enclosure element in item
+     } elsif ($el eq 'enclosure' && $self->within_element('item')) {
+ 
+         $self->{'items'}->[$self->{num_items}-1]->{'enclosure'} = {map {$_ => $attribs{$_}} keys %attribs};
+ 
      # beginning of taxo li in channel element
      } elsif ($self->within_element($self->generate_ns_name("topics",'http://purl.org/rss/1.0/modules/taxonomy/')) &&
               $self->within_element($self->generate_ns_name("channel",$self->{namespace_map}->{'rss10'}))
Subject: RSS.pm.patch
Download RSS.pm.patch
text/x-diff 842b
*** RSS_original.pm 2006-03-12 02:47:19.000000000 +0300
--- RSS.pm 2006-09-27 12:29:41.000000000 +0400
*************** sub handle_start {
*** 1505,1510 ****
--- 1505,1515 ----
          push(@{$self->{'items'}->[$self->{num_items}-1]->{'taxo'}},$attribs{'resource'});
          $self->{'modules'}->{'http://purl.org/rss/1.0/modules/taxonomy/'} = 'taxo';
  
+     # beginning of enclosure element in item
+     } elsif ($el eq 'enclosure' && $self->within_element('item')) {
+ 
+         $self->{'items'}->[$self->{num_items}-1]->{'enclosure'} = {map {$_ => $attribs{$_}} keys %attribs};
+ 
      # beginning of taxo li in channel element
      } elsif ($self->within_element($self->generate_ns_name("topics",'http://purl.org/rss/1.0/modules/taxonomy/')) &&
               $self->within_element($self->generate_ns_name("channel",$self->{namespace_map}->{'rss10'}))
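The core of the patch above is a shallow copy of the start tag's attribute hash into the current item. The following is only a minimal sketch of that idea outside XML::RSS: the `%state` hash, the `handle_start_sketch` function and the `$in_item` flag are hypothetical stand-ins for the module's `$self`, its `handle_start` handler and `$self->within_element('item')`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state that XML::RSS keeps in $self:
# one item has been opened so far.
my %state = (items => [ {} ], num_items => 1);

# Simplified start-tag handler. $in_item stands in for
# $self->within_element('item') in the real module.
sub handle_start_sketch {
    my ($state, $in_item, $el, %attribs) = @_;
    if ($el eq 'enclosure' && $in_item) {
        # Shallow copy of the attribute hash; same effect as the
        # {map {$_ => $attribs{$_}} keys %attribs} expression in the patch.
        $state->{items}[ $state->{num_items} - 1 ]{enclosure} = +{ %attribs };
    }
    return;
}

handle_start_sketch(
    \%state, 1, 'enclosure',
    url    => 'http://www.scripting.com/mp3s/weatherReportSuite.mp3',
    length => '12216320',
    type   => 'audio/mpeg',
);

# Print the stored attributes of the first item's enclosure.
my $encl = $state{items}[0]{enclosure};
print "$_: $encl->{$_}\n" for sort keys %$encl;
```

Storing a copy rather than a reference to the handler's own hash keeps the item's data independent of whatever the parser does with its arguments afterwards.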
Subject: Re: [rt.cpan.org #21740] wrong handling enclosure subelement of item
Date: Wed, 27 Sep 2006 03:03:08 -0700
To: bug-XML-RSS [...] rt.cpan.org
From: Ask Bjørn Hansen <ask [...] perl.org>
On Sep 27, 2006, at 1:41 AM, Alexei Kozlov via RT wrote:

> According to rss 2.0 specification, 'enclosure' - subelement
> of 'item' - is an empty xml-element with few attributes.

Hi,

Any chance you can make a small test we can include in the test suite?

 - ask

--
http://log.perl.org/
From: nutlet [...] karelia.ru
> Hi,
>
> Any chance you can make a small test we can include in the test suite?

Hi!

Here is the test for the enclosure element.

Alexei.
use strict;
use Test::More;

use constant RSS_VERSION          => "2.0";
use constant RSS_ENCLOSURE_URL    => qq(http://www.scripting.com/mp3s/weatherReportSuite.mp3);
use constant RSS_ENCLOSURE_LENGTH => qq(12216320);
use constant RSS_ENCLOSURE_TYPE   => qq(audio/mpeg);

use constant RSS_DOCUMENT => qq(<?xml version="1.0"?>
<rss version="2.0">
 <channel>
  <title>Example 2.0 Channel with Enclosure sub-element of Item</title>
  <link>http://example.com/</link>
  <description>To lead by example</description>
  <language>en-us</language>
  <copyright>All content Public Domain, except comments which remains copyright the author</copyright>
  <managingEditor>editor\@example.com</managingEditor>
  <webMaster>webmaster\@example.com</webMaster>
  <docs>http://backend.userland.com/rss</docs>
  <category domain="http://www.dmoz.org">Reference/Libraries/Library_and_Information_Science/Technical_Services/Cataloguing/Metadata/RDF/Applications/RSS/</category>
  <generator>The Superest Dooperest RSS Generator</generator>
  <lastBuildDate>Mon, 02 Sep 2002 03:19:17 GMT</lastBuildDate>
  <ttl>60</ttl>
  <item>
   <title>News for September the Second</title>
   <link>http://example.com/2002/09/02</link>
   <description>other things happened today</description>
   <comments>http://example.com/2002/09/02/comments.html</comments>
   <author>joeuser\@example.com</author>
   <pubDate>Mon, 02 Sep 2002 03:19:00 GMT</pubDate>
   <guid isPermaLink="true">http://example.com/2002/09/02</guid>
   <enclosure url="http://www.scripting.com/mp3s/weatherReportSuite.mp3" length="12216320" type="audio/mpeg" />
  </item>
 </channel>
</rss>);

plan tests => 8;

use_ok("XML::RSS");

my $xml = XML::RSS->new();
isa_ok($xml,"XML::RSS");

eval { $xml->parse(RSS_DOCUMENT); };
is($@,'',"Parsed RSS feed");

cmp_ok($xml->{'_internal'}->{'version'},"eq",RSS_VERSION,"Is RSS version ".RSS_VERSION);
cmp_ok(ref($xml->{items}),"eq","ARRAY","\$xml->{items} is an ARRAY ref");

if($xml->{items} && ref($xml->{items}) eq 'ARRAY'){
    # Check the enclosure of the first (and only) item.
    my $item = shift @{$xml->{items}};
    if($item->{enclosure} && ref($item->{enclosure}) eq 'HASH'){
        my $encl = $item->{enclosure};
        cmp_ok($encl->{'url'},"eq",RSS_ENCLOSURE_URL,
               "ENCLOSURE URL is ".RSS_ENCLOSURE_URL);
        cmp_ok($encl->{'length'},"eq",RSS_ENCLOSURE_LENGTH,
               "ENCLOSURE LENGTH is ".RSS_ENCLOSURE_LENGTH);
        cmp_ok($encl->{'type'},"eq",RSS_ENCLOSURE_TYPE,
               "ENCLOSURE TYPE is ".RSS_ENCLOSURE_TYPE);
    }else{
        ok(0,"Parsing Enclosure element, sub-element of Item");
    }
}

__END__

=head1 NAME

2.0-parse.t - tests for parsing RSS 2.0 data with XML::RSS.pm

=head1 SYNOPSIS

 use Test::Harness qw (runtests);
 runtests (./XML-RSS/t/*.t);

=head1 DESCRIPTION

Tests for parsing RSS 2.0 data with XML::RSS.pm

=head1 VERSION

$Revision: 1.2 $

=head1 DATE

$Date: 2002/11/19 23:56:53 $

=head1 AUTHOR

Aaron Straup Cope

=head1 SEE ALSO

http://backend.userland.com/rss2

=cut
From: nutlet [...] karelia.ru
Just a little note: the enclosure element can't contain a CDATA section, so you should skip all character data for enclosure in the 'handle_char' handler. Also, all three attributes of enclosure are required, so it may be good to validate them in 'handle_start' and warn (or even die) with an error.

http://blogs.law.harvard.edu/tech/rss#ltenclosuregtSubelementOfLtitemgt
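The validation suggested in this note could look roughly like the sketch below. It is an assumption-laden illustration, not code from XML::RSS: the helper `missing_enclosure_attrs` and the constant list `@REQUIRED_ENCLOSURE_ATTRS` are hypothetical names, and the example URL is a placeholder. Only the set of required attributes (url, length, type) comes from the note itself.

```perl
use strict;
use warnings;

# The three attributes the note above describes as required on <enclosure>.
my @REQUIRED_ENCLOSURE_ATTRS = qw(url length type);

# Hypothetical helper (not part of XML::RSS): returns the names of required
# attributes that are absent or empty in the given attribute list.
sub missing_enclosure_attrs {
    my (%attribs) = @_;
    return grep { !defined $attribs{$_} || $attribs{$_} eq '' }
           @REQUIRED_ENCLOSURE_ATTRS;
}

# Example: an enclosure start tag that lacks the 'length' attribute.
my @missing = missing_enclosure_attrs(
    url  => 'http://example.com/audio.mp3',
    type => 'audio/mpeg',
);
warn "enclosure is missing required attribute(s): @missing\n" if @missing;
```

A start-tag handler could call such a helper with `%attribs` and choose between `warn` and `die`, which is the trade-off the note leaves open.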
Download (untitled) / with headers
text/plain 501b
On Thu Sep 28 04:28:53 2006, nutlet wrote:

The enclosure bug has been fixed (RT#7920).

> Just a little note:
> enclosure element can't contain cdata section. You should skip all
> cdata for enclosure in 'handle_char' handler. also, all three
> attributes of enclosure are required - maybe it's good to validate it
> in 'handle_start' and warn (or even die) with error
>
> http://blogs.law.harvard.edu/tech/rss#ltenclosuregtSubelementOfLtitemgt

Can you provide a test for that? Then I'll fix it.
Download (untitled) / with headers
text/plain 222b
We already applied one patch. As for the other suggestions of validating the input - I don't think they fall into the scope of the parser. Also, the original submitter has been unresponsive for over two years. So closing.

