This queue is for tickets about the XML-SAX-PurePerl CPAN distribution.

Report information
The Basics
Id: 19411
Status: new
Priority: 0/
Queue: XML-SAX-PurePerl

People
Owner: Nobody in particular
Requestors: clinton [...] traveljury.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: (no value)
Fixed in: (no value)



Subject: PurePerl not setting the utf8 flag
Two RSS feeds, both encoded in ISO-8859-1.

Feed One contains a literal British pound character:
<title>Extra £2.5m for July bomb victims</title>

Feed Two contains a character entity reference:
<title>Boots injects &#xA3;3.6m to help area where it closed down factory</title>

Feed Two, when parsed, returns a literal pound character (with encode_entities --> &pound;).

Feed One, when parsed, returns a UTF-8 string which is not marked as such, so encode_entities --> &Acirc;&pound;

However, if (for Feed One) you parse it and then call Encode::decode('utf8', $item->title), it is interpreted correctly.

Sorry if that is confusing: essentially, it is returning UTF-8 encoded characters, but without the utf8 flag set.

The libXML parser works fine.
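
To illustrate the symptom outside the parser, here is a minimal sketch using only core Perl. It assumes the Feed One title comes back as the raw bytes C2 A3 (the UTF-8 encoding of the pound sign) with no utf8 flag; the variable $raw and the helper codepoints() are hypothetical and not part of XML::SAX::PurePerl. Perl treats the unflagged string as two Latin-1 characters (U+00C2 and U+00A3), which would explain the &Acirc;&pound; output, while decoding yields the single character U+00A3.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: stand-in for what the parser returns for Feed One,
# i.e. the UTF-8 bytes of the pound sign, not flagged as utf8.
my $raw = "\xC2\xA3";

# Hypothetical helper: list the code points Perl sees in a string.
sub codepoints {
    my ($s) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $s;
}

# Decoding the bytes produces a character string with the utf8 flag set.
my $decoded = decode('UTF-8', $raw);

printf "raw:     flag=%d length=%d %s\n",
    (utf8::is_utf8($raw) ? 1 : 0), length($raw), codepoints($raw);
printf "decoded: flag=%d length=%d %s\n",
    (utf8::is_utf8($decoded) ? 1 : 0), length($decoded), codepoints($decoded);
```

The undecoded string has length 2 and the decoded one has length 1, which matches the workaround described above.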

