
This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 43507
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: sergii [...] pisem.net
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 5.821
Fixed in: (no value)



Subject: HTTP::Message::decoded_content fragile charset detection
HTTP::Message::decoded_content takes the last element of the $self->header("Content-Type") array and expects it to contain the charset.

I'm fetching data from a site that sends Content-Type with a charset in the HTTP headers, and Content-Type without a charset in the HTML page itself:

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html">

As a result, $self->header("Content-Type") is ('text/html; charset=windows-1251', 'text/html') and the charset is not detected:

      DB<16> x HTTP::Headers::Util::split_header_words($self->header("Content-Type"))
    0  ARRAY(0x4a78450)
       0  'text/html'
       1  undef
       2  'charset'
       3  'windows-1251'
    1  ARRAY(0x7f63b01223c0)
       0  'text/html'
       1  undef

Suggested fix: use $r->content_type instead of $self->header("Content-Type").

This is what I use as a workaround, before calling $r->decoded_content:

    $r->header('Content-Type' => join(';', $r->content_type));
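To illustrate the failure mode described above, here is a minimal core-Perl sketch that compares reading the charset from the last Content-Type value only with scanning every value. The helper charset_from_content_types is hypothetical and written for this illustration; it is not part of libwww-perl and is not the change that went into any release. The regular expression is a simplification of real header parsing.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of libwww-perl): return the first charset
# parameter found in any of the given Content-Type values, lowercased,
# or undef if none of them carries one.
sub charset_from_content_types {
    my @values = @_;
    for my $value (@values) {
        next unless defined $value;
        # Simplified match for charset=token or charset="quoted value".
        if ($value =~ /;\s*charset\s*=\s*(?:"([^"]*)"|([^\s;"]+))/i) {
            my $charset = defined $1 ? $1 : $2;
            return lc $charset;
        }
    }
    return;
}

# The two values from the report: HTTP header first, META tag second.
my @content_types = ('text/html; charset=windows-1251', 'text/html');

# Looking only at the last value mirrors the behaviour described in the report.
my $last_only = charset_from_content_types($content_types[-1]);
# Scanning all values finds the charset sent in the HTTP header.
my $all       = charset_from_content_types(@content_types);

print "last value only: ", (defined $last_only ? $last_only : 'undef'), "\n";
print "all values:      ", (defined $all       ? $all       : 'undef'), "\n";
```

With the values from the report, the last-value lookup finds no charset, while scanning all values yields windows-1251.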
This part has been reworked in libwww-perl-5.827. Please report back if you still find issues with how charsets are detected.

