This queue is for tickets about the HTTP-Message CPAN distribution.

Report information
The Basics
Id: 82963
Status: resolved
Priority: 0/
Queue: HTTP-Message

People
Owner: Nobody in particular
Requestors: MAUKE [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 6.06
Fixed in: (no value)



Subject: ->decoded_content should decode application/json, etc
Currently $response->decoded_content will decode the bytes of e.g. "Content-type: text/json; charset=UTF-8" messages because it knows "text/*" is ... text.

It would be nice if this could be extended to also decode the text for content-types such as "application/json; charset=UTF-8", "application/javascript; charset=ISO-8859-15", etc.
On Fri Jan 25 17:13:06 2013, MAUKE wrote:
> Currently $response->decoded_content will decode the bytes of e.g.
> "Content-type: text/json; charset=UTF-8" messages because it knows
> "text/*" is ... text.
>
> It would be nice if this could be extended to also decode the text for
> content-types such as "application/json; charset=UTF-8",
> "application/javascript; charset=ISO-8859-15", etc.

Bump, just ran into the same issue after a few hours. In HTTP::Headers->content_is_text, shouldn't the presence of a charset in the Content-Type imply that the content is characters, i.e. text?
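The rule proposed in this comment can be sketched as a small predicate. This is only an illustration of the idea, not code from HTTP::Headers: the helper name looks_like_text is hypothetical, and it works on a raw Content-Type string rather than on a headers object.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTTP::Headers): treat a message as text
# when its media type is text/* or when a charset parameter is present.
sub looks_like_text {
    my ($content_type) = @_;
    return 0 unless defined $content_type;

    # Split "type/subtype; param=value; ..." into media type and parameters.
    my ($media_type, @params) = split /\s*;\s*/, $content_type;
    return 0 unless defined $media_type;
    $media_type =~ s/^\s+|\s+$//g;

    # Existing behaviour described in the ticket: text/* counts as text.
    return 1 if $media_type =~ m{^text/}i;

    # Proposed extension: any charset=... parameter also counts as text.
    return scalar(grep { /^charset\s*=/i } @params) ? 1 : 0;
}

for my $ct ('text/json; charset=UTF-8',
            'application/json; charset=UTF-8',
            'application/json') {
    printf "%-35s => %d\n", $ct, looks_like_text($ct);
}
```

Under this rule "application/json; charset=UTF-8" would be treated as text, while a bare "application/json" would not.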
Second on this. When I say decode, I know what I am doing, but currently there is no way to force it:

    $response->decoded_content(charset => 'utf-8')

Adding (charset_strict => 1, raise_error => 1) doesn't help. Better yet, the content type I get is

    Content-Type: application/json; charset=UTF-8

Maybe content_is_text() should return true if a charset is present in the Content-Type header?
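Until the library changes, one possible workaround is to decode the bytes by hand with the core Encode module. The sketch below is an assumption-laden illustration: decode_by_charset is a hypothetical helper, and the charset is pulled from the Content-Type string with a simple regex rather than a full header parser.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: decode raw bytes using the charset parameter of a
# Content-Type value, regardless of the media type.
sub decode_by_charset {
    my ($content_type, $bytes) = @_;

    # Extract charset=..., with or without surrounding quotes.
    my ($charset) =
        ($content_type // '') =~ /;\s*charset\s*=\s*"?([^\s";]+)"?/i;

    # No charset declared: hand the bytes back untouched.
    return $bytes unless defined $charset;

    my $enc = find_encoding($charset)
        or die "Unknown charset '$charset'\n";

    # FB_CROAK makes malformed input fatal; LEAVE_SRC keeps $bytes intact.
    return $enc->decode($bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

my $bytes = "{\"name\":\"caf\xC3\xA9\"}";    # UTF-8 encoded JSON
my $text  = decode_by_charset('application/json; charset=UTF-8', $bytes);
printf "%d bytes in, %d characters out\n", length($bytes), length($text);
```

With an HTTP::Response object, the inputs could plausibly be $response->header('Content-Type') and $response->decoded_content(charset => 'none'), assuming the documented 'none' value is used so that any Content-Encoding is undone while the charset is left alone.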
Third. Currently, the code says:

    if ($self->content_is_text || (my $is_xml = $self->content_is_xml)) {

Examples where LWP currently breaks include:

    application/json
    application/yaml
    application/x-yaml
    application/pdf
    application/* (that isn't +xml)

The Content-Type really shouldn't matter. If the Content-Type is "pork/beans; charset=UTF-8", it should still be decoded. If the remote agent sent a charset, it's telling us that it encoded the data with that character set. We shouldn't care if the data inside the onion is text, audio, application-specific, some proprietary format, whatever.

Please remove this 'if' line. It's a pretty intelligent interface, so it would be a waste of code to have other folks design their own decoding interface just because of this restriction.

