This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 35912
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: jjperss [...] hotmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: error with decoded_content
Date: Thu, 15 May 2008 21:22:19 +0200
To: <bug-libwww-perl [...] rt.cpan.org>
From: Jesper Jørgen Persson <jjperss [...] hotmail.com>
This page: http://www.mariagerfjord.dk/mfk/politik/raad_naevn/aeldreraad/dokumenter/07_01_31/index.html is decoded incorrectly. The page contains a meta tag with charset utf-8, but decoded_content chooses the fallback encoding, ISO-8859-1. As far as I can debug, the problem lies within HTML::HeadParser.

Best Regards
Jesper Persson
In-Reply-To: <BAY116-W157655017FE2E643812FCA8C90 [...] phx.gbl>
It looks like it's the UTF-8 BOM that confuses it. decoded_content() should probably also look for the BOM, so that it decodes the page correctly even when the Content-Type header has no charset parameter.
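
The BOM check suggested above could look roughly like the following. This is only a sketch using the core Encode module; sniff_bom and decode_with_bom are hypothetical helper names, not part of libwww-perl or HTML::Parser, and the actual fix in those distributions may work differently.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: guess a charset from a leading byte order mark.
# Returns (charset, bom_length_in_bytes), or an empty list if no BOM is found.
sub sniff_bom {
    my ($bytes) = @_;
    return ('UTF-8',    3) if $bytes =~ /^\xEF\xBB\xBF/;
    # The UTF-32LE BOM starts with the UTF-16LE BOM, so test it first.
    return ('UTF-32LE', 4) if $bytes =~ /^\xFF\xFE\x00\x00/;
    return ('UTF-32BE', 4) if $bytes =~ /^\x00\x00\xFE\xFF/;
    return ('UTF-16LE', 2) if $bytes =~ /^\xFF\xFE/;
    return ('UTF-16BE', 2) if $bytes =~ /^\xFE\xFF/;
    return;
}

# Hypothetical helper: decode raw bytes, preferring the BOM over the fallback.
sub decode_with_bom {
    my ($bytes, $fallback) = @_;
    my ($charset, $bom_len) = sniff_bom($bytes);
    if (defined $charset) {
        my $rest = substr($bytes, $bom_len);   # strip the BOM before decoding
        return decode($charset, $rest);
    }
    return decode($fallback || 'ISO-8859-1', $bytes);
}

# A UTF-8 body with a BOM and no charset information from the headers.
my $body = "\xEF\xBB\xBF<title>\xC3\xA6ldrer\xC3\xA5d</title>";
my $text = decode_with_bom($body);
```

With a check like this, the ISO-8859-1 fallback is used only when neither a BOM nor any other charset hint is available.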
In-Reply-To: <BAY116-W157655017FE2E643812FCA8C90 [...] phx.gbl>
HTML-Parser-3.58 has now been uploaded to CPAN. I think it should fix this issue.

