
This queue is for tickets about the CPAN-Changes CPAN distribution.

Report information
The Basics
Id: 88036
Status: open
Priority: 0/
Queue: CPAN-Changes

People
Owner: Nobody in particular
Requestors: KENTNL [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.23
Fixed in: (no value)



Subject: No UTF8 Support obvious

The Spec and the code itself seem to be reasonably ignorant of UTF8 issues.

I just discovered this while munging my own CPAN::Changes files, because an extension I have in progress (CPAN::Changes::Markdown) is sensitive to specific things in the Changes file.

Specifically, in my Changes file I regularly use 0xA0 (non-breaking space) and → (0x2192). If the file is not read in UTF-8 mode, these become a chaotic mess of bytes that regular expressions don't match: you end up matching against "\xC2\xA0\xE2\x86\x92\xC2\xA0" instead of " → ".
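To make the mismatch concrete, here is a minimal sketch using only the core Encode module. The variable names are illustrative and nothing here comes from CPAN::Changes itself; it only shows how the same pattern behaves against raw bytes and against decoded characters.

```perl
use strict;
use warnings;
use Encode ();

# Raw bytes as they arrive from a filehandle opened without an encoding layer.
my $bytes = "\xC2\xA0\xE2\x86\x92\xC2\xA0";

# The pattern a caller would write against decoded text: NBSP, arrow, NBSP.
my $pattern = qr/\x{A0}\x{2192}\x{A0}/;

my $raw_length  = length $bytes;
my $matches_raw = $bytes =~ $pattern ? 1 : 0;

# Decoding turns the UTF-8 byte sequence into the characters it represents.
my $chars           = Encode::decode('UTF-8', $bytes);
my $decoded_length  = length $chars;
my $matches_decoded = $chars =~ $pattern ? 1 : 0;

printf "raw:     length=%d match=%d\n", $raw_length,     $matches_raw;
printf "decoded: length=%d match=%d\n", $decoded_length, $matches_decoded;
```

The undecoded string is seven bytes long and does not match, while the decoded string is three characters long and does.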

The easiest approach here is to have either a load_utf8 method or a load_filehandle() method that takes something like Path::Tiny::path('foo')->openr_utf8.
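Until something like that exists, a caller can decode the file themselves and hand the result to the existing load_string method. The following is only a sketch under that assumption: slurp_utf8 is a hypothetical helper written for this example, not part of CPAN::Changes, and the parsing call is left as a comment so the snippet runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CPAN::Changes): read a whole file
# through a UTF-8 decoding layer and return it as a character string.
sub slurp_utf8 {
    my ($path) = @_;
    open my $fh, '<:encoding(UTF-8)', $path
        or die "Cannot open $path: $!";
    local $/;    # slurp mode: read the entire file in one go
    my $content = <$fh>;
    close $fh;
    return $content;
}

# With CPAN::Changes installed, the decoded text could then be parsed with:
#   my $changes = CPAN::Changes->load_string( slurp_utf8('Changes') );
```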

Alternatively, you could try to decode as Unicode by default, but I'm not sure how good an idea that is.

 

On 2013-08-22 21:33:12, KENTNL wrote:
> The Spec and the code itself seem to be reasonably ignorant of UTF8 issues.
The problem is more general: the spec does not specify encoding at all. And we are not living in a pure ASCII world.
--
Olivier Mengué - http://perlresume.org/DOLMEN

On 2013-09-09 14:30:53, DOLMEN wrote:
> On 2013-08-22 21:33:12, KENTNL wrote:
> > The Spec and the code itself seem to be reasonably ignorant of UTF8 issues.
>
> The problem is more general: the spec does not specify encoding at all. And we are not living in a pure ASCII world.
Just opened ticket 88540 for encoding support.
--
Olivier Mengué - http://perlresume.org/DOLMEN

The next release will attempt to decode either UTF-8 or Latin-1 when using the ->load method. The ->load_string method will continue to accept decoded strings. I'd like to include this in the spec in the future.
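One way such a "UTF-8, else Latin-1" fallback could work is sketched below with the core Encode module. This is an assumption about the general technique, not the code from the release: decode_utf8_or_latin1 is a hypothetical name, and the actual ->load implementation may differ.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: try strict UTF-8 first, then fall back to Latin-1.
sub decode_utf8_or_latin1 {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8;
    # LEAVE_SRC keeps $bytes intact so the fallback can reuse it.
    my $text = eval {
        Encode::decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return $text if defined $text;

    # Every byte sequence is valid Latin-1, so this step cannot fail.
    return Encode::decode('ISO-8859-1', $bytes);
}
```

Trying UTF-8 first matters because Latin-1 accepts any byte sequence, so the reverse order would never reach the UTF-8 branch.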

