This queue is for tickets about the Net-IDN-Encode CPAN distribution.

Report information
The Basics
Id: 103205
Status: rejected
Priority: 0/
Queue: Net-IDN-Encode

People
Owner: CFAERBER [...] cpan.org
Requestors: ab [...] lixutec.net
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.201
Fixed in: (no value)



Subject: conversion of domain name xn--zcaa.de
The conversion of the IDN domain name "xn--zcaa.de" to Unicode is broken. The expected result is "ßß.de", but domain_to_unicode() returns a wrongly encoded string. The issue appears only if the domain name contains nothing but "ß" characters; if it also contains other characters, the returned Unicode string is correct.
Thank you for reporting this bug. I cannot reproduce it on perl 5.16 with the latest version of Net-IDN-Encode. Which versions of perl and Net-IDN-Encode are you using? I do have an idea what the cause might be, though. Could you please test the developer release at https://metacpan.org/release/CFAERBER/Net-IDN-Encode-2.201_20150330 and check whether the problem persists?
From: ab [...] lixutec.net
> I cannot reproduce it on perl 5.16 with the latest version of Net-IDN-Encode.
> Which version of perl and Net-IDN-Encode are you using.

I'm using two perl versions on different servers: v5.10.1 and v5.14.2. The version of Net-IDN-Encode is 2.201. I will try the development version soon...
From: ab [...] lixutec.net
> However, I also have an idea what might be the cause. Could you please
> test the developer release at
> https://metacpan.org/release/CFAERBER/Net-IDN-Encode-2.201_20150330
> and check whether the problem persists?
The problem still exists with the latest developer release.
From: ab [...] lixutec.net
I wrote a short test script to isolate the issue:

    #!/usr/bin/perl
    use Net::IDN::Encode;
    print STDERR "VERSION: " . $Net::IDN::Encode::VERSION . "\n";
    use Net::IDN::Encode ':all';
    my $u = domain_to_unicode('xn--zcaa.de');
    utf8::encode($u) if utf8::is_utf8($u);
    print STDERR "UNICODE: " . $u . "\n";
    $u = domain_to_unicode('xn--m-qfaaa.de');
    utf8::encode($u) if utf8::is_utf8($u);
    print STDERR "UNICODE: " . $u . "\n";

The output is:

    VERSION: 2.2012015033
    UNICODE: ��.de
    UNICODE: mßßß.de
From: ab [...] lixutec.net
> the output is:
> VERSION: 2.2012015033
> UNICODE: ��.de
> UNICODE: mßßß.de

BTW: using an older version works:

    VERSION: 2.003
    UNICODE: ßß.de
    UNICODE: mßßß.de
Your problem is this line:

    utf8::encode($u) if utf8::is_utf8($u);

This is wrong. If your display requires UTF-8, you need to encode both kinds of string: those stored internally as UTF-X (is_utf8 is true) and those stored as bytes (is_utf8 is false). For output, you can also use e.g.

    binmode STDERR, ':utf8';

I don't think that Net::IDN::Encode should guarantee the status of the utf8 flag on the results of *_to_unicode. Normally, this flag is handled seamlessly by perl.
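
To make the point above concrete, here is a minimal sketch using only core Perl. It assumes, in line with the diagnosis above but not confirmed by inspecting the module, that domain_to_unicode() returned "ßß.de" with the UTF8 flag off. The literal strings stand in for that return value, and the helper names conditional_encode and always_encode are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# The same text, "ßß.de" (U+00DF twice), in two internal representations.
my $downgraded = "\x{DF}\x{DF}.de";   # stored as native bytes, UTF8 flag off
my $upgraded   = "\x{DF}\x{DF}.de";
utf8::upgrade($upgraded);             # same characters, UTF8 flag on

# Hypothetical helper: the pattern from the test script. It leaves the
# unflagged string untouched, so raw 0xDF bytes reach the terminal.
sub conditional_encode {
    my ($s) = @_;
    utf8::encode($s) if utf8::is_utf8($s);
    return $s;
}

# Hypothetical helper: encode unconditionally, regardless of the flag.
sub always_encode {
    my ($s) = @_;
    return encode('UTF-8', $s);
}

print "strings compare equal\n" if $downgraded eq $upgraded;
printf "conditional, flag off: %d bytes\n", length conditional_encode($downgraded);
printf "conditional, flag on:  %d bytes\n", length conditional_encode($upgraded);
printf "always,      flag off: %d bytes\n", length always_encode($downgraded);
printf "always,      flag on:  %d bytes\n", length always_encode($upgraded);
```

The two strings compare equal, yet the conditional version yields 5 bytes for one and 7 for the other. Only the 7-byte result is valid UTF-8, and the unconditional version returns it for both inputs.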
Please also note the following from perlunifaq:

-------------------------------------------------------------------------------
What is "the UTF8 flag"?

Please, unless you're hacking the internals, or debugging weirdness, don't think about the UTF8 flag at all. That means that you very probably shouldn't use is_utf8, _utf8_on or _utf8_off at all.
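
One way to follow that advice is to let an output layer do the encoding, so the script never looks at the flag. The sketch below is an illustration under stated assumptions rather than a tested fix for the reporter's setup. The literal string stands in for the value domain_to_unicode() would return, and an in-memory filehandle replaces STDERR so the written bytes can be inspected. It uses ':encoding(UTF-8)', the stricter relative of the ':utf8' layer mentioned earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for a decoded domain name: "ßß.de" as two U+00DF characters.
# In the thread, this value would come from domain_to_unicode().
my $domain = "\x{DF}\x{DF}.de";

# Write through an encoding layer. For a terminal, the equivalent would be:
#   binmode STDERR, ':encoding(UTF-8)';
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer
    or die "cannot open in-memory handle: $!";
print {$out} "UNICODE: $domain\n";
close $out or die "close failed: $!";

# Show the bytes that were actually written, as hex.
print join(' ', map { sprintf '%02X', ord($_) } split //, $buffer), "\n";
```

Each "ß" is written as the two bytes C3 9F whether or not the string carries the UTF8 flag, so no call to utf8::is_utf8 is needed.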

