
This queue is for tickets about the Unicode-Normalize CPAN distribution.

Report information
The Basics
Id: 53197
Status: resolved
Priority: 0/
Queue: Unicode-Normalize

Owner: Nobody in particular
Requestors: CFAERBER [...]

Bug Information
Severity: Important
Broken in: 1.03
Fixed in: (no value)

Subject: NFKC("\x{2000}") produces "\x20\x05" on some perls >= 5.11.2
Hi. Sometimes, NFKC("\x{2000}") produces an extra "\x05" in the output.

The problem seems to be limited to some platforms. It has been observed with U::N 1.03
running on an amd64 operating system; I'm not sure whether it also occurs with U::N 1.05.

Please find a test case attached.

Subject: perl-5.11.2.t
text/x-perl 166b
use strict;
use utf8;
no warnings 'utf8';

use Test::More tests => 1;
use Unicode::Normalize ();

is( Unicode::Normalize::NFKC("\x{2000}"), " ", 'NFKC of U+2000' );
I discovered the problem through the test vectors for Net::IDN::Encode/Unicode::Stringprep, so some CPAN test reports are available here: (these two are the most interesting versions; please ignore the experimental versions 1.09_2009????)

In these tests the problem shows up as follows (N.B. the ^E in the "got" lines is not visible):
#   Failed test 'Non-ASCII multibyte space character U+2000'
#   at t/nameprep_st.t line 258.
#          got: ' '
#     expected: ' '

#   Failed test 'Larger test (shrinking)'
#   at t/nameprep_st.t line 258.
#          got: 'xssi̇telǰ aΰ '
#     expected: 'xssi̇telǰ aΰ '
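
Since the ^E cannot be seen in the output above, it may help to print both strings with every non-printable character escaped. This is only a minimal sketch: visible() is a hypothetical helper, not part of Test::More or Unicode::Normalize, and the input string is the faulty result named in the subject of this ticket.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not an existing API):
# render every character outside printable, non-space ASCII
# as a \x{....} escape so that control characters show up.
sub visible {
    my ($str) = @_;
    return join '', map {
        my $cp = ord;
        ($cp >= 0x21 && $cp <= 0x7E) ? $_ : sprintf('\\x{%04X}', $cp)
    } split //, $str;
}

# The faulty result reported in this ticket: space followed by U+0005.
print visible("\x20\x05"), "\n";   # prints \x{0020}\x{0005}

# The expected result: a plain space.
print visible(" "), "\n";          # prints \x{0020}
```

Passing visible($got) and visible($expected) to is() instead of the raw strings would make the difference readable in the test diagnostics.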

The prime suspect is now the generated file lib/unicode/ in bleadperl:

2000		2002
2001		2003
2002	2006	 0020 # [5]
2007		 0020
2008	200A	 0020 # [3]
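
To read the excerpt above: each line appears to hold a start code point, an optional end code point and a mapping, separated by tabs. The following sketch resolves U+2000 through these five lines only. The field layout, the treatment of "#" as the start of a comment and the resolve() helper are all assumptions made for illustration; this is not the parser used by Unicode::Normalize or by perl itself.

```perl
use strict;
use warnings;

# The five table lines quoted above (tab-separated: start, end, mapping).
my @lines = (
    "2000\t\t2002",
    "2001\t\t2003",
    "2002\t2006\t 0020 # [5]",
    "2007\t\t 0020",
    "2008\t200A\t 0020 # [3]",
);

# Build a table: code point => [mapped code points], expanding ranges.
my %map;
for my $line (@lines) {
    my ($start, $end, $mapping) = split /\t/, $line;
    $end = $start if $end eq '';          # single code point, no range
    $mapping =~ s/\s*#.*//;               # assumption: '#' starts a comment
    my @to = map { hex($_) } split ' ', $mapping;
    $map{$_} = [@to] for hex($start) .. hex($end);
}

# Hypothetical helper: follow mappings until nothing maps any further.
sub resolve {
    my ($cp) = @_;
    return $cp unless exists $map{$cp};
    return map { resolve($_) } @{ $map{$cp} };
}

# U+2000 maps to U+2002, which in turn maps to U+0020.
printf "U+2000 => %s\n",
    join ' ', map { sprintf 'U+%04X', $_ } resolve(0x2000);
```

With the comments stripped, U+2000 resolves to the single code point U+0020, which is the result the attached test case expects.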
Probably there's no fix required for Unicode::Normalize. I'll write a patch for perl, then.
It's fixed in bleadperl/5.11.4
