
This queue is for tickets about the Text-Unidecode CPAN distribution.

Report information
The Basics
Id: 99227
Status: new
Priority: 0/
Queue: Text-Unidecode

Owner: Nobody in particular
Requestors: lcom [...]

Bug Information
Severity: Normal
Broken in: 0.04
Fixed in: (no value)

Subject: Correction to the documentation, under "Caveats"
MIME-Version: 1.0
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
Message-ID: <rt-4.0.18-9670-1412089837-1575.0-0-0 [...]>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 693
In the documentation under "Caveats", the phrase "make sure that the input data really is a utf8 string" appears to be incorrect. UTF-8 is a variable-length encoding, whereas Text::Unidecode wants a fixed-length (two-byte) encoding for each character. One fix would be to phrase it as "make sure that the input data really is a string of two-byte Unicode characters". This is also known as UCS-2, in case you want to include that name.

Could we also provide a tip on how to convert strings that really are UTF-8? You would do it like so:

    my $decode_status = utf8::decode($input_to_be_converted);
    my $converted_string = unidecode($input_to_be_converted);
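The tip above could be expanded into a small self-contained script along the following lines. This is only a sketch: the sample bytes and the variable names are made up for illustration, and it assumes Text::Unidecode is installed. It relies on utf8::decode converting its argument in place and returning false when the bytes are not well-formed UTF-8.

```perl
use strict;
use warnings;
use Text::Unidecode;    # exports unidecode()

# Hypothetical input: the raw UTF-8 bytes for U+5317 U+4EAC (six bytes).
my $input      = "\xE5\x8C\x97\xE4\xBA\xAC";
my $byte_count = length $input;

# utf8::decode works in place and returns false on malformed UTF-8.
utf8::decode($input) or die "input is not well-formed UTF-8\n";
my $char_count = length $input;

# After decoding, the string holds characters rather than bytes,
# which is what unidecode() expects.
my $ascii = unidecode($input);
print "$byte_count bytes, $char_count characters: $ascii\n";
```

Checking the return value of utf8::decode, rather than only storing it, makes it obvious when the input was not UTF-8 in the first place.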
