Subject: | piconv: wrong conversion of utf-16le encoded files (with PATCH) |
MIME-Version: | 1.0 |
X-Mailer: | MIME-tools 5.418 (Entity 5.418) |
X-RT-Original-Encoding: | utf-8 |
Content-Type: | multipart/mixed; boundary="----------=_1152282944-15470-1" |
Content-Length: | 0 |
Content-Type: | text/plain; charset="utf8" |
Content-Disposition: | inline |
Content-Transfer-Encoding: | binary |
Content-Length: | 640 |
This seems to be a duplicate of bug #7831, but it also occurs on
non-Win32 systems (e.g. Linux 2.4). There are two problems:
* The default conversion schemes read the input line by line, using $/
as the line separator. This breaks for utf-16 (and probably other
multibyte encodings like ucs-4), where a newline is two bytes. The perlio
conversion scheme does not suffer from this problem.
* Unfortunately it's not possible to switch the conversion scheme with
-scheme perlio because of an uppercase/lowercase typo in the source
($Opt{Scheme} instead of $Opt{scheme}).
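The first problem can be reproduced outside piconv. The following is only a minimal sketch, not piconv's actual code: it uses an in-memory filehandle and a made-up two-line sample string to compare byte-wise line reading with reading through an :encoding layer.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Sample data (an assumption for illustration): "ab\ncd\n" in UTF-16LE.
# Every character, including "\n", occupies two bytes (0x0A 0x00 for newline).
my $utf16 = encode('UTF-16LE', "ab\ncd\n");

# Byte-oriented line reading with the default $/ stops right after the 0x0A
# byte, so the 0x00 that completes the newline lands in the next chunk.
open my $raw, '<:raw', \$utf16 or die "cannot open in-memory handle: $!";
my @byte_lines = <$raw>;
close $raw;
my @byte_lengths = map { length } @byte_lines;

# Reading through an :encoding layer decodes first and then splits on the
# decoded "\n", so the lines come out intact.
open my $dec, '<:encoding(UTF-16LE)', \$utf16 or die "cannot open in-memory handle: $!";
my @char_lines = <$dec>;
close $dec;

print "byte-wise chunk lengths: @byte_lengths\n";
print "decoded line count: ", scalar(@char_lines), "\n";
```

The first byte-wise chunk has an odd length, so it is not a whole number of 16-bit code units and cannot be converted correctly on its own.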
I suggest changing the default conversion scheme to perlio. The attached
patch does this and also fixes the typo mentioned above.
Regards,
Slaven
Subject: | piconv.patch |
Content-Type: | application/octet-stream; name="piconv.patch" |
Content-Disposition: | inline; filename="piconv.patch" |
Content-Transfer-Encoding: | base64 |
Content-Length: | 665 |
--- bin/piconv.orig 2006-07-07 16:28:12.000000000 +0200
+++ bin/piconv 2006-07-07 16:28:26.000000000 +0200
@@ -40,7 +40,7 @@ $Opt{from} || $Opt{to} || help();
my $from = $Opt{from} || $locale or help("from_encoding unspecified");
my $to = $Opt{to} || $locale or help("to_encoding unspecified");
$Opt{string} and Encode::from_to($Opt{string}, $from, $to) and print $Opt{string} and exit;
-my $scheme = exists $Scheme{$Opt{Scheme}} ? $Opt{Scheme} : 'from_to';
+my $scheme = exists $Scheme{$Opt{scheme}} ? $Opt{scheme} : 'perlio';
$Opt{check} ||= $Opt{c};
$Opt{perlqq} and $Opt{check} = Encode::PERLQQ;
$Opt{htmlcref} and $Opt{check} = Encode::HTMLCREF;
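
The effect of the changed line can be checked in isolation. This is a hedged sketch rather than an excerpt from piconv: %Scheme here is a stand-in table that contains only the two scheme names mentioned in this report, and pick_original / pick_patched are hypothetical helpers wrapping the old and new expressions.

```perl
use strict;
use warnings;

# Stand-in for piconv's scheme table (assumption: only the names from this report).
my %Scheme = map { $_ => 1 } qw(from_to perlio);

# Old expression: looks up the key "Scheme", which option parsing never sets,
# so the lookup always fails and 'from_to' is used.
sub pick_original {
    my ($opt) = @_;
    no warnings 'uninitialized';    # $opt->{Scheme} is undef
    return exists $Scheme{ $opt->{Scheme} } ? $opt->{Scheme} : 'from_to';
}

# Patched expression: correct key, with 'perlio' as the fallback.
sub pick_patched {
    my ($opt) = @_;
    no warnings 'uninitialized';    # the option may be absent
    return exists $Scheme{ $opt->{scheme} } ? $opt->{scheme} : 'perlio';
}

my $requested = { scheme => 'perlio' };    # as if "-scheme perlio" was given
print "original, -scheme perlio:  ", pick_original($requested), "\n";
print "patched,  -scheme perlio:  ", pick_patched($requested), "\n";
print "patched,  -scheme from_to: ", pick_patched({ scheme => 'from_to' }), "\n";
print "patched,  no option:       ", pick_patched({}), "\n";
```

With the patched expression an explicit -scheme value is honoured, and perlio is used only when no valid scheme is given.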