Report information
Id: 24835
Status: resolved
Left: 0 min
Priority: 0/0
Queue: libnet

Owner: Nobody
Requestors: SAPER <SAPER [...] cpan.org>
Cc:
AdminCc:

Severity: Important
Broken in: 1.20
Fixed in: (no value)

History
#   Wed Feb 07 13:30:48 2007 SAPER - Ticket created  
Subject: Net::Cmd 2.27 (libnet 1.20) incorrectly upgrades everything to UTF-8
[text/plain 1.4k]
Hello Graham,

I'm afraid libnet 1.20 has introduced quite an important bug for people
using characters outside ASCII: unconditionally calling utf8::encode()
on any data passed to datasend() has the side effect of converting
everything to UTF-8, even when it's not expected to be. As a result,
all the accented characters appear as the usual Unicode junk: "é"
becomes "Ã©", "ê" becomes "Ãª" and the like.
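To make the symptom concrete, here is a minimal sketch (not part of the original report; the sample string and the hex_bytes helper are made up for illustration) of what an unconditional utf8::encode() does to data that already consists of Latin-1 octets:

```perl
use strict;
use warnings;

# Illustrative helper (not from libnet): render a string as hex bytes.
sub hex_bytes {
    my ($s) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $s;
}

# "\xE9" is the single octet for "é" in ISO-8859-1.
my $latin1 = "caf\xE9";
my $line   = $latin1;

# The call that Net::Cmd 2.27 makes on everything given to datasend().
utf8::encode($line);

print "before: ", hex_bytes($latin1), "\n";
print "after:  ", hex_bytes($line),   "\n";
```

The single octet E9 comes out as the pair C3 A9, which a Latin-1 mail reader then displays as "Ã©".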

Therefore, perfectly valid programs that were sending correct mails
will send rubbish as soon as libnet is upgraded to version 1.20.

As suggested in ticket#18589, there may be a need to pass an additional
parameter to libnet modules in order to indicate the encoding
(although I'm not sure if one can assume that encoding stays the
same through an entire mail or if it can change from one part to
another).

In the meantime, I'd suggest removing the call to utf8::encode()
from Net::Cmd:

--- lib/Net/Cmd.pm.orig 2006-10-27 13:08:07.000000000 +0200
+++ lib/Net/Cmd.pm 2007-02-07 19:27:54.328532000 +0100
@@ -21,8 +21,6 @@
}
}

-my $doUTF8 = eval { require utf8 };
-
$VERSION = "2.27";
@ISA = qw(Exporter);
@EXPORT = qw(CMD_INFO CMD_OK CMD_MORE CMD_REJECT CMD_ERROR CMD_PENDING);
@@ -395,8 +393,6 @@
my $arr = @_ == 1 && ref($_[0]) ? $_[0] : \@_;
my $line = join("" ,@$arr);

- utf8::encode($line) if $doUTF8;
-
return 0 unless defined(fileno($cmd));

my $last_ch = ${*$cmd}{'net_cmd_last_ch'};



Best Regards,

--
Close the world, txEn eht nepO.
#   Thu Feb 08 16:10:27 2007 ben[...]cpanel.net - Correspondence added  
From: ben[...]cpanel.net
[text/plain 236b]
Additionally utf8::encode is not available with Perl 5.6.2, though the utf8 require will succeed.

[root[...]localhost root]# perl -Mutf8 -le 'print $]; utf8::encode("hello");'
5.006002
Undefined subroutine utf8::encode called at -e line 1
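
One way to guard against this, sketched here as an assumption rather than as what libnet actually does, is to test for the function itself instead of relying on "require utf8" succeeding:

```perl
use strict;
use warnings;

# Hypothetical replacement for "eval { require utf8 }": check that the
# utf8::encode function exists, not merely that utf8.pm can be loaded.
my $doUTF8 = defined &utf8::encode ? 1 : 0;

my $line = "hello";

# Only reached when the function is really there, so this cannot die
# with "Undefined subroutine utf8::encode".
utf8::encode($line) if $doUTF8;

print "utf8::encode available: ", ($doUTF8 ? "yes" : "no"), "\n";
```

On a Perl where the function is missing, $doUTF8 would be 0 and the call would be skipped.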

#   Thu Feb 08 16:10:29 2007 RT_System - Status changed from 'new' to 'open'  
#   Wed Mar 14 05:45:47 2007 RGARCIA - Correspondence added  
From: RGARCIA[...]cpan.org
[text/plain 523b]
On Wed Feb 07 13:30:48 2007, SAPER wrote:
> As suggested in ticket#18589, there may be a need to pass an additional
> parameter to libnet modules in order to indicate the encoding
> (although I'm not sure if one can assume that encoding stays the
> same through an entire mail or if it can change from one part to
> another).

In this case you could probably encode it upstream.

> In the meantime, I'd suggest removing the call to utf8::encode()
> from Net::Cmd:

I've applied this patch to bleadperl as change #30576.
#   Fri May 18 17:38:43 2007 Okko - Correspondence added  
From: oskari.ojala[...]frantic.com
[text/plain 1.2k]
Hello,

Confirming the bug, it is affecting us. I also e-mailed Graham about this.

On Wed Feb 07 13:30:48 2007, SAPER wrote:

> As suggested in ticket#18589, there may be a need to pass an additional
> parameter to libnet modules in order to indicate the encoding
> (although I'm not sure if one can assume that encoding stays the
> same through an entire mail or if it can change from one part to
> another).

This is not a good idea: a multi-part MIME message can have each part in
a different character set. People shouldn't be passing character strings
to Net::Cmd anyway; they should be passing octets (variables with the
internal utf8 flag off).

And as there is the internal perl flag for utf8ness I don't see any
reason to pass on encoding as a parameter.

Net::Cmd could just die if the flag is on, if you want to be strict.
If you don't then the line

utf8::encode($line) if $doUTF8;

should/could be replaced with:

if ($doUTF8) {
    # encode to individual utf8 bytes if
    # $line is a string (in internal UTF-8)
    utf8::encode($line) if utf8::is_utf8($line);
}

to fix the bug with Latin-1 and to do what people probably expect if
they feed UTF-8 lines to it.
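
A minimal sketch of how that conditional behaves, assuming a stand-alone helper (encode_if_flagged is a made-up name, not a Net::Cmd function) and invented sample strings:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the proposed change: encode only when
# Perl's internal utf8 flag is set on the string.
sub encode_if_flagged {
    my ($line) = @_;
    utf8::encode($line) if utf8::is_utf8($line);
    return $line;
}

my $octets = "caf\xE9";             # Latin-1 octets, utf8 flag off
my $chars  = "caf\x{E9} \x{20AC}";  # contains U+20AC, so the flag is on

my $sent_octets = encode_if_flagged($octets);
my $sent_chars  = encode_if_flagged($chars);

printf "octets: %d bytes in, %d bytes out\n",
    length($octets), length($sent_octets);
printf "chars:  %d characters in, %d bytes out\n",
    length($chars), length($sent_chars);
```

The octet string passes through untouched, while the flagged string is turned into UTF-8 bytes. Note that the flag describes internal storage only: a flagged string that holds nothing but Latin-1 characters would still be encoded as UTF-8 by this test.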



For reference:
http://perldoc.perl.org/utf8.html
http://www.perlmonks.org/?displaytype=print;node_id=551676

#   Wed May 30 13:43:41 2007 SAPER - Correspondence added  
From: SAPER[...]cpan.org
[text/plain 1k]
Hello,

Attached is a script that illustrates the bug.
Use it like this:

$ perl cpan-rt-24835.pl <smtp> <address>

where <smtp> is the name of your SMTP server and <address> your email
address.

When line 25 is commented out, the characters stay as they are and
are correctly transmitted as ISO-Latin-1. Uncomment it and the
characters (which, if I understand correctly, have already been
internally upgraded to utf8 because of the string coming from the
XML document) are now sent as raw bytes with libnet versions 1.20 and
1.21, while they were correctly sent (probably after a smart/magic
downgrade) as Latin-1 with previous versions.

A solution is to use Encode::encode() to manually downgrade the
string coming from XML to Latin-1. But the fact remains that
perfectly valid code which was working until libnet-1.20 came out
must now do additional work to send correct data. IMHO, this is
a bug, or at the very least an incompatible change that
should be documented.
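
The workaround can be sketched roughly as follows. This is not the attached script: the sample text is invented, and utf8::upgrade() merely stands in for a string that arrives from an XML parser with the utf8 flag already set.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for text returned by an XML parser: Latin-1 range
# characters stored internally as UTF-8 (utf8 flag on).
my $text = "\x{E9}t\x{E9}";
utf8::upgrade($text);

# Explicitly produce ISO-8859-1 octets before handing the data on
# (for example to datasend()).
my $octets = encode('ISO-8859-1', $text);

# For comparison: what a blanket utf8::encode() yields instead.
my $utf8_bytes = $text;
utf8::encode($utf8_bytes);

printf "Latin-1 octets: %d bytes\n", length($octets);
printf "UTF-8 octets:   %d bytes\n", length($utf8_bytes);
```

Encoding explicitly keeps the choice of charset with the caller, which is also what allows each MIME part to use a different one.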


Best Regards

--
Close the world, txEn eht nepO.

[application/octet-stream 1.1k]
Message body not shown because it is too large or is not plain text.
#   Wed May 30 14:53:47 2007 gbarr[...]pobox.com - Correspondence added  
Subject: Re: [rt.cpan.org #24835] Net::Cmd 2.27 (libnet 1.20) incorrectly upgrades everything to UTF-8
Date: Wed, 30 May 2007 13:26:27 -0500 (CDT)
To: bug-libnet[...]rt.cpan.org
From: "Graham Barr" <gbarr[...]pobox.com>
[text/plain 32b]
Fixed in libnet 1.21

Graham.



#   Tue Sep 11 08:26:06 2007 SAPER - Correspondence added  
From: SAPER[...]cpan.org
[text/plain 267b]
Hello Graham,

I just tested libnet 1.22 and this version seems to work
correctly (i.e. does not mangle ISO-8859-1 characters).

I'm installing it on a production server in order to check that
the mails are correctly created.

--
Close the world, txEn eht nepO.
#   Sat Feb 09 09:48:49 2008 GBARR - Status changed from 'open' to 'resolved'