
This queue is for tickets about the libwww-perl CPAN distribution.

Report information
The Basics
Id: 42396
Status: resolved
Priority: 0/
Queue: libwww-perl

People
Owner: Nobody in particular
Requestors: m-uchino [...] yetipapa.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)

Attachments
0001-Make-format_request-ensure-that-it-returns-bytes.patch



Subject: Posted binary-data is broken
Date: Wed, 14 Jan 2009 17:45:46 +0900
To: <bug-libwww-perl [...] rt.cpan.org>
From: "uchino" <m-uchino [...] yetipapa.com>
(Sorry, my English is poor.)

Binary data posted by LWP::UserAgent over SSL is corrupted.
The problem occurs when SSL is used and the UTF-8 flag is set on an added header.

[EXAMPLE]
----------------------------------------------------------------------
#!/usr/local/bin/perl

use strict;
use utf8; # ******** (1)

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

my $ua = LWP::UserAgent->new();
my $http_res = $ua->request(POST 'https://myhost/post.cgi', # ******** (2)
    Content_Type => 'form-data',
    Content => [
        bin_data => ['./image.gif'],
    ],
    Head1 => 'A', # ******** (3)
);
$http_res->is_success or die $http_res->message;

print "OK\n";
exit;
----------------------------------------------------------------------

[post.cgi CHECK SCRIPT]
----------------------------------------------------------------------
#!/usr/local/bin/perl

use strict;

my $buffer;
binmode STDIN, ':raw';
read(STDIN, $buffer, $ENV{CONTENT_LENGTH});

open my $fh, '>:raw', './request_body.dat' or die($!);
print $fh $buffer;
close $fh;

print "Content-type: text/plain\n\nOK";
exit;
----------------------------------------------------------------------

The header name is 'Head1' (3).
The UTF-8 flag is set on this string by 'use utf8' (1).
The request is posted to an SSL site (2).

Inspect the file (request_body.dat) with a binary editor: the data and the boundary are corrupted.

'Head1' consists of ASCII characters only, but it is unquoted and its UTF-8 flag is on.
If it is quoted, the problem does not occur.

    'Head1' => 'A', # ******** (3)

Is this a problem in Crypt::SSLeay?

Perl v5.8.8
libwww v5.823
Crypt::SSLeay v0.57
Thanks for your report.

Have you verified that this problem goes away if you don't use 'https' (and Crypt-SSLeay)?

LWP should probably be more careful if UTF8 encoded strings make it into the header values. If you set the content of a request to a non-downgradable UTF8 string it will croak, but it does not guard the headers.

The last mystery here is why perl marks non-quoted strings as UTF8. The following sample code demonstrates:

    use utf8;
    use Devel::Peek;
    Dump([foo => 'bar']);    # 'foo' becomes a UTF8 string
    Dump(['foo' => 'bar']);
From: m-uchino [...] yetipapa.com
Thank you for your reply.

> Have you verified that this problem goes away if you don't use 'https'
> (and Crypt-SSLeay)?

Yes, I tried posting to a non-SSL site (http), and the problem did not occur.

I also renamed the file 'Net/SSL.pm' to 'Net/xxxSSL.pm' and ran the following script.
-----------------------------------
#!/usr/local/bin/perl

use strict;
use utf8;

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

my $ua = LWP::UserAgent->new();
my $http_res = $ua->request(POST 'https://myhost/post.cgi',
    Content_Type => 'form-data',
    Content => [
        bin_data => ['./image.gif'],
    ],
    Head1 => 'A',
);
$http_res->is_success or die $http_res->message;

print $Net::HTTPS::SSL_SOCKET_CLASS . "\n"; # ********* for check SSL module

print "OK\n";
exit;
-----------------------------------
I verified that it printed 'IO::Socket::SSL', and the result was the same.

I understand that I should not include UTF-8 in a request.
One problem that is hard to find is the following.
-----------------------------------
#!/usr/local/bin/perl

use strict;
use utf8;

use HTML::Form;

my $alpha = "\x{3b1}"; # ************ UTF-8
my $html = <<"EOM";
<form method="post" action="./post.cgi" enctype="multipart/form-data">
<input type="text" name="field_1" value="" />
........ $alpha .......
:
:
</form>
EOM

my ($form) = HTML::Form->parse($html, 'https://myhost/') or die 'parse';
my $request = $form->make_request;

print 'UTF-8 Flag: ' . (utf8::is_utf8($request->header('Content_Type')) ? 'ON' : 'OFF') . "\n";

exit;
-----------------------------------
I spent several days before I found that the cause of the problem was UTF-8...
Is the best way to avoid this problem to downgrade all request headers just before a post?
On Wed Jan 14 13:01:18 2009, m-uchino@yetipapa.com wrote:
> Thank you for your reply.
>
> > Have you verified that this problem goes away if you don't use 'https'
> > (and Crypt-SSLeay)?
>
> Yes, I tried posting to a non-SSL site (http), and the problem did not occur.

The data is sent by calling the syswrite() method on the IO::Socket object. Normally this would downgrade the strings to bytes, but apparently this does not happen in the syswrite() implementation of Crypt-SSLeay. I think it's a good idea to make LWP force this before it calls syswrite(). The attached patch should address this.
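
The upgrade and downgrade behaviour described above can be seen without any network code. The following is a minimal sketch using only core Perl; the variable names, the sample body bytes, and the `describe` helper are illustrative assumptions, not taken from LWP. It shows that joining a UTF8-flagged ASCII string with binary data changes the internal representation of the whole string, and that `utf8::downgrade` restores plain bytes.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without changing string semantics

# Hypothetical stand-ins for a flagged header name and a binary request body.
my $header_name = 'Head1';
utf8::upgrade($header_name);         # same characters, UTF8 flag now on
my $body = "\x89GIF\xff\x00";        # contains bytes above 0x7F

# Joining a flagged string with a byte string upgrades the whole result.
my $request = $header_name . ": A\r\n\r\n" . $body;

# Report the flag, the character count, and the internal octet count.
sub describe {
    my ($label, $s) = @_;
    return sprintf "%s: flag=%s chars=%d internal_octets=%d",
        $label, (utf8::is_utf8($s) ? 'on' : 'off'),
        length($s), bytes::length($s);
}

print describe('before downgrade', $request), "\n";

# utf8::downgrade() converts in place; it dies if a character is above 0xFF.
utf8::downgrade($request);
print describe('after downgrade', $request), "\n";
```

Before the downgrade, the internal octet count is larger than the character count because each character above 0x7F is stored as two octets. A write routine that sends the internal buffer as-is would therefore put different bytes on the wire, which would be consistent with the corrupted data and boundary in the report.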

> I understand that I should not include UTF-8 in a request.
> One problem that is hard to find is the following.
> -----------------------------------
> #!/usr/local/bin/perl
>
> use strict;
> use utf8;
>
> use HTML::Form;
>
> my $alpha = "\x{3b1}"; # ************ UTF-8
> my $html = <<"EOM";
> <form method="post" action="./post.cgi" enctype="multipart/form-data">
> <input type="text" name="field_1" value="" />
> ........ $alpha .......
> :
> :
> </form>
> EOM
>
> my ($form) = HTML::Form->parse($html, 'https://myhost/') or die 'parse';
> my $request = $form->make_request;
>
> print 'UTF-8 Flag: ' . (utf8::is_utf8($request->header('Content_Type'))
> ? 'ON' : 'OFF') . "\n";
>
> exit;
> -----------------------------------
> I spent several days before I found that the cause of the problem was UTF-8...
> Is the best way to avoid this problem to downgrade all
> request headers just before a post?

Again, I think that the patch will address the issue in this situation, but we still have issues if the form fields themselves contain wide UTF-8.

From 787516a62fc34caec5950b34f3925950844f34d9 Mon Sep 17 00:00:00 2001
From: Gisle Aas <gisle@aas.no>
Date: Wed, 14 Jan 2009 22:09:45 +0100
Subject: [PATCH] Make format_request() ensure that it returns bytes [RT#42396]

The method now croaks if passed characters that can't be downgraded to bytes.
---
 lib/Net/HTTP/Methods.pm |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/lib/Net/HTTP/Methods.pm b/lib/Net/HTTP/Methods.pm
index 9704c6c..bb5fad7 100644
--- a/lib/Net/HTTP/Methods.pm
+++ b/lib/Net/HTTP/Methods.pm
@@ -173,7 +173,13 @@ sub format_request {
         push(@h2, "Host: $h") if $h;
     }
 
-    return join($CRLF, "$method $uri HTTP/$ver", @h2, @h, "", $content);
+    my $req = join($CRLF, "$method $uri HTTP/$ver", @h2, @h, "", $content);
+    return $req unless defined &utf8::downgrade;
+    unless (utf8::downgrade($req, 1)) {
+        require Carp;
+        Carp::croak("Wide character in HTTP request (bytes required)");
+    }
+    return $req;
 }
 
-- 
1.6.1.28.gc32f76
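
To illustrate what the patched check accepts and rejects, here is a small standalone sketch. `request_bytes` is a hypothetical helper that mirrors the patch's logic outside of Net::HTTP, and the sample strings are assumptions for illustration. Encoding as UTF-8 is only one possible choice; it is appropriate only if the receiving side expects UTF-8.

```perl
use strict;
use warnings;
use Carp ();
use Encode ();

# Hypothetical helper mirroring the check added by the patch:
# return the request as bytes, or croak if that is impossible.
sub request_bytes {
    my ($req) = @_;
    utf8::downgrade($req, 1)
        or Carp::croak("Wide character in HTTP request (bytes required)");
    return $req;
}

# A flagged but ASCII-only header line can be downgraded silently.
my $flagged = 'Content-Type: multipart/form-data';
utf8::upgrade($flagged);
my $ok = request_bytes($flagged);

# A character above 0xFF cannot be downgraded, so the helper croaks.
my $alpha = "\x{3b1}";    # GREEK SMALL LETTER ALPHA
my $error = eval { request_bytes("field_1=$alpha"); 1 } ? '' : $@;

# Encoding the value explicitly yields octets that the helper accepts.
my $encoded = request_bytes('field_1=' . Encode::encode('UTF-8', $alpha));

print 'flag after helper: ', (utf8::is_utf8($ok) ? 'on' : 'off'), "\n";
print "error: $error";
print 'encoded length: ', length($encoded), "\n";
```

The second case corresponds to form fields that contain wide characters: the check turns silent corruption into an explicit error, and the caller still has to pick an encoding.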
From: m-uchino [...] yetipapa.com
I applied the patch and ran the script, and the problem did not occur. The problem is gone! Thank you very much!

> Again, I think that the patch will address the issue in this
> situation, but we still have issues if
> the form fields themselves contain wide UTF-8.

I think programmers are careful with form fields and headers that they set themselves, but not with items that are set automatically (such as 'Content-Type' via HTML::Form). Therefore, your patch will help in many cases.

Thank you for your help, your patch, and your splendid software.

