This queue is for tickets about the File-Slurp CPAN distribution.

Report information
The Basics
Id: 84918
Status: resolved
Priority: 0/
Queue: File-Slurp

People
Owner: cwhitener [...] gmail.com
Requestors: corion [...] corion.net
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 9999.25



From: Max Maischein <corion [...] corion.net>
To: bug-File-Slurp [...] rt.cpan.org
Date: Mon, 29 Apr 2013 21:50:18 +0200
Subject: read_file() ignores binmode option for short files

Hello,

thanks for writing File::Slurp.

I noticed a bug in File::Slurp which leads to bad data being read. The binmode option is ignored in the code path for short files. Especially when reading and writing text files on Windows using {binmode => ':raw'}, but also when processing UTF-8 files, this is quite bad.

The quick workaround is to simply delete that wrong optimization at the start of read_file().

If you want to keep the code path for short files, you will have to come up with your own way of reimplementing IO layers, or at least detect :raw and likely :utf-8 layers and act on them appropriately. Especially the line to "fix" Windows input does not seem prudent:

    $buf =~ s/\015\012/\n/g if $is_win32 ;

Thanks for looking at this,
-max
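
The contrast between a hand-rolled newline fix and real IO layers can be illustrated with plain PerlIO. The following is only a minimal sketch, not File::Slurp code: the helper name `slurp_with_layers` and the sample data are assumptions made for illustration. It reads the same bytes once through `:raw` and once through `:crlf`, letting `binmode` decide whether CRLF pairs are translated.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of File::Slurp): slurp a whole file and let
# PerlIO apply the requested layers, without any manual newline substitution.
sub slurp_with_layers {
    my ($file_name, $layers) = @_;
    open(my $fh, '<', $file_name) or die "Couldn't open $file_name: $!";
    binmode($fh, $layers) or die "Couldn't apply $layers: $!";
    local $/;    # undefined record separator: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Write two CRLF-terminated lines byte for byte, independent of platform.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh, ':raw');
print {$tmp_fh} "one\015\012two\015\012";
close($tmp_fh) or die "Couldn't close $tmp_name: $!";

my $raw  = slurp_with_layers($tmp_name, ':raw');     # CRLF pairs kept
my $crlf = slurp_with_layers($tmp_name, ':crlf');    # CRLF read as "\n"

printf "raw: %d characters, crlf: %d characters\n", length $raw, length $crlf;
```

With this approach the caller's layer string is the only thing that controls newline handling, so a request for `:raw` is never second-guessed by a platform check.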

> I noticed a bug in File::Slurp which leads to bad data being read. The
> binmode option is ignored in the code path for short files. Especially
> when reading and writing text files on Windows using {binmode =>
> ':raw'}, but also when processing UTF-8 files, this is quite bad.

Reviewing this bug, the problem is not in the short path, but in the long path, which does not cope with

    read_file( 'file.txt', { binmode => ':crlf' });

or

    read_file( 'file.txt', { binmode => ':encoding(Latin-1)' });

on Windows, due to the hand-rolled "fixup" of newlines under the assumption that all binmode arguments need to trigger this.

I've attached a test file for this. The tests fail under Windows currently.

Attachment: newline_binmode.t (text/x-perl, 1.3k)

use strict;
use warnings;
use IO::Handle ();

use File::Basename ();
use File::Spec ();
use lib File::Spec->catdir(File::Spec->rel2abs(File::Basename::dirname(__FILE__)), 'lib');
use FileSlurpTest qw(temp_file_path);

use File::Slurp qw(read_file write_file);
use Test::More;

# Two tests (slurp and spew) for each of the three binmode values.
plan tests => 6;

# Shared with the stdio_* helpers below, which apply the same layer.
my $binmode;

for (':encoding(Latin-1)', ':crlf', ':raw') {
    $binmode = $_;

    my $data = "\n\n\n";
    my $file_name = temp_file_path();

    # Write with plain Perl IO, then compare File::Slurp's read against it.
    stdio_write_file($file_name, $data);
    my $slurped_data = read_file($file_name, { binmode => $binmode });

    my $stdio_slurped_data = stdio_read_file( $file_name ) ;

    print 'data ', unpack( 'H*', $data), "\n",
          'slurp ', unpack('H*', $slurped_data), "\n",
          'stdio slurp ', unpack('H*', $stdio_slurped_data), "\n";

    is($data, $slurped_data, "slurp ($binmode)");

    # Write with File::Slurp, then read back with plain Perl IO.
    write_file($file_name, { binmode => $binmode }, $data );
    $slurped_data = stdio_read_file($file_name);

    is($data, $slurped_data, "spew ($binmode)");

    unlink $file_name;
}

# Reference writer: plain open/binmode/print with the current $binmode.
sub stdio_write_file {
    my ($file_name, $data) = @_;

    open (my $fh, '>', $file_name) || die "Couldn't create $file_name: $!";
    binmode $fh, $binmode;
    $fh->print($data);
}

# Reference reader: plain open/binmode/slurp with the current $binmode.
sub stdio_read_file {
    my ($file_name) = @_;

    open (my $fh, '<', $file_name ) || die "Couldn't open $file_name: $!";
    binmode $fh, $binmode;

    local $/;
    my $data = <$fh>;

    return $data;
}

Hi Corion,

I believe this to be resolved in the current release that refactored read_file().

Thanks,
Chase

From: Max Maischein <corion [...] corion.net>
To: bug-File-Slurp [...] rt.cpan.org
Date: Sun, 3 Feb 2019 07:40:33 +0100
Subject: Re: [rt.cpan.org #84918] read_file() ignores binmode option for short files

Hello Chase,

thank you very much for working on File::Slurp!

Unfortunately, the problem is not fixed (see the test attached to this ticket and to https://github.com/perhunter/slurp/pull/19 ). The new version does not cope properly with {binmode => ':encoding(Latin-1)'}, for example, because it does _not_ apply the :crlf handling in that situation when normal reading would.

-max

On 03.02.2019 at 01:08, Chase Whitener via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=84918 >
>
> Hi Corion,
>
> I believe this to be resolved in the current release that refactored read_file().
>
> Thanks,
> Chase
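
The point about `:encoding(Latin-1)` still needing `:crlf` handling can be reproduced on any platform by stacking the layers explicitly. This is a rough sketch under stated assumptions, not File::Slurp's implementation: the helper name `read_through` and the sample bytes are hypothetical, and the explicit `:crlf:encoding(Latin-1)` stack only imitates what a Windows text handle would have after `binmode $fh, ':encoding(Latin-1)'`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not File::Slurp code): read a whole file through an
# explicit PerlIO layer stack, starting from a clean ':raw' handle.
sub read_through {
    my ($file_name, $layers) = @_;
    open(my $fh, '<', $file_name) or die "Couldn't open $file_name: $!";
    binmode($fh, ":raw$layers") or die "Couldn't apply $layers: $!";
    local $/;    # slurp mode
    my $data = <$fh>;
    close($fh);
    return $data;
}

# "caf" plus e-acute as a single Latin-1 byte, then a CRLF line ending.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh, ':raw');
print {$tmp_fh} "caf\xE9\015\012";
close($tmp_fh) or die "Couldn't close $tmp_name: $!";

# Decoding alone leaves the CRLF pair in the data.
my $decoded_only = read_through($tmp_name, ':encoding(Latin-1)');

# Decoding on top of :crlf, as on a Windows text handle, yields "\n".
my $decoded_text = read_through($tmp_name, ':crlf:encoding(Latin-1)');

printf "without :crlf: %d characters, with :crlf: %d characters\n",
    length $decoded_only, length $decoded_text;
```

The two reads differ only in whether `:crlf` is part of the stack, which is the distinction a slurp routine has to preserve if it wants to match what ordinary reading does.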