This queue is for tickets about the Text-CSV_XS CPAN distribution.

Report information

The Basics
Id: 120655
Status: resolved
Priority: Low/Low
Queue:

People
Owner: Nobody in particular
Requestors: felix.ostmann [...] gmail.com
Cc:
AdminCc:

BugTracker
Severity: Critical
Broken in: 0.91
Fixed in: 1.28



Subject: bind_columns with strange behavior / length() from old value
This is perl 5, version 18, subversion 1 (v5.18.1) built for x86_64-linux

----

Hello,

here is a small script which produces strange behavior when using length(). The problem only kicks in when using:

* bind_columns
* an empty field
* a unicode character in the previous row for the empty field

----
SCRIPT:
----

#!/usr/bin/env perl

$|++;

use strict;

use File::Temp qw(tempfile);
use Text::CSV_XS;

my $temp_fh = IO::File->new_tmpfile;
$temp_fh->print(<<CSV);
field1,field2
pröblem,ignore
,ignore
CSV
$temp_fh->seek(0, 0);
$temp_fh->binmode(':utf8');

my $order_csv = Text::CSV_XS->new();

my $row;
{
    my $row_header = $order_csv->getline($temp_fh);
    $order_csv->bind_columns(\@{$row}{@$row_header});
}

while ($order_csv->getline($temp_fh)) {
    printf(
        "STRING: >%s< ; LENGTH: %d ; HOTFIX LENGTH: %d\n",
        $row->{field1},
        length($row->{field1}),
        length("".$row->{field1}),
    );
}

----
OUTPUT:
----

STRING: >pröblem< ; LENGTH: 7 ; HOTFIX LENGTH: 7
STRING: >< ; LENGTH: 7 ; HOTFIX LENGTH: 0
Subject: Re: [rt.cpan.org #120655] bind_columns with strange behavior / length() from old value
Date: Sun, 19 Mar 2017 19:30:07 +0100
To: bug-Text-CSV_XS@rt.cpan.org
From: "H.Merijn Brand" <h.m.brand@xs4all.nl>
On Sun, 19 Mar 2017 09:55:26 -0400, "Felix Antonius Wilhelm Ostmann via RT" <bug-Text-CSV_XS@rt.cpan.org> wrote:
> here is a small script, which produce a strange behavior using
> length (). The problem only kicks in when using:
> * bind_columns
> * a empty field
> * a unicode character in the previous row for the empty field

Your problem most likely lies in the strongly discouraged use of ":utf8".

See:

--8<---
use strict;
use warnings;

use File::Temp qw(tempfile);
use Text::CSV_XS;

my $temp_fh = IO::File->new_tmpfile;
$temp_fh->print (<<"CSV");
field1,field2
pröblem,ignore
,ignore
CSV
$temp_fh->seek (0, 0);
$temp_fh->binmode (":utf8");

my $order_csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

my $row;
{   my $row_header = $order_csv->getline ($temp_fh);
    $order_csv->bind_columns (\@{$row}{@$row_header});
    }
while ($order_csv->getline ($temp_fh)) {
    printf "STRING: >%s< ; LENGTH: %d ; HOTFIX LENGTH: %d\n",
        $row->{field1},
        length $row->{field1},
        length "".$row->{field1};
    }
-->8---

=>

utf8 "\xF6" does not map to Unicode at xx.pl line 25, <_GEN_0> line 2.
Wide character in printf at xx.pl line 29, <_GEN_0> line 2.
STRING: >pr�blem< ; LENGTH: 4 ; HOTFIX LENGTH: 4
STRING: >< ; LENGTH: 4 ; HOTFIX LENGTH: 0

but with the correct use of encoding:

$temp_fh->binmode (":encoding(utf-8)");

=>

utf8 "\xF6" does not map to Unicode at xx.pl line 22.
STRING: >pr\xF6blem< ; LENGTH: 10 ; HOTFIX LENGTH: 10
STRING: >< ; LENGTH: 0 ; HOTFIX LENGTH: 0

I'd suggest you stop using ":utf8" right away and start using the safe way with ":encoding(utf-8)".

Anyway, this doesn't look like a CSV_XS problem.

--
H.Merijn Brand  http://tux.nl  Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.25  porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/  http://www.test-smoke.org/
http://qa.perl.org  http://www.goldmark.org/jeff/stupid-disclaimers/
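
The strict-layer behavior shown in the reply above can be reproduced in isolation with core Perl only. This is a minimal sketch, not part of the original exchange: it assumes the test file contained the Latin-1 byte 0xF6 for "ö" (which is what the warning text suggests), and the variable names are made up for illustration. An in-memory filehandle stands in for the temp file.

```perl
use strict;
use warnings;

# Assumption: "pröblem" saved as Latin-1, so "ö" is the single byte 0xF6,
# which is not a valid UTF-8 sequence on its own.
my $latin1_bytes = "pr\xF6blem\n";

# Read the bytes back through the strict decoding layer.
# A 'does not map to Unicode' warning on STDERR is expected here.
open my $fh, "<:encoding(UTF-8)", \$latin1_bytes
    or die "Cannot open in-memory handle: $!";
my $line = <$fh>;
close $fh;
chomp $line;

# The invalid byte is replaced by a visible escape instead of being
# passed through unchecked.
printf "decoded: >%s< ; length: %d\n", $line, length $line;
```

With the strict layer, the malformed byte shows up as the literal text `\xF6`, which is why the length grows to 10 in the reply above. The ":utf8" layer, by contrast, does not validate its input.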


From: felix.ostmann@gmail.com
On Sun 19 Mar 2017, 14:31:42, h.m.brand@xs4all.nl wrote:
Sorry, my script was of course misleading without the charset information for the script.

----

I can use 'binary => 1' from Text::CSV_XS or 'binmode(":encoding(utf-8)")' from IO::File; both result in the error.

If this is not a bug in Text::CSV_XS, please point me to the right place.

----

#!/usr/bin/env perl

use Text::CSV_XS;

my $temp_fh = IO::File->new_tmpfile;
$temp_fh->print(<<CSV);
field1
pr\x{C3}\x{96}blem

CSV
$temp_fh->seek(0, 0);

my $csv = Text::CSV_XS->new({binary => 1});

my $row;
{
    my $row_header = $csv->getline($temp_fh);
    $csv->bind_columns(\@{$row}{@$row_header});
}

while ($csv->getline($temp_fh)) {
    printf(
        "STRING: >%s< ; LENGTH: %d ; HOTFIX LENGTH: %d\n",
        $row->{field1},
        length($row->{field1}),
        length("".$row->{field1}),
    );
}

----

#!/usr/bin/env perl

use Text::CSV_XS;

my $temp_fh = IO::File->new_tmpfile;
$temp_fh->print(<<CSV);
field1
pr\x{C3}\x{96}blem

CSV
$temp_fh->seek(0, 0);
$temp_fh->binmode('encoding(utf8)');

my $csv = Text::CSV_XS->new();

my $row;
{
    my $row_header = $csv->getline($temp_fh);
    $csv->bind_columns(\@{$row}{@$row_header});
}

while ($csv->getline($temp_fh)) {
    printf(
        "STRING: >%s< ; LENGTH: %d ; HOTFIX LENGTH: %d\n",
        $row->{field1},
        length($row->{field1}),
        length("".$row->{field1}),
    );
}
Subject: Re: [rt.cpan.org #120655] bind_columns with strange behavior / length() from old value
Date: Mon, 20 Mar 2017 08:32:05 +0100
To: bug-Text-CSV_XS@rt.cpan.org
From: "H.Merijn Brand" <h.m.brand@xs4all.nl>
On Sun, 19 Mar 2017 18:32:24 -0400, "Felix Antonius Wilhelm Ostmann via RT" <bug-Text-CSV_XS@rt.cpan.org> wrote:
> Sorry, my script was of course misleading without the charset
> information for the script.
>
> ----
>
> I can use 'binary => 1' from Text::CSV_XS or
> 'binmode(":encoding(utf-8)")' from IO::File, both result in the error:
>
> If this is not a bug for Text::CSV_XS, please guide me to the correct
> point.

So, it reduces to this reproducible case:

--8<---
use 5.18.2;
use warnings;
use Text::CSV_XS;

open my $fh, "<:encoding(utf-8)", \"c1\npr\x{c3}\x{b6}blem\n\n";
binmode STDOUT, ":encoding(utf-8)";

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

my %row;
$csv->bind_columns (\@row{@{$csv->getline ($fh)}});
while ($csv->getline ($fh)) {
    printf "STRING: >%s< ; LENGTH: %d ; HOTFIX LENGTH: %d\n",
        $row{c1},
        length $row{c1},
        length "".$row{c1};
    }
-->8---

=>

STRING: >pröblem< ; LENGTH: 7 ; HOTFIX LENGTH: 7
STRING: >< ; LENGTH: 7 ; HOTFIX LENGTH: 0

I'll try to find the cause.


I've found a fix, but it looks like there might also be a bug in the core. I've asked the core people for comment.

The fix is pushed, but I want to make new tests for this before I release.

Thanks for pointing me to this mishap and taking the time to stay with me.

Feel free to pull and test the fix.
From: felix.ostmann@gmail.com
On Mon 20 Mar 2017, 04:45:39, HMBRAND wrote:
> I've found a fix, but it looks like there might also be a bug in the
> core. I've asked the core people for comment.
>
> The fix is pushed, but I want to make new tests for this before I
> release.
>
> Thanks for pointing me to this mishap and taking the time to stay with
> me.
>
> Feel free to pull and test the fix
1.28 fixed the bug! Thanks for your help!
> > Feel free to pull and test the fix
> > 1.28 fixed the bug! Thanks for your help!
The release will have to wait, as tests now fail with perl-5.6.1. I will have to dig for the cause.

