This queue is for tickets about the DBD-SQLite CPAN distribution.

Report information
The Basics
Id: 25371
Status: resolved
Priority: 0/
Queue: DBD-SQLite

People
Owner: Nobody in particular
Requestors: juerd [...] convolution.nl
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Asymmetric UTF-8 support causes malformed data
Date: Sun, 11 Mar 2007 13:31:24 +0100
To: bug-DBD-SQLite [...] rt.cpan.org
From: Juerd Waalboer <juerd [...] convolution.nl>
DBD::SQLite has Unicode support, in the sense that if $dbh->{unicode} is true, it will set the UTF8 flag on (almost) all data coming from the database. This feature is incredibly useful, but only if everything in your database really is a UTF-8 encoded string.

However, DBD::SQLite does not ensure that data going into the database is encoded as UTF-8. Because of Perl's Unicode model, the internal encoding of a string is either UTF-8 or ISO-8859-1. This is supposed to be transparent to the user. DBD::SQLite stores whatever the internal encoding happened to be. When this data is later pulled from the database, it gets the UTF8 flag enabled, even though it might have been stored as ISO-8859-1. The result is programs crashing on malformed UTF-8 characters.

A workaround for users of the current DBD::SQLite is to manually apply Encode::encode_utf8 or utf8::upgrade to every string sent to the database.

A more permanent fix could be implemented by DBD::SQLite's author, by upgrading all strings internally, ensuring that their encoding is indeed UTF-8. This can be done in a mutating or a copying way, and no encoding takes place if the UTF8 flag was already on. The functions are sv_utf8_upgrade and bytes_to_utf8 respectively. See also L<perlguts/"How do I convert a string to UTF-8?"> in current bleadperl.

The query itself, and all placeholder values, should get this treatment. Since UTF-8 is ASCII-compatible, this has no effect on the SQL syntax.
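
The mismatch described above can be reproduced without a database. The following is a minimal sketch, assuming the driver hands Perl's internal buffer to SQLite unchanged, as the report states. It compares the bytes that would be stored for "\xe9" with and without utf8::upgrade. The helper name bytes_are_valid_utf8 is hypothetical and not part of DBD::SQLite.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the given byte string is well-formed
# UTF-8, else 0. utf8::decode works in place, so it operates on a copy.
sub bytes_are_valid_utf8 {
    my ($bytes) = @_;
    return utf8::decode($bytes) ? 1 : 0;
}

# "\xe9" (e-acute) has the UTF8 flag off; its internal buffer is the
# single byte 0xE9.
my $latin1_buffer = "\xe9";

# After upgrading, the internal buffer holds the UTF-8 bytes 0xC3 0xA9.
# utf8::encode is used here only to expose those bytes for inspection.
my $utf8_buffer = "\xe9";
utf8::upgrade($utf8_buffer);
utf8::encode($utf8_buffer);

printf "not upgraded: valid UTF-8? %d\n", bytes_are_valid_utf8($latin1_buffer);
printf "upgraded:     valid UTF-8? %d\n", bytes_are_valid_utf8($utf8_buffer);
```

The lone 0xE9 byte is not well-formed UTF-8, which is why setting the UTF8 flag on it when it is read back produces a malformed string.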
On Sun Mar 11 08:34:35 2007, juerd@convolution.nl wrote:
> A workaround for users of the current DBD::SQLite is to manually
> Encode::encode_utf8 utf8::upgrade every string sent to the database.
While encode_utf8 works now, it would break as soon as DBD::SQLite is fixed. So utf8::upgrade all data going to the database.
-- Juerd
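
The utf8::upgrade workaround could look roughly like this. It is a sketch only: the helper name upgraded is hypothetical, and the database call is left as a comment because it needs a connected $dbh, which is not created here.

```perl
use strict;
use warnings;

# Hypothetical helper: return upgraded copies of the bind values so the
# caller's own variables are left untouched. undef (SQL NULL) is skipped.
sub upgraded {
    my @copies = @_;
    for (@copies) {
        utf8::upgrade($_) if defined;
    }
    return @copies;
}

my @bind = upgraded("A", "\xe9", "\x{20ac}");

# With a connected handle (assumed, not created in this sketch):
# $dbh->do("INSERT INTO foo VALUES (?)", undef, $_) for @bind;
```

Because utf8::upgrade changes only the internal representation and not the characters, the upgraded copies still compare equal to the originals, so this should keep working if the driver later starts upgrading strings itself.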
From: spamcollector_cpan [...] juerd.nl
Any news? I think it is important to fix this bug, because malformed UTF-8 data can crash programs, even at a distance.
-- Juerd
Are you able to put together a test script for us that demonstrates this bug? This would make it easier to replicate.
On Sun Apr 05 15:47:05 2009, ADAMK wrote:
> Are you able to put together a test script for us that demonstrates this
> bug? This would make it easier to replicate.

juerd@feather:~$ cat foo.t
use strict;
use warnings;
use Test::More;
use File::Temp qw(tempfile);
use DBI;

my @strings = ("\0", "A", "\xe9", "\x{20ac}");

plan tests => scalar @strings;

my ($fh, $fn) = tempfile;

my $dbh = DBI->connect("dbi:SQLite:$fn");
$dbh->{unicode} = 1;
$dbh->do("CREATE TABLE foo (foo)");

for (@strings) {
    $dbh->do("INSERT INTO foo VALUES (?)", undef, $_);
    my $foo = $dbh->selectall_arrayref("SELECT foo FROM foo");
    is $foo->[0][0], $_;
    $dbh->do("DELETE FROM foo");
}

juerd@feather:~$ perl foo.t
1..4
ok 1
ok 2
not ok 3
#   Failed test at foo.t line 20.
Wide character in print at /usr/share/perl5/Test/Builder.pm line 1351.
#          got: '�'
#     expected: 'é'
ok 4
# Looks like you failed 1 test of 4.
-- Juerd
> use File::Temp qw(tempfile);

Unlinking the temporary files is left as an exercise :)
-- Juerd
I've converted this to use our test shortcuts and committed it as t/rt_25371_asymmetric_unicode.t. It still fails, but now at least it's officially failing.
Resolved in 1.22

