This queue is for tickets about the DBD-mysql CPAN distribution.

Report information
The Basics

Nobody in particular
dragon31337 [...]
pali [...]

Subject: UTF8 strings have no utf8 flag set
I can't tell for certain whether this is a duplicate; it seems it is not. The other reports have not been touched for years.
Unicode strings have no utf8 flag set. The strings are encoded in UTF-8, but the utf8 flag is missing.
I've created a Unicode database and a table in it; see dbinit.sql for more detail.
A repro script is attached.

All strings are valid UTF-8 strings, but without the utf8 flag. These strings are output incorrectly unless I set the flag myself.

Note: the utf8 layers are set on the output handles, etc.

perl version:  v5.10.1, build 1006 [291086]
DBD-mysql: 4.011
DBI: 1.609
OS: Windows 7 32bit [Version 6.1.7600]
MySQL: 5.1.37 win32 on localhost

Subject: dbinit.sql

Message body not shown because it is not plain text.

use Data::Dumper;
use strict;
use DBI;
use utf8;
use Encode;

binmode STDOUT, ":utf8";
binmode STDIN, ":utf8";

my ($host, $port, $database, $user, $password, $rise) = ('localhost','3306','flights','root','itsme');
my $dsn = "DBI:mysql:host=$host;port=$port;".
    "database=$database;".
    "mysql_compression=1;".
    "mysql_client_found_rows=1;".
    "mysql_auto_reconnect=1;".
    "mysql_enable_utf8=1;";
my $dbh = DBI->connect( $dsn, $user, $password, { RaiseError => 1, } );
$dbh->do("SET character_set_client = utf8;");
$dbh->do("SET character_set_connection = utf8;");
$dbh->do("SET character_set_results = utf8;");
my $query = $dbh->prepare("SELECT * FROM countries");
$query->execute();
my $data = $query->fetchall_arrayref();
foreach my $str (@$data) {
    foreach my $val (@$str) {
        print ("Value: $val, it is ".(utf8::is_utf8($val)?"":"non-")."utf8 string, and ".(utf8::valid($str)?"":"not")." valid<br>\n");
        Encode::_utf8_on($val);
        print ("Value: $val, it is ".(utf8::is_utf8($val)?"":"non-")."utf8 string, and ".(utf8::valid($str)?"":"not")." valid<br>\n");
    }
}
$query->finish();
$dbh->disconnect();
return $data;
I have a different kind of trouble here. What do you think about the patch I supply here? I was trying to find out whether that is your trouble too, but it looks like it is not.
Subject: repro-utf8-2011-01-27.diff
--- 2011-01-27 17:31:29.000000000 +0300
+++ 2011-01-27 17:33:19.000000000 +0300
@@ -1,9 +1,7 @@
 use Data::Dumper;
 use strict;
 use DBI;
-use utf8;
 use Encode;
-binmode STDOUT, ":utf8";
 binmode STDIN, ":utf8";
@@ -33,11 +31,10 @@
     foreach my $val (@$str) {
         print ("Value: $val, it is ".(utf8::is_utf8($val)?"":"non-")."utf8 string, and ".(utf8::valid($str)?"":"not")." valid<br>\n");
-        Encode::_utf8_on($val);
+        utf8::upgrade( $val );
         print ("Value: $val, it is ".(utf8::is_utf8($val)?"":"non-")."utf8 string, and ".(utf8::valid($str)?"":"not")." valid<br>\n");
     }
 }
 $query->finish();
 $dbh->disconnect();
-return $data;
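For anyone comparing the two calls swapped in this patch: they do different things to a byte string. The following is only an illustrative sketch, not part of either attachment. It assumes the driver hands back the raw UTF-8 bytes C3 A9 for é, and compares Encode::_utf8_on(), utf8::upgrade() and utf8::decode() on that value. The variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode ();

# Assumption: this is what a fetch without decoding returns for "é",
# i.e. the two raw UTF-8 bytes and no utf8 flag.
my $bytes = "\xC3\xA9";

# Encode::_utf8_on only flips the internal flag, with no validation.
my $flagged = $bytes;
Encode::_utf8_on($flagged);

# utf8::upgrade changes Perl's internal storage but not the characters:
# the string still holds U+00C3 followed by U+00A9.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

# utf8::decode interprets the bytes as UTF-8 and yields one character.
my $decoded = $bytes;
utf8::decode($decoded);

for my $pair ([flagged => $flagged], [upgraded => $upgraded], [decoded => $decoded]) {
    my ($name, $s) = @$pair;
    printf "%-8s flag=%d length=%d chars=%s\n",
        $name,
        utf8::is_utf8($s) ? 1 : 0,
        length($s),
        join(' ', map { sprintf('U+%04X', ord($_)) } split(//, $s));
}
```

All three results carry the utf8 flag, but only the flagged and decoded strings contain the single character U+00E9. The upgraded string still has two characters, so checking utf8::is_utf8() alone does not show whether a value was decoded.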
Subject: MySQL driver does not handle UTF strings properly
Yes, this is a bug. I mistakenly reported it against MSSQL before. The driver does not read back the same values it writes. Here is a test case:

use DBI;
use Data::Dumper;
use strict;
use utf8;
binmode STDOUT, ":utf8";
my $h = DBI->connect('dbi:mysql:database=xxx;host=server99', 'xxx', 'xxxxx') or die("Cannot connect to MySQL database: ", $DBI::errstr);
$h->do('SET NAMES utf8');
eval { $h->do(q/drop table mje/); };
$h->do(q/create table mje (a nvarchar(20))/);
my $unicode = "\x{e9} é \x{20ac}";
print $unicode, ', ', utf8::is_utf8($unicode), ', ', Dumper($unicode);
$h->do(q/insert into mje values(?)/, undef, $unicode);
my $s = $h->prepare(q/select * from mje/);
$s->execute;
my $f = $s->fetchall_arrayref;
my $x = $f->[0]->[0];
# utf8::decode($x);
print $x, ', ', utf8::is_utf8($x), ', ', (map { sprintf('%02X ', ord($_)) } split (//, $x)), ', ', Dumper($f), "\n";
exit;

-----------------------
This is the output:

├⌐ ├⌐ Γé¼, 1, $VAR1 = "\x{e9} \x{e9} \x{20ac}";
├â┬⌐ ├â┬⌐ ├ó┬é┬¼, , C3 A9 20 C3 A9 20 E2 82 AC , $VAR1 = [ [ '├â┬⌐ ├â┬⌐ ├ó┬é┬¼' ] ];

------------------------
You must forgive the line-drawing characters: this was run with ActiveState Perl on Windows, which cannot print even hardcoded Unicode to the console, and neither can Strawberry Perl. Only Cygwin Perl seems to display Unicode properly (but I cannot get DBD::mysql to compile with Cygwin).

The important points to notice are the '1' value, which is the result of utf8::is_utf8(), and the hexadecimal representation of é, which is E9 in Unicode and C3 A9 in UTF-8. The first Dumper output is correct. On the second line, notice that utf8::is_utf8() does not return '1', and that the hex dump shows the string broken down into a stream of UTF-8 bytes. It was not decoded properly. If I use utf8::decode() on the value returned from the database, then it works OK.
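The hex dumps described in this report can be reproduced without a database. The snippet below is a minimal sketch under that assumption: the string literal stands in for the fetched value, and hex_chars() and decode_fetched() are hypothetical helper names, not part of DBI or DBD::mysql. It shows the workaround mentioned above (calling utf8::decode() on what the driver returns), not a fix in the driver.

```perl
use strict;
use warnings;

# Hypothetical helper: hex code point of every character, as in the report.
sub hex_chars {
    my ($s) = @_;
    return join(' ', map { sprintf('%02X', ord($_)) } split(//, $s));
}

# Hypothetical helper: decode a fetched value only if it is still a
# byte string. utf8::decode leaves the value unchanged when the bytes
# are not valid UTF-8.
sub decode_fetched {
    my ($value) = @_;
    return $value if !defined $value || utf8::is_utf8($value);
    utf8::decode($value);
    return $value;
}

# Stand-in for the fetched column: the bytes shown by hex(a) in MySQL.
my $fetched = "\xC3\xA9 \xC3\xA9 \xE2\x82\xAC";

print hex_chars($fetched), "\n";                  # C3 A9 20 C3 A9 20 E2 82 AC
print hex_chars(decode_fetched($fetched)), "\n";  # E9 20 E9 20 20AC
```

The flag check in decode_fetched() is a heuristic so that a value the driver has already decoded is not decoded twice. It cannot recognize a byte string that was merely upgraded, so treat it as a stopgap.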
Without the UTF-8 decoding, Perl assumes that the bytes C3 A9, which represent é and should be combined into the single Unicode character E9, are two separate characters, C3 and A9. That is not what was expected.

Here is additional information:
--------------------------
(Terminal is set to the UTF-8 character set.)
mysql> select a,hex(a) from mje;
+-----------+--------------------+
| a         | hex(a)             |
+-----------+--------------------+
| é é €     | C3A920C3A920E282AC |
+-----------+--------------------+
mysql> status
--------------
mysql Ver 14.12 Distrib 5.0.60, for pc-linux-gnu (i686) using readline 5.2

Server version:       5.0.60-log Gentoo Linux mysql-5.0.60-r1
Protocol version:     10
Connection:           Localhost via UNIX socket
Server characterset:  utf8
Db characterset:      utf8
Client characterset:  utf8
Conn. characterset:   utf8

List of Unicode characters and their UTF8 hex values
Additional information: C:\Users\xxxxxxx\Documents\xxxx-serverscripts>perl -MDBI -e "DBI-
Perl       : 5.010001 (MSWin32-x64-multi-thread)
OS         : MSWin32 (5.2)
DBI        : 1.615
DBD::mysql : 4.018

Similar but misfiled bug:
A fix for UTF-8 support in DBD::mysql is in my pull request. I would appreciate it if more people affected by UTF-8 bugs in DBD::mysql could test my changes.
Reopening, fix was reverted in 4.043.
