
This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 40818
Status: resolved
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: daniel.frett [...] ccci.org
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 1.68
Fixed in: (no value)



Subject: getAttributeNS not working on utf-8 encoded xml
getAttributeNS isn't working on UTF-8 encoded documents. Attached is a patch for t/10ns.t that adds tests illustrating the bug. The bug was introduced somewhere between 1.66 and 1.68.

I have been looking over the svn commits between the two versions, but haven't found anything that could cause this yet.

-Daniel
Subject: libxml.patch
Index: 10ns.t
===================================================================
--- 10ns.t (revision 755)
+++ 10ns.t (working copy)
@@ -1,6 +1,6 @@
 # -*- cperl -*-
 use Test;
-BEGIN { plan tests=>124; }
+BEGIN { plan tests=>129; }
 use XML::LibXML;
 use XML::LibXML::Common qw(:libxml);
 
@@ -146,6 +146,17 @@
     ok ( $root->getAttributeNodeNS('http://example.com','attr') );
     ok ( $root->getAttributeNS('http://example.com','attr'), 'value' );
     ok ( $root->getAttributeNode('xxx:attr')->getNamespaceURI(), 'http://example.com');
+
+    #change encoding to UTF-8 and retest
+    $doc->setEncoding('UTF-8');
+    # namespaced attributes
+    $root->setAttribute('xxx:attr', 'value');
+    ok ( $root->getAttributeNode('xxx:attr') );
+    ok ( $root->getAttribute('xxx:attr'), 'value' );
+    print $root->toString(1),"\n";
+    ok ( $root->getAttributeNodeNS('http://example.com','attr') );
+    ok ( $root->getAttributeNS('http://example.com','attr'), 'value' );
+    ok ( $root->getAttributeNode('xxx:attr')->getNamespaceURI(), 'http://example.com');
 }
 
 print "# 8. changing namespace declarations\n";
From: daniel.frett [...] ccci.org
On Mon Nov 10 15:35:57 2008, dfrett wrote:
> getAttributeNS isn't working on UTF-8 encoded documents, attached is a
> patch for t/10ns.t which adds tests that illustrate the bug. This bug
> was introduced somewhere between 1.66 and 1.68.
>
> I have been looking over the svn commits between the 2 versions, but
> haven't found anything that could cause this yet.
>
> -Daniel
I tracked down the issue: in the new version of the PmmFastDecodeString function, the string length wasn't being calculated for UTF-8 strings. Attached is a patch with the updated test and the fixed PmmFastDecodeString function.

-Daniel Frett
Attachment: libxml-utf8-fix.patch
Index: perl-libxml-mm.c
===================================================================
--- perl-libxml-mm.c (revision 755)
+++ perl-libxml-mm.c (working copy)
@@ -1017,7 +1017,8 @@
     }
 
     if ( charset == XML_CHAR_ENCODING_UTF8 ) {
-        return xmlStrdup( string );
+        retval = xmlStrdup( string );
+        *len = xmlStrlen(retval);
     }
     else if ( charset == XML_CHAR_ENCODING_ERROR ){
         coder = xmlFindCharEncodingHandler( (const char *) encoding );
Index: t/10ns.t
===================================================================
--- t/10ns.t (revision 755)
+++ t/10ns.t (working copy)
@@ -1,6 +1,6 @@
 # -*- cperl -*-
 use Test;
-BEGIN { plan tests=>124; }
+BEGIN { plan tests=>129; }
 use XML::LibXML;
 use XML::LibXML::Common qw(:libxml);
 
@@ -146,6 +146,17 @@
     ok ( $root->getAttributeNodeNS('http://example.com','attr') );
     ok ( $root->getAttributeNS('http://example.com','attr'), 'value' );
     ok ( $root->getAttributeNode('xxx:attr')->getNamespaceURI(), 'http://example.com');
+
+    #change encoding to UTF-8 and retest
+    $doc->setEncoding('UTF-8');
+    # namespaced attributes
+    $root->setAttribute('xxx:attr', 'value');
+    ok ( $root->getAttributeNode('xxx:attr') );
+    ok ( $root->getAttribute('xxx:attr'), 'value' );
+    print $root->toString(1),"\n";
+    ok ( $root->getAttributeNodeNS('http://example.com','attr') );
+    ok ( $root->getAttributeNS('http://example.com','attr'), 'value' );
+    ok ( $root->getAttributeNode('xxx:attr')->getNamespaceURI(), 'http://example.com');
 }
 
 print "# 8. changing namespace declarations\n";
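
The shape of the perl-libxml-mm.c change can be sketched in plain C without libxml2. This is only an illustration under stated assumptions, not the XML::LibXML source: decode_buggy, decode_fixed, and dup_string are hypothetical names, dup_string and strlen stand in for xmlStrdup and xmlStrlen, and the non-UTF-8 conversion is reduced to a plain copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper standing in for xmlStrdup: returns a malloc'd copy. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Buggy shape: the UTF-8 fast path returns early and never writes *len. */
char *decode_buggy(const char *string, int is_utf8, size_t *len)
{
    char *retval;
    assert(string != NULL && len != NULL);
    if (is_utf8)
        return dup_string(string);      /* *len keeps the caller's old value */
    retval = dup_string(string);        /* placeholder for a real conversion */
    if (retval != NULL)
        *len = strlen(retval);
    return retval;
}

/* Fixed shape, mirroring the patch: every path reports the length. */
char *decode_fixed(const char *string, int is_utf8, size_t *len)
{
    char *retval;
    assert(string != NULL && len != NULL);
    if (is_utf8)
        retval = dup_string(string);    /* no early return */
    else
        retval = dup_string(string);    /* placeholder for a real conversion */
    if (retval != NULL)
        *len = strlen(retval);
    return retval;
}
```

For a caller that initializes len to 0 and passes is_utf8 = 1, decode_buggy("value", 1, &len) returns the copy but leaves len at 0, while decode_fixed("value", 1, &len) sets len to 5. Code that trusts the reported length would mishandle only the UTF-8 case, which would be consistent with the failing tests in the patch.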
Thanks a lot for the patch. Applied in SVN. I think I'll be shipping a new version very soon.

-- Petr

