This queue is for tickets about the XML-Parser CPAN distribution.

Report information

The Basics
Id: 128006
Status: resolved
Priority: Low/Low
Queue:

People
Owner: Nobody in particular
Requestors: ppisar [...] redhat.com
Cc:
AdminCc:

BugTracker
Severity: (no value)
Broken in: 2.44
Fixed in: (no value)

Attachments
0001-Fix-a-buffer-overwrite-in-parse_stream.patch



Subject: A buffer overwrite in parse_stream() with wide characters on stdin
If the xpath tool receives a UTF-8 encoded file containing non-ASCII characters on its standard input, it can overwrite a buffer in the parse_stream() function when copying data read and decoded by PerlIO's read() method into Expat's parser buffer. This usually corrupts glibc's allocator metadata, and a subsequent free() terminates the program with SIGABRT. The buffer overflow can be seen in valgrind output. See <https://bugzilla.redhat.com/show_bug.cgi?id=1658512> for the reproducer. A fix is attached.
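To illustrate why a read limited by character count can return more bytes than expected, here is a minimal C sketch that counts UTF-8 characters in a byte string. The helper name utf8_char_count is hypothetical and is not part of XML-Parser or Perl; it only shows that the character count and the byte count diverge as soon as the input contains non-ASCII characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from Expat.xs.
 * Counts UTF-8 encoded characters in the first `len` bytes of `s`.
 * Continuation bytes have the bit pattern 10xxxxxx and are not counted. */
static size_t utf8_char_count(const char *s, size_t len)
{
    size_t chars = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For example, U+010D is encoded in UTF-8 as the two bytes 0xC4 0x8D, so a string of two such characters has a character count of 2 but occupies 4 bytes. A buffer sized for N characters therefore cannot be assumed to hold N characters' worth of UTF-8 bytes.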
Subject: 0001-Fix-a-buffer-overwrite-in-parse_stream.patch
From 53e71571fc0b1f8dbad5f7ff6e9eeeb233496c13 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com>
Date: Thu, 13 Dec 2018 13:05:07 +0100
Subject: [PATCH] Fix a buffer overwrite in parse_stream()

The parse_stream() function allocates BUFSIZE-byte long output buffer.
Then it reads a string using PerlIO's read() with a maximal string
length tsiz=BUFSIZE characters into a temporary buffer. And then it
retrieves a length of the string in the temporary buffer in bytes and
copies the strings from the temporary buffer to the output buffer.

While it works for byte-stream file handles, when using UTF-8 handles,
length in bytes can be greater than length in characters, thus the
temporary buffer can contain more bytes than the size of the output
buffer and we have a buffer overwrite. This corrupts memory, especially
metadata for libc memory management and subsequent free() aborts with
"free(): invalid next size (normal)".

Minimal reproducer: Execute this code with an UTF-8 encoded file with
non-ASCII charcters on the standard input:

use XML::XPath;
use open ':std', ':encoding(UTF-8)';
my $xpath = XML::XPath->new(ioref => \*STDIN);
$xpath->find('/');

https://bugzilla.redhat.com/show_bug.cgi?id=1473368
https://bugzilla.redhat.com/show_bug.cgi?id=1658512
---
 Expat/Expat.xs | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Expat/Expat.xs b/Expat/Expat.xs
index ed66531..dbad380 100644
--- a/Expat/Expat.xs
+++ b/Expat/Expat.xs
@@ -343,8 +343,8 @@ parse_stream(XML_Parser parser, SV * ioref)
     }
   else {
     tbuff = newSV(0);
-    tsiz = newSViv(BUFSIZE);
-    buffsize = BUFSIZE;
+    tsiz = newSViv(BUFSIZE); /* in UTF-8 characters */
+    buffsize = BUFSIZE * 6; /* in bytes that encode an UTF-8 string */
   }
 
   while (! done)
@@ -386,9 +386,11 @@ parse_stream(XML_Parser parser, SV * ioref)
             croak("read error");
 
           tb = SvPV(tbuff, br);
-          if (br > 0)
+          if (br > 0) {
+            if (br > buffsize)
+              croak("The input buffer is not large enough for read UTF-8 decoded string");
             Copy(tb, buffer, br, char);
-          else
+          } else
             done = 1;
 
           PUTBACK ;
-- 
2.18.1
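The two ideas in the patch (size the destination for the worst-case byte length, and refuse to copy when the data still does not fit) can be sketched in plain C outside of XS. This is only an illustration under stated assumptions: READ_CHARS is a small hypothetical stand-in for BUFSIZE, MAX_BYTES_PER_CHAR mirrors the factor 6 used in the patch, and checked_copy is a hypothetical helper that returns an error where the patch calls croak().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define READ_CHARS 4          /* hypothetical stand-in for BUFSIZE (characters per read) */
#define MAX_BYTES_PER_CHAR 6  /* same multiplier the patch applies to buffsize */

/* Hypothetical helper, not taken from Expat.xs.
 * Copies `nbytes` bytes from `src` to `dst` only if they fit in `dst_size`.
 * Returns 0 on success and -1 if the copy would overrun the destination. */
static int checked_copy(char *dst, size_t dst_size, const char *src, size_t nbytes)
{
    assert(dst != NULL && src != NULL);
    if (nbytes > dst_size)
        return -1;            /* the patch croaks at this point instead */
    memcpy(dst, src, nbytes);
    return 0;
}
```

With four two-byte characters (8 bytes), a destination of READ_CHARS bytes is too small and checked_copy reports an error, while a destination of READ_CHARS * MAX_BYTES_PER_CHAR bytes accepts the copy. The explicit length check is what turns a silent heap corruption into a reportable failure.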
On Thu Dec 13 08:06:17 2018, ppisar wrote:
The input XML document can be generated like this:

$ perl -CO -e 'print q{<root>} . qq{\x{010d}} x 2**16 . q{</root>}' | /usr/bin/xpath -q -e '/foo'
Ran out of memory for input buffer at /usr/lib64/perl5/vendor_perl/XML/Parser/Expat.pm line 474.
double free or corruption (!prev)
Aborted (core dumped)
I've taken your patch. It would be good if we had a test :(
On Mon Sep 23 23:28:31 2019, TODDR wrote:
> I've taken your patch. It would be good if we had a test :(
https://github.com/toddr/XML-Parser/commit/56b0509dfc6b559cd7555ea81ee62e3622069255

