On Wed May 07 03:45:23 2008, xmltwig[...]gmail.com wrote:
> Seth Viebrock via RT wrote:
> > Queue: XML-Twig
> > Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=35672 >
> >
> > ...which ultimately ends up calling the following code in XML::Expat,
> > and hangs. A cold call to XML::Parser->parse does not yield this error,
> > so it seems related to the arguments that Twig is ultimately passing to
> > Expat.
> >
> > eval {
> > $result = $expat->parse($arg);
> > };
>
> Hi,
>
> The problem is that the simple call to expat doesn't include any
> handlers. As XML::Twig builds the tree for the XML, OTOH, it kinda needs
> to set handlers on the various events.
>
> In this case the character handler is called for each line of the data,
> actually twice for each line, once for the data and once for the line
> return. So it ends up being called over 120 000 times for your example.
> That's always going to be longer than not calling the handler at all!
>
> The good news is that I made a mistake in that handler. I did not
> provide an explicit return: the returned value is not used in any way,
> so why bother? Why? Because as it was written it returned the partial
> content of the element. So it ended up passing 120 000 * 4Mb/2 (average
> size of the text content of the element), so roughly 240G of data to be
> allocated, copied, and de-allocated (one hopes!). I added an explicit
> empty return and voilà! Processing time went from 581s down to 2s.
>
> The new version is at the usual place: http://xmltwig.com/xmltwig/
>
> Thanks a lot for the bug report, this improvement should benefit most
> users (including me!)
>
Beautiful! Thanks so much for the quick response. This definitely saved
my hide, and I'm glad it will help others, too. Open source and
XML::Twig rule!
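
For anyone who lands on this ticket later, here is a minimal sketch of the implicit-return behaviour described above. It is not XML::Twig's actual handler code: the sub names and the $text buffer are made up for illustration, and it only assumes standard Perl semantics (a sub without an explicit return hands back the value of its last evaluated statement).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the text accumulated while parsing an element.
my $text = '';

# No explicit return: the sub returns the value of its last statement,
# which here is the whole accumulated buffer after the append.
sub append_implicit {
    my ($chunk) = @_;
    $text .= $chunk;
}

# Explicit empty return: the caller gets undef (scalar context) or an
# empty list, so the buffer is never copied back out.
sub append_explicit {
    my ($chunk) = @_;
    $text .= $chunk;
    return;
}

my $r1 = append_implicit("line 1\n");
my $r2 = append_explicit("line 2\n");

print "implicit return: ",
    ( defined $r1 ? length($r1) . " chars copied back" : "undef" ), "\n";
print "explicit return: ",
    ( defined $r2 ? length($r2) . " chars copied back" : "undef" ), "\n";
```

With a buffer that grows to several megabytes and a handler called on every line, the copy made by the first form is what adds up.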