
This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 33250
Status: resolved
Priority: 0/
Queue: HTML-Tree

People
Owner: Nobody in particular
Requestors: grandpa [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 3.23
Fixed in: (no value)



Subject: Nested div elements cause contained p elements to migrate outside the div
html of the form:

<p><div><p>foo</p></div></p>

is parsed into the tree as:

<p><div></div></p> <p>foo</p>

Versions prior to 3.23 not checked.
Subject: noname.pl
use strict;
use warnings;
use lib '..';
use HTML::TreeBuilder;

print "$HTML::TreeBuilder::VERSION\n";

my $html = <<'HTML';
<html>
<head>
</head>
<body>
<p><div><p>foo</p></div></p>
</body>
</html>
HTML

my $root = HTML::TreeBuilder->new;
$root->parse_content ($html);
$root->elementify ();

my @renderOptions = (undef, ' ', {});
print $root->as_HTML (@renderOptions);

# Sample output follows
__DATA__
__DATA__
<html>
<head>
</head>
<body>
<p>
<div>
</div>
</p>
<p>foo</p>
</body>
</html>
On Thu Feb 14 04:58:19 2008, GRANDPA wrote:
> html of the form:
>
> <p><div><p>foo</p></div></p>
>
> is parsed into the tree as:
>
> <p><div></div></p> <p>foo</p>
>
> Versions prior to 3.23 not checked.

The issue can be fixed by adding 'div' to HTML::Tagset::p_closure_barriers.

In the HTML::TreeBuilder constructor this could be done by:

push @HTML::Tagset::p_closure_barriers, 'div' unless grep {$_ eq 'div'} @HTML::Tagset::p_closure_barriers;

or it may be more appropriate to make a suitable change in HTML::Tagset.
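
The guarded push suggested above can be sketched in isolation as follows. This is only an illustration of the "append unless already present" idiom: add_unique is a hypothetical helper (it is not part of HTML::Tagset or HTML::TreeBuilder), and @barriers is a stand-in list with made-up starting contents rather than the real @HTML::Tagset::p_closure_barriers.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::Tagset or HTML::TreeBuilder:
# append $tag to the array behind $list unless it is already there.
sub add_unique {
    my ($list, $tag) = @_;
    push @$list, $tag unless grep { $_ eq $tag } @$list;
    return scalar @$list;
}

# Stand-in list for demonstration only (assumed contents); the real
# target would be @HTML::Tagset::p_closure_barriers.
my @barriers = qw(li blockquote ul ol);

add_unique(\@barriers, 'div');
add_unique(\@barriers, 'div');    # second call changes nothing

print "@barriers\n";              # prints: li blockquote ul ol div
```

With HTML::Tagset loaded, the same call would presumably be add_unique(\@HTML::Tagset::p_closure_barriers, 'div'), made before any tree is built. The guard matters because the list is a package global: without it, each constructor call would append another copy of 'div'.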
Subject: Re: [rt.cpan.org #33250] Nested div elements cause contained p elements to migrate outside the div
Date: Thu, 14 Feb 2008 15:14:41 -0600
To: bug-HTML-Tree [...] rt.cpan.org
From: Andy Lester <andy [...] petdance.com>
On Feb 14, 2008, at 1:27 PM, Peter Jaquiery via RT wrote:
> The issue can be fixed by adding 'div' to
> HTML::Tagset::p_closure_barriers.
>
> In the HTML::TreeBuilder constructor this could be done by:
>
> push @HTML::Tagset::p_closure_barriers, 'div' unless grep {$_ eq
> 'div'} @HTML::Tagset::p_closure_barriers;

I can do that. It'll be pretty trivial.

--
Andy Lester => andy@petdance.com => www.petdance.com => AIM:petdance
Subject: Re: [rt.cpan.org #33250] AutoReply: Nested div elements cause contained p elements to migrate outside the div
Date: Wed, 20 Feb 2008 23:17:01 +1300
To: <bug-HTML-Tree [...] rt.cpan.org>
From: "Peter Jaquiery" <peter.jaquiery [...] ihug.co.nz>
On further consideration and following discussion with others it seems likely that the fundamental problem is that div elements ought not nest inside p elements.

<p><div><p>foo</p></div></p>

should parse as:

<p></p><div><p>foo</p></div>

with the trailing </p> in the input ignored because the HTML is fundamentally broken.

This parse issue can be fixed in the 'ALL HOPE ...' section by changing the implicit ending list to include 'div'. Changing line 414 from

if ($tag eq 'p' or

to:

if ($tag eq 'p' or $tag eq 'div' or

would do the trick.

Cheers,
Peter Jaquiery
I've uploaded HTML::Tagset 3.20.

