
This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 19724
Status: rejected
Priority: 0/
Queue: HTML-Tree

People
Owner: Nobody in particular
Requestors: ddascalescu+cpan [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 3.20
Fixed in: (no value)



Subject: Can't distinguish among ending tags
Consider this HTML code: <p>Line 1.<br>Line 2.<br /></p> My application needs to output the HTML with an RTF layer, preserving as much as possible from the HTML layout/spacing. The problem I'm running into is that HTML::Element doesn't allow me to detect that the <br> tag has no closing tag, the <br /> element is XML-properly empty (bonus points for being able to detect the space before the ending slash), and <p> does have a closing tag. Is there a way to distinguish between these three cases?
On Tue Jun 06 02:49:23 2006, guest wrote:
> The problem I'm running into is that HTML::Element doesn't allow me to detect that the <br> tag has no closing tag, the <br /> element is XML-properly empty (bonus points for being able to detect the space before the ending slash), and <p> does have a closing tag. Is there a way to distinguish between these three cases?
Distinguishing <br /> from <br> is possible: the former will have an attribute of '/', whereas the latter will not. However, this only works if the element is self-contained. As for whether <p> has a closing tag, I'm evaluating how feasible that is, but I'm focusing on bug fixes first.
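A minimal sketch of the '/' attribute check described above, assuming HTML::TreeBuilder with its default settings. The helper name br_styles and the labels 'plain' and 'slash' are hypothetical, chosen only for illustration. The check says nothing about whitespace before the slash.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: label each <br> element by whether the parser
# recorded a '/' attribute on it (the attribute mentioned in the reply).
sub br_styles {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # look_down in list context returns every matching element.
    my @styles = map { defined $_->attr('/') ? 'slash' : 'plain' }
                 $tree->look_down( _tag => 'br' );

    $tree->delete;    # release the tree explicitly
    return @styles;
}

print join( ', ', br_styles('<p>Line 1.<br>Line 2.<br /></p>') ), "\n";
```

The sketch only inspects br elements; whether a <p> had an explicit closing tag in the source is not covered by this check.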
From: grandpa [...] cpan.org
On Tue Jun 06 02:49:23 2006, guest wrote:
> Consider this HTML code:
>
> <p>Line 1.<br>Line 2.<br /></p>
>
> ... The problem I'm running into is that HTML::Element doesn't allow me to detect that the <br> tag has no closing tag ...
<br> never has a close tag - that is, <br>...</br> is not legal HTML or XHTML. <br /> is not a close tag, it is an empty br element. Note that br elements are always empty!
From: ddascalescu+perl [...] gmail.com
On Fri Feb 22 18:59:31 2008, GRANDPA wrote:
> On Tue Jun 06 02:49:23 2006, guest wrote:
> > Consider this HTML code:
> >
> > <p>Line 1.<br>Line 2.<br /></p>
> >
> > ... The problem I'm running into is that HTML::Element doesn't allow me to detect that the <br> tag has no closing tag ...
>
> <br> never has a close tag - that is, <br>...</br> is not legal HTML or XHTML. <br /> is not a close tag, it is an empty br element. Note that br elements are always empty!
What I meant is that I'm trying to distinguish between <br>, <br/> and <br /> because my application must preserve the input HTML as faithfully as possible. It has been suggested above that <br/> will have an attribute of '/', while <br> won't. How can I also get the amount of whitespace in <br />?
On Sun Feb 24 11:44:56 2008, dandv wrote:
> On Fri Feb 22 18:59:31 2008, GRANDPA wrote:
> > On Tue Jun 06 02:49:23 2006, guest wrote:
> > > Consider this HTML code:
> > >
> > > <p>Line 1.<br>Line 2.<br /></p>
> > >
> > > ... The problem I'm running into is that HTML::Element doesn't allow me to detect that the <br> tag has no closing tag ...
> >
> > <br> never has a close tag - that is, <br>...</br> is not legal HTML or XHTML. <br /> is not a close tag, it is an empty br element. Note that br elements are always empty!
>
> What I meant is that I'm trying to tell between <br>, <br/> and <br /> because my application must preserve the input HTML as faithfully as possible. It has been suggested above that <br/> will have an attribute of '/', while <br> won't. How can I also get the amount of whitespace in <br /> ?
What you want can't be done using HTML::Parser, which is what HTML::TreeBuilder uses for parsing HTML. You would need to patch HTML::Parser to be able to get the white space information you want. Cheers, Jeff.

