This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 27288
Status: rejected
Priority: 0/
Queue: HTML-Tree

People
Owner: Nobody in particular
Requestors: johannes.egger [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: as_HTML() for frameset html code produces implicit <body> tag
Date: Fri, 25 May 2007 17:04:52 +0200
To: bug-HTML-Tree [...] rt.cpan.org
From: "Johannes Egger" <johannes.egger [...] gmail.com>
I am having a problem with the following HTML code, which is part of a test suite:

-----
<html>
<head>
<title>Frames and Iframes | Test 1</title>
</head>
<frameset cols="20%, *">
  <frameset rows="100, 200, 300">
    <frame longdesc="test_files/file1.html" />
    <frame longdesc="../media/img/al-dayr_petra.jpg" />
    <frame longdesc="does/not/exist" />
  </frameset>
  <frame longdesc="test_files/file2.txt" />
  <noframes>
    <p>this document contains frame test files, some of which do not really exist at all. </p>
  </noframes>
</frameset>
</html>
-----

I am using the following Perl code to load and process the HTML:

-----
#!/usr/bin/perl
use strict;
use warnings;

use HTML::TreeBuilder;
use HTTP::Request;
use LWP::UserAgent;

my $source_URI = "http://development.tigerlair.org/projects/eowa/src/lib/AEL-Validator/t/frame_test/longdesc/frame_longdesc_test_1";

my $ua = LWP::UserAgent->new(agent => "BLURP");
my $response = $ua->get($source_URI);
my ($tree, $psuccess);

die "failed to fetch URI: $source_URI: $!\n" if not $response->is_success;

eval {
    $tree = HTML::TreeBuilder->new;
    $tree->no_space_compacting(1);
    $tree->ignore_unknown(0);
    $tree->ignore_ignorable_whitespace(0);
    $tree->p_strict(1);
    $tree->store_comments(1);
    $tree->store_pis(1);
    $psuccess = $tree->parse($response->content);
    $tree->eof();
    $tree->elementify;
};
die "Parse failed: $!" if not $psuccess;

print ($tree->as_HTML('', ' '));
exit 0;
-----

The problem is that the output of as_HTML() contains the following snippet (only the relevant part is shown):

-----
<noframes>
<body>
<p>this document contains frame test files, some of which do not really exist at all.
</body>
-----

It opens an implicit <body> tag inside <noframes> when I think it should not. The HTML validates on the W3C validator (if I add the FRAMESET doctype at the top, which makes no difference to the parsing). Am I missing something?

Johannes
Hi,

Sorry for the long delay. I'm the new maintainer, so I'm going over all the old tickets.

I believe XHTML requires a body inside the noframes section when a frameset is used, so we wouldn't be able to produce a valid as_XML export without adding the body at parse time. Since the body is still optional in the non-XHTML spec, it seems safer to leave it in.

Cheers,
Jeff.
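
Since the ticket was rejected, anyone who needs as_HTML() output without the implied <body> would have to post-process the tree. The following is only a minimal sketch of one possible workaround, not something proposed in this ticket: `unwrap_noframes_body` is a hypothetical helper name, the sample markup is a shortened stand-in for the reporter's test file, and it assumes the implied <body> ends up as a direct child of <noframes>, as in the output quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::TreeBuilder;

# Hypothetical helper (not part of HTML-Tree): for every <noframes>
# element, replace any direct <body> child with that body's own content.
sub unwrap_noframes_body {
    my ($tree) = @_;
    for my $noframes ($tree->look_down(_tag => 'noframes')) {
        # content_list returns a copy, so editing the tree while looping is safe.
        for my $child ($noframes->content_list) {
            next unless ref $child && $child->tag eq 'body';
            $child->replace_with_content;   # splice the children into <noframes>
            $child->delete;                 # discard the now-empty <body>
        }
    }
    return $tree;
}

# Shortened stand-in for the frameset document in the report.
my $html = <<'HTML';
<html>
<head><title>Frames and Iframes | Test 1</title></head>
<frameset cols="20%, *">
  <frame longdesc="test_files/file2.txt" />
  <noframes>
    <p>this document contains frame test files.</p>
  </noframes>
</frameset>
</html>
HTML

my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;

unwrap_noframes_body($tree);
print $tree->as_HTML('', ' ');
```

Note that this only affects the HTML serialization. If the maintainer's reading of the XHTML rules is right, output from as_XML on the modified tree would no longer be valid XHTML, so the unwrapping should be skipped when XML output is needed.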

