
This queue is for tickets about the HTML-Tagset CPAN distribution.

Report information
The Basics
Id: 67299
Status: open
Priority: 0/
Queue: HTML-Tagset

People
Owner: Nobody in particular
Requestors: KENTNL [...] cpan.org
MIROD [...] cpan.org
Cc: JEB [...] cpan.org
AdminCc:

Bug Information
Severity: Important
Broken in: 3.20
Fixed in: (no value)



CC: JEB [...] cpan.org
Subject: Need an HTML5 update
I believe HTML::Tagset needs to be updated for HTML5. That includes new attributes (there is already a ticket on that) but also new elements. As long as they are not listed in HTML::Tagset the new elements are silently discarded.

If you want a patch, let me know (I don't need this myself, but I'd put the time into it if it's useful and if the patch gets applied).

An example of the problem is shown by the following code:

    perl -MHTML::TreeBuilder -e'print HTML::TreeBuilder->new_from_content( "<html><head></head><body><section><p>foo</p></body></html>")->as_HTML'
    <html><head></head><body><p>foo</body></html>

If 'section' is added as a tag in %isBodyElement (and in @p_closure_barriers just to be on the safe side), then the result is better:

    <html><head></head><body><section><p>foo</section></body></html>

__
mirod
Any chance of getting HEADER and SECTION added as body elements? It would fix tools like Web::Scraper that rely on this module. If this were on github I'd happily submit a pull request.
If we're going to handle HTML5, it's not going to be through a "Can't you just do X". Adding elements at random and pushing out something isn't a good long-term solution.
On Thu Nov 03 23:13:47 2011, PETDANCE wrote:
> If we're going to handle HTML5, it's not going to be through a "Can't
> you just do X". Adding elements at random and pushing out something
> isn't a good long-term solution.
>
Thanks!
On Fri Nov 04 03:13:47 2011, PETDANCE wrote:
> If we're going to handle HTML5, it's not going to be through a "Can't
> you just do X". Adding elements at random and pushing out something
> isn't a good long-term solution.
>
Andy, do you know the status of the patch I sent to the libwww mailing list on May 20 with support for HTML 5 elements? I saw no response from you or anyone else. Here 'tis again:

--- 3.20/Tagset.pm 2011-05-20 11:50:41.987911127 +0800
+++ new/Tagset.pm 2011-05-20 12:24:38.519497374 +0800
@@ -95,6 +95,7 @@
  'a' => ['href'],
  'applet' => ['archive', 'codebase', 'code'],
  'area' => ['href'],
+ 'audio' => ['src'],
  'base' => ['href'],
  'bgsound' => ['src'],
  'blockquote' => ['cite'],
@@ -115,10 +116,13 @@
  'object' => ['classid', 'codebase', 'data', 'archive', 'usemap'],
  'q' => ['cite'],
  'script' => ['src', 'for'],
+ 'source' => ['src'],
  'table' => ['background'],
  'td' => ['background'],
  'th' => ['background'],
  'tr' => ['background'],
+ 'track' => ['src'],
+ 'video' => ['poster'],
  'xmp' => ['href'],
  );
@@ -185,6 +189,7 @@
  wbr nobr blink font basefont bdo
  spacer embed noembed
+ time mark ruby rp rt bdi bdo
  ); # had: center, hr, table
@@ -253,7 +258,7 @@
  =cut

  %isFormElement = map {; $_ => 1 }
- qw(input select option optgroup textarea button label);
+ qw(input select option optgroup textarea button label keygen output progress meter );

  =head2 hashset %HTML::Tagset::isBodyElement
@@ -285,6 +290,10 @@
  table center form
+
+ section nav article aside hgroup figure
+ param video audio source track canvas
+ details summary command menu
  ), keys %isFormElement,
  keys %isPhraseMarkup, # And everything phrasal
@@ -313,7 +322,7 @@
  %isKnown = (%isHeadElement, %isBodyElement,
  map{; $_=>1 } qw(
  head body html
- frame frameset noframes
+ frame frameset noframes figcaption
  ~comment ~pi ~directive ~literal
  ));
  # that should be all known tags ever ever
I'm not going to "just" add some tags. HTML::Tagset needs a better thought-out plan than just adding tags, because then we're not being HTML4.
I'm glad that won't be an excuse to stop updating this module after "HTML5" is done. http://blog.whatwg.org/html-is-the-new-html5
On Fri Nov 04 00:43:34 2011, LEEDO wrote:
> I'm glad that won't be an excuse to stop updating this module after
> "HTML5" is done.
I'm not sure what your point is, but if you want to discuss a solution without sarcasm, I'm glad to do it.
Subject: [rt.cpan.org #67299]
Date: Fri, 10 Oct 2014 20:54:26 +0100
To: bug-HTML-Tagset [...] rt.cpan.org
From: redneb [...] gmx.com
I recently released a Haskell library that provides the functionality of %HTML::Tagset::linkElements. It includes support for HTML5.

Trying to find a good source with a list of HTML5 link elements/attributes, I discovered that there is a proprietary program called XMLmind XML Editor 6.0.0 whose Evaluation Edition [1] contains an XML Schema file for HTML 5 which is BSD licensed. You can grab a copy of that file from [2]. Additionally, I wrote a small Haskell program [3] that extracts all link elements from that file. If you don't want to run it yourself, here's its output:

a href
area href
audio src
base href
blockquote cite
button formaction
command icon
del cite
embed src
form action
html manifest
iframe src
img src
input formaction
input src
ins cite
link href
object data
q cite
script src
source src
track src
video poster
video src

This is the complete list of tags/attributes whose XML Schema type is xs:anyURI.

[1] http://www.xmlmind.com/xmleditor/download.shtml
[2] https://github.com/redneb/islink/blob/master/scripts/data/xhtml5.xsd
[3] https://github.com/redneb/islink/blob/master/scripts/from_xsd.hs
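
The extraction described above (collect every element/attribute pair whose schema type is xs:anyURI) can be sketched roughly as follows. This is not the Haskell program linked at [3]; it is a minimal Python illustration, and the schema fragment SAMPLE_XSD and the function name link_attributes are hypothetical. It assumes attributes are declared inline under each element with the literal type string "xs:anyURI"; a real schema such as xhtml5.xsd may use named types or attribute groups, which this sketch does not resolve.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, heavily simplified schema fragment (NOT copied from xhtml5.xsd).
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="a">
    <xs:complexType>
      <xs:attribute name="href" type="xs:anyURI"/>
      <xs:attribute name="title" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="video">
    <xs:complexType>
      <xs:attribute name="poster" type="xs:anyURI"/>
      <xs:attribute name="src" type="xs:anyURI"/>
      <xs:attribute name="width" type="xs:nonNegativeInteger"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def link_attributes(xsd_text):
    """Return sorted (element, attribute) pairs declared with type xs:anyURI.

    Assumption: the schema uses the "xs" prefix and declares attributes
    inline under each named xs:element.
    """
    root = ET.fromstring(xsd_text)
    pairs = set()
    for element in root.iter(f"{{{XS}}}element"):
        element_name = element.get("name")
        if element_name is None:
            continue  # skip element references without a name
        for attribute in element.iter(f"{{{XS}}}attribute"):
            attribute_name = attribute.get("name")
            if attribute_name and attribute.get("type") == "xs:anyURI":
                pairs.add((element_name, attribute_name))
    return sorted(pairs)


if __name__ == "__main__":
    for element_name, attribute_name in link_attributes(SAMPLE_XSD):
        print(element_name, attribute_name)
```

Run against the sample fragment, this prints one "element attribute" pair per line, in the same shape as the list above.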
Hey Andy, did you have anything in mind for a better way for doing this for HTML5? I have a couple of related bugs opened for HTML::TreeBuilder and thought I might give it a crack. Cheers, Jeff.
I hate to say "me too" - but me too.

Yes, HTML has become a moving target, but some things are stable; despite the "deprecated" and "obsolete" status of some elements, the fact is that browsers don't remove them. So it agglomerates stuff.

I don't quite understand the comment from a few years back "we need a plan" - "can't just add tags because then we're not HTML4". Things kept being added to HTML4 browsers when HTML5 was being, er, evolved. And HTML5 is following the same track - it doesn't end, it just evolves. I guess it's now "WHATWG HTML Living Standard"...

So, what's wrong with adding the tags that exist in the wild? It's not perfect, but then the ticket has been open for almost 7 years; HTML is out there...

My interest is as a user of HTML::TreeBuilder. I may be missing why sticking with the "HTML" set of tags is advantageous. Is purity in some sense trumping the practical problem of keeping up with today's (and tomorrow's) content?
Subject: Re: [rt.cpan.org #67299] Need an HTML5 update
Date: Wed, 10 May 2017 16:13:04 -0500
To: bug-HTML-Tagset [...] rt.cpan.org
From: Andy Lester <andy [...] petdance.com>
> So, what's wrong with adding the tags that exist in the wild?
Because people who expect HTML::Tagset to be HTML4 will have it changed out from under them.

