Gisle_Aas via RT wrote:
><URL: http://rt.cpan.org/Ticket/Display.html?id=17962 >
>
>The reason "&lang" is expanded is that it's an official HTML entity
>name; see http://www.w3.org/TR/REC-html40/sgml/entities.html#h-24.3.1
>
>Browsers have tended to expand entities even if the trailing ";" is
>missing, but there seems to be an exception for the non-Latin1
>entities out there. I tested this piece of HTML in Firefox/Konqueror:
>
> <html>
> <body>
> <a href="foo?a=1&eth=1&times=3&lang=4&Gamma=5&lang;=6">foo
>&lang;&lang=</a>
> </body>
> </html>
>
>and they both expand "&eth", "&times" and "&lang;" into the
>corresponding char but leave "&lang" and "&Gamma" alone. Strangely
>enough, Firefox expands "&lang" outside of the attribute, so it actually
>plays by even more rules.
>
>HTML is such a mess!
>
HTML: it's getting better all the time (couldn't get much worse), to
coin a phrase...
If only everyone would agree with the standard. I don't have the energy
to track down the URI spec today, but logically (HTML/logic: ha!): the
semicolon in "&lang;" above ought to be URI-encoded, right? Otherwise it
might be interpreted as a new-style delimiter, the ampersand being the
old-style delimiter. What should happen when those two appear together,
I dunno.
Ho hum.
Any thoughts on how you might deal with the mess? My vote is to not look
for entities in URIs...
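
For what it's worth, here is a rough sketch of the two behaviours, in Python rather than Perl and purely as an illustration. It assumes Python's standard html.unescape as a stand-in for a decoder that expands some entities without the trailing ";", and decode_terminated_only is a hypothetical helper of mine (not part of any library) that only expands references ending in ";". It does not go as far as skipping URI attributes altogether.

```python
import html
import re

# The href value from the test page quoted above.
HREF = "foo?a=1&eth=1&times=3&lang=4&Gamma=5&lang;=6"

# Hypothetical rule: a reference counts only if it is terminated by ";".
_TERMINATED_REF = re.compile(
    r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);"
)


def decode_terminated_only(text: str) -> str:
    """Expand only ';'-terminated references; leave bare '&name' alone."""
    return _TERMINATED_REF.sub(lambda m: html.unescape(m.group(0)), text)


if __name__ == "__main__":
    # Lenient decoding: "&eth" and "&times" are expanded even without ";",
    # while "&lang" and "&Gamma" are left as they are.
    print(html.unescape(HREF))
    # Conservative decoding: only "&lang;" is expanded.
    print(decode_terminated_only(HREF))
```

With the conservative version, query parameters that merely look like entity names survive intact, at the cost of not decoding sloppy markup that really did mean an entity.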
Cheers
lee