Subject: HTML::Entities misses at least one Unicode (high bit) Character
I think I've found a problem that causes HTML::Entities to miss an entity when encoding (both the numeric and the normal named encoding). I've attached a TGZ that includes a small snippet of malformed UTF-8 and a small test that demonstrates the problem.

Here's how I'd show it:

    % tar xvf missedentity.tgz
    % ./go.pl > out
    % vi out

The "out" file will contain:

    Einar [Aacute]gú Frið

Of course, the [Aacute] should have been encoded. I know this is easy to say, and very annoying, but given that this entity is missed, how many others may also be missed?

My system details:

    Redhat Fedora 4
    Perl 5.8.6
    HTML::Parser 3.50
    HTML::Entities 1.32