|Subject:||characters transliterated to non-letter characters|
|Date:||Thu, 26 Jun 2014 22:12:41 +0200|
|To:||bug-Text-Unidecode [...] rt.cpan.org|
|From:||Apostol Karovski <zapirkon [...] gmail.com>|
For example, U+018F, U+0259, and some other Unicode characters are transliterated to "@". It seems that characters pronounced similarly to [æ] are transliterated to "@". Why not transliterate them to "a", "e", or "ae"? I am raising this because words should contain letters, but "@" is not a letter and almost always means "at". Furthermore, if a word has been transliterated, I would logically expect Character.isLetter() to return true for every character in the resulting word.

Another thing that bothers me is transliteration to digits. For example, U+0184, U+0185, and U+018E are represented as "6", "6", and "3" respectively. The "3" for U+018E may even be wrong, since that character is described as a reversed E.

P.S. Are the tables managed in any way? Can I get some insight into how they are made and maintained?
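
To make the expectation concrete, here is a minimal Java sketch of the check I have in mind. It does not call Text::Unidecode; the table simply reproduces the outputs described in this report (not re-verified here), and the names `LetterCheck`, `REPORTED`, and `allLetters` are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LetterCheck {
    // Code point -> transliteration as described in this report.
    // These values are copied from the report, not produced by this code.
    static final Map<Integer, String> REPORTED = new LinkedHashMap<>();
    static {
        REPORTED.put(0x018F, "@");
        REPORTED.put(0x0259, "@");
        REPORTED.put(0x0184, "6");
        REPORTED.put(0x0185, "6");
        REPORTED.put(0x018E, "3");
    }

    // True only if the string is non-empty and every code point is a letter.
    static boolean allLetters(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isLetter);
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, String> e : REPORTED.entrySet()) {
            int codePoint = e.getKey();
            String output = e.getValue();
            System.out.printf("U+%04X inputIsLetter=%b output=\"%s\" outputAllLetters=%b%n",
                    codePoint, Character.isLetter(codePoint), output, allLetters(output));
        }
    }
}
```

Every input code point above is a letter according to Character.isLetter(), while none of the reported outputs is. Outputs such as "a", "e", or "ae" would pass the same check.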