This queue is for tickets about the RTF-Parser CPAN distribution.

Report information
The Basics
Id: 43270
Status: new
Priority: 0/
Queue: RTF-Parser

Owner: stuart [...]
Requestors: jferguson [...]

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)

Subject: bug in RTF::Parser
Date: Fri, 13 Feb 2009 14:51:04 -0700
To: <bug-RTF-Parser [...]>
From: <jferguson [...]>
I'm calling RTF::TEXT::Converter, which in turn calls RTF::Parser. When the RTF text contains Japanese SJIS characters, some of the bytes are corrupted because they are being translated by the module.

As an example, I can set the following string:

    {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fscript\fprq2\fcharset0 Comic Sans MS;}{\f1\froman\fprq1\fcharset128 MS PGothic;}}{\colortbl ;\red0\green0\blue128;}{\*\generator Msftedit;}\viewkind4\uc1\pard\tx720\cf1\f0\fs20 test \cf0\lang1041\f1\fs20\'83\'65\'83\'58\'83\'67\cf1\lang1033\f0\fs20\par}

The results I get back are:

    test fefXfg
    0x00000000 (00000) 74657374 20666566 5866670a test fefXfg.

Each 'f' (0x66) character in the output should be 0x83. For whatever reason, the file has a translation of 0x83 to an 'f', which corrupts the resulting SJIS string.

As a test, I removed the entry for 83 from the file, and the resulting string then contains the correct 0x83 byte:

    test âeâXâg
    0x00000000 (00000) 74657374 20836583 5883670a test .e.X.g.

There needs to be some sort of check so that when \lang1041 is set, the parser does not attempt to translate these characters into something else.
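The byte values in the report can be checked independently of RTF::Parser. The sketch below is in Python, not Perl, and is not the module's code: the names `FCHARSET_CODECS`, `run_to_bytes`, and `decode_run` are hypothetical, and the mapping of \fcharset128 to the cp932 codec is an assumption based on font \f1 being declared with fcharset128 (Shift-JIS). It shows that the six escaped bytes form three double-byte SJIS characters, and that decoding each byte on its own as Windows-1252 turns 0x83 into U+0192 (ƒ), which is plausibly where the reported 'f' comes from.

```python
import re

# Assumed mapping from RTF \fcharsetN values to Python codec names.
# Only the two charsets that appear in the ticket's font table are listed.
FCHARSET_CODECS = {0: "cp1252", 128: "cp932"}

# Matches either a \'hh hex escape or a single literal character.
TOKEN = re.compile(r"\\'([0-9a-fA-F]{2})|(.)", re.DOTALL)


def run_to_bytes(run: str) -> bytes:
    """Turn a run of \\'hh escapes and literal ASCII text into raw bytes."""
    out = bytearray()
    for hex_pair, literal in TOKEN.findall(run):
        if hex_pair:
            out.append(int(hex_pair, 16))
        else:
            # Literal characters in the run are assumed to be plain ASCII.
            out.extend(literal.encode("ascii"))
    return bytes(out)


def decode_run(run: str, fcharset: int) -> str:
    """Decode the whole run at once with the codec of the active font."""
    codec = FCHARSET_CODECS.get(fcharset, "cp1252")
    return run_to_bytes(run).decode(codec, errors="replace")


if __name__ == "__main__":
    run = r"\'83\'65\'83\'58\'83\'67"   # the escaped text from the ticket
    print(run_to_bytes(run).hex())      # 836583588367
    print(decode_run(run, 128))         # テスト (decoded as Shift-JIS)
    print(decode_run(run, 0))           # ƒeƒXƒg (decoded as Windows-1252)
```

The point of the design is that the bytes are collected for the whole run first and decoded only once, using the character set of the font that is active at that point, rather than being mapped one byte at a time through a single-byte table.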
