This queue is for tickets about the Config-IniFiles CPAN distribution.

Report information
The Basics
Id: 59152
Status: resolved
Priority: 0/
Queue: Config-IniFiles

People
Owner: Nobody in particular
Requestors: meir [...] guttman.co.il
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: shlomif [...] iglu.org.il
Subject: UTF-8 (and other Unicode encodings?) BOM cause the package to fail
Date: Wed, 07 Jul 2010 09:42:45 +0300
To: bug-Config-IniFiles [...] rt.cpan.org
From: Meir Guttman <meir [...] guttman.co.il>
Dear folks,

The other day I discovered the hard way that the Config::IniFiles package fails to process UTF-8 encoded INI files when the file also includes a BOM (Byte Order Mark) signature.

Attached are two INI files, one with a BOM and one without. Other than this the two are identical. As anyone can see (in a hex view), the 3-byte BOM at the very beginning of the BOM file is "EF BB BF".

Also attached is a small Perl script to demonstrate the result. (You of course have to edit it to switch between the BOM and the no-BOM versions.) Its output when using the BOM INI file is:

Line 1 in file utf8_bom.ini is mal-formed:
∩�[┐[General]
2: parameter found outside a section

Please note the three "garbage" characters on my (Hebrew) cmd window.

As for a correcting patch, I am afraid I am too much of a newbie to offer one. But maybe all that is required is a "use encoding 'utf8';" statement?

Regards,
Meir
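The BOM described in the report can also be checked for without a hex viewer. The following is a minimal sketch, using only core Perl, that reads the first three bytes of a file in raw mode and compares them with EF BB BF. The helper name has_utf8_bom is hypothetical and is not part of Config::IniFiles.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the file at $path starts with the
# UTF-8 BOM bytes EF BB BF, and 0 otherwise.
sub has_utf8_bom {
    my ($path) = @_;

    # ':raw' so that no encoding layer touches the bytes being inspected.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $count = read($fh, my $head, 3);
    close $fh;
    die "Cannot read $path: $!" unless defined $count;

    return $head eq "\xEF\xBB\xBF" ? 1 : 0;
}
```

With the two attachments from the report, has_utf8_bom('utf8_bom.ini') would be expected to return 1 and has_utf8_bom('utf8_no-bom.ini') to return 0, assuming the files are as described.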
Attachments:
utf8_bom.ini (application/octet-stream, 196b)
utf8_bom_test.pl (text/x-perl, 338b)
utf8_no-bom.ini (application/octet-stream, 193b)

On Wed Jul 07 02:43:03 2010, meir@guttman.co.il wrote:
> Dear folks,
>
> The other day I discovered the hard way that the Config::IniFiles package
> fails to process UTF-8 encoded INI files when the file also includes
> a BOM (Byte Order Mark) signature.
>
> Attached are two INI files, one with a BOM and one without. Other than
> this the two are identical. As anyone can see (in a hex view), the 3-byte
> BOM at the very beginning of the BOM file is "EF BB BF".
>
> Also attached is a small Perl script to demonstrate the result. (You of
> course have to edit it to switch between the BOM and the no-BOM versions.)
> Its output when using the BOM INI file is:
>
> Line 1 in file utf8_bom.ini is mal-formed:
> ∩�[┐[General]
> 2: parameter found outside a section
>
> Please note the three "garbage" characters on my (Hebrew) cmd window.
>
> As for a correcting patch, I am afraid I am too much of a newbie to offer
> one. But maybe all that is required is a "use encoding 'utf8';"
> statement?

After playing a little with your script, I found that this version works fine:

{{{{{{{{{{{{{{{{
#!/usr/bin/perl

use strict;
use warnings;

# use encoding "utf8";
# use open IO => ":encoding(utf8)";

use Config::IniFiles;

my $cfg = Config::IniFiles->new(-file => "utf8_bom.ini")
    or do {
        my $err_message = join("\n", @Config::IniFiles::errors);
        die "$err_message\n";
    };

my $cookie_jar = $cfg->val('General', 'cookie_jar');
print "Jar: $cookie_jar\n";

__END__
}}}}}}}}}}}}}}}}

What do you need the "use open" call for?

Regards,

-- Shlomi Fish

> Regards,
>
> Meir

Rejected due to lack of responsiveness from the reporter. If you wish to re-open, then comment.
On Fri Nov 19 09:02:18 2010, SHLOMIF wrote:
> Rejected due to lack of responsiveness from the reporter. If you wish to
> re-open, then comment.

Reopening per the responsiveness of the reporter (Meir).
On Sat Jan 09 14:18:33 2016, SHLOMIF wrote:
> On Fri Nov 19 09:02:18 2010, SHLOMIF wrote:
> > Rejected due to lack of responsiveness from the reporter. If you wish to
> > re-open, then comment.
>
> Reopening per the responsiveness of the reporter (Meir).

Meir sent me a reproducing test case in private and I was able to fix it after referring to:

< QUOTE >

Thanks for the modified program - I was able to rework it into a usable, reproducing, condition.

Now to the solution: searching http://duckduckgo.com/ for https://duckduckgo.com/?q=perl%20utf8%20bom yielded this Perl Monks thread - http://www.perlmonks.org/?node_id=599720 where https://metacpan.org/pod/File::BOM was recommended. After installing it and playing a little with it, I was able to create a Perl program that yields the same result with and without the BOM. I've attached it to this message in .7z format:

«««
shlomif@telaviv1:~$ perl EoD-shlomif.pl m-with-bom.ini
Node root: D:/Meir
Log DIR root: WorkLOGs
shlomif@telaviv1:~$ perl EoD-shlomif.pl m-without-bom.ini
Node root: D:/Meir
Log DIR root: WorkLOGs
shlomif@telaviv1:~$
»»»

Hope it helps. This problem is not specific to Config-IniFiles, but rather an issue with the way Perl 5 is implemented. And there's an easy solution on CPAN.

Regards,

Shlomi Fish

< QUOTE >

I've attached what I sent to Meir, and people may refer to this answer here for more insights.

Resolving as NOT-A-BUG.
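As an illustration of the File::BOM approach mentioned in the reply above, here is a minimal sketch. It is not the attached EoD-shlomif.pl, whose contents are not shown in this ticket. It assumes File::BOM is installed and uses its open_bom function; the helper name open_skipping_bom and the choice of ':utf8' as the fallback layer are assumptions made for the example.

```perl
use strict;
use warnings;
use File::BOM qw(open_bom);

# Hypothetical helper: open $path for reading with any leading BOM
# consumed. When no BOM is found, File::BOM falls back to the layer
# given as the third argument (':utf8' here, an assumption).
sub open_skipping_bom {
    my ($path) = @_;
    open_bom(my $fh, $path, ':utf8');
    return $fh;
}
```

Reading the first line from a handle opened this way should give the same "[General]" line for a file with a BOM and for one without. Config::IniFiles documents that its -file argument also accepts a filehandle, so passing such a handle is one plausible way to feed it BOM-free input; whether the attached program does exactly that is not stated in the ticket.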
Subject: Meir-Config-IniFiles-BOM-Shlomif.7z
Attachment: Meir-Config-IniFiles-BOM-Shlomif.7z (application/x-7z-compressed, 987b)

resolving

