
This queue is for tickets about the File-Slurp CPAN distribution.

Report information
The Basics
Id: 103986
Status: resolved
Priority: 0/
Queue: File-Slurp

People
Owner: cwhitener [...] gmail.com
Requestors: ROBINS [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: (no value)
Fixed in: (no value)



Subject: JSON::PP is unable to decode JSON with newlines between tokens

The file pretty.json below does not parse, but compact.json parses. This is tested on both perl 5.18.2 and perl 5.20.2 on Ubuntu 14.04 (64-bit). The 5.18.2 perl version is system perl. 5.20.2 is perlbrewed with --thread --multi arguments.

$ cat pretty.json
{
"a":"b",
"c":1
}
$ cat compact.json
{"a":"b","c":1}
$ perl -MFile::Slurp -MJSON::PP -E 'say decode_json read_file shift' compact.json
HASH(0xa06d18)
$ perl -MFile::Slurp -MJSON::PP -E 'say decode_json read_file shift' pretty.json
, or } expected while parsing object/hash, at character offset 1 (before "\n") at -e line 1.
$

It seems this is actually a bug with File::Slurp.

$ perl -MIO::All -MJSON::PP -E 'say decode_json io->file(shift)->all' pretty.json
HASH(0x10b2350)

Can someone move it over to File::Slurp's queue?

Actually, adding in an explicit "scalar" before read_file solves the problem. The question is: Has JSON::PP's decode_json() ever had a prototype to enforce scalar context? Because if you have JSON::XS installed, you don't need the "scalar". It seems JSON::XS::decode_json has a prototype to enforce scalar context.

This is confusing... Maybe it is a bug belonging to JSON::PP after all...
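
For readers following along, here is a minimal sketch of the "scalar" workaround described above. It does not use File::Slurp itself: `slurp_like` is a hypothetical stand-in that mimics the context-sensitive return described in this ticket, and the inline JSON text is an assumed reconstruction of pretty.json. Whether the first call fails depends on whether the installed JSON::PP's decode_json has a prototype, so the sketch only reports what it observes.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical stand-in for read_file (NOT File::Slurp itself): returns the
# lines in list context and the whole text in scalar context.
sub slurp_like {
    my ($text) = @_;
    return wantarray ? split(/(?<=\n)/, $text) : $text;
}

# Assumed contents of pretty.json: one token group per line.
my $pretty = qq({\n"a":"b",\n"c":1\n}\n);

# Without "scalar": the call sits in an argument list, so slurp_like runs in
# list context unless decode_json carries a ($) prototype.
my $decoded_from_list = eval { JSON::PP::decode_json(slurp_like($pretty)) };

# With "scalar": decode_json always receives the whole document as one string.
my $data = JSON::PP::decode_json(scalar slurp_like($pretty));

print "list-context call: ", (defined $decoded_from_list ? "decoded" : "failed"), "\n";
printf "scalar-context call: a=%s c=%s\n", $data->{a}, $data->{c};
```

Assigning the file contents to a scalar variable first (`my $json = read_file $path;`) should have the same effect as the explicit "scalar", since scalar assignment also imposes scalar context.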
Date: Sun, 26 Apr 2015 14:46:04 -0400
From: Uri Guttman <uri [...] stemsystems.com>

On 04/26/2015 02:23 PM, Robin Smidsrød via RT wrote:
> Queue: File-Slurp
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=103986 >
>
> Actually, adding in an explicit "scalar" before read_file solves the problem. The question is: Has JSON::PP's decode_json() ever had a prototype to enforce scalar context? Because if you have JSON::XS installed, you don't need the "scalar". It seems JSON::XS::decode_json has a prototype to enforce scalar context.
>
> This is confusing... Maybe it is a bug belonging to JSON::PP after all...

it is absolutely a bug in json::pp. read_file has a well defined and stable api with regard to scalar vs list context.

thanx,

uri

On 2015-04-26 14:23:57, ROBINS wrote:
> Actually, adding in an explicit "scalar" before read_file solves the
> problem. The question is: Has JSON::PP's decode_json() ever had a
> prototype to enforce scalar context? Because if you have JSON::XS
> installed, you don't need the "scalar". It seems
> JSON::XS::decode_json has a prototype to enforce scalar context.

See here: https://metacpan.org/source/MLEHMANN/JSON-XS-3.01/XS.xs#L2201

> This is confusing... Maybe it is a bug belonging to JSON::PP after
> all...
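
One way to check the prototype question without reading the XS source is Perl's built-in `prototype` function. The sketch below is only an illustration: `describe_prototype` is a hypothetical helper, not part of JSON::PP or JSON::XS, and the output depends on which module versions are installed, so no expected output is shown.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): report a sub's prototype.
sub describe_prototype {
    my ($name) = @_;
    no strict 'refs';    # $name is a fully qualified sub name given as a string
    return "$name: not defined" unless defined &{$name};
    my $proto = prototype \&{$name};
    return defined $proto
        ? "$name: prototype ($proto)"
        : "$name: no prototype";
}

require JSON::PP;        # core module since perl 5.14
print describe_prototype('JSON::PP::decode_json'), "\n";

# JSON::XS is optional; only report on it if it happens to be installed.
if (eval { require JSON::XS; 1 }) {
    print describe_prototype('JSON::XS::decode_json'), "\n";
}
```

A sub reported with a `$` prototype evaluates its single argument in scalar context, which would explain why the explicit "scalar" is unnecessary with that module.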

The problem is with how the file is loaded from disk, in its handling of utf8-encoded characters.

Basically, if there is any UTF-8 *anywhere* in the document, the entire file is decoded to characters and utf8-flagged, and that breaks uses of substr() in perl 5.8.6 and below.

It is a mistake to try to infer the encoding of the document by inspecting its contents.

On 2015-05-21 23:37:41, ETHER wrote:
> The problem is with how the file is loaded from disk, in its handling
> of utf8-encoded characters.
>
> Basically, if there is any UTF-8 *anywhere* in the document, the
> entire file is decoded to characters and utf8-flagged, and that breaks
> uses of substr() in perl 5.8.6 and below.
>
> It is a mistake to try to infer the encoding of the document by
> inspecting its contents.

Sorry, this was a response to the wrong ticket, which has the same error result. I was describing the issue that was fixed here: https://github.com/makamaka/JSON-PP/pull/9

On Sun Apr 26 14:46:19 2015, uri@stemsystems.com wrote:
> On 04/26/2015 02:23 PM, Robin Smidsrød via RT wrote:
> > Queue: File-Slurp
> > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=103986 >
> >
> > Actually, adding in an explicit "scalar" before read_file solves the
> > problem. The question is: Has JSON::PP's decode_json() ever had a
> > prototype to enforce scalar context? Because if you have JSON::XS
> > installed, you don't need the "scalar". It seems
> > JSON::XS::decode_json has a prototype to enforce scalar context.
> >
> > This is confusing... Maybe it is a bug belonging to JSON::PP after
> > all...
>
> it is absolutely a bug in json::pp. read_file has a well defined and
> stable api with regard to scalar vs list context.
>
> thanx,
>
> uri

I concur.

#####
$ cat pretty.json
{
"a":"b",
"c":1
}
$ perl -MFile::Slurp -MJSON::PP -E 'my $str = read_file shift; chomp $str; say $str' pretty.json
{
"a":"b",
"c":1
}
#####

Hi Everyone,

Since `read_file` uses wantarray to detect context, it returns @lines in list context or $text in scalar context. You can't simply embed the function call in another function's argument list, because arguments are evaluated in list context by default. The function `decode_json` accepts a byte string, not a list of byte strings.

I don't think we can do anything about this. As Uri said, the interface for `read_file` has long been defined and thus cannot easily be altered for use in this manner.

I'm going to close this ticket out, as the appropriate means to accomplish the goal is to make the call to `read_file` in scalar context before calling `decode_json`.

Thanks,
Chase
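
The context behavior described in this closing note can be reproduced without either module. The sketch below is an illustration only: `read_lines_or_text`, `first_arg_plain` and `first_arg_scalar` are hypothetical names, and the hard-coded lines are an assumed stand-in for the contents of pretty.json.

```perl
use strict;
use warnings;

# Hypothetical function that, like read_file as described above, uses
# wantarray: a list of lines in list context, one string in scalar context.
sub read_lines_or_text {
    my @lines = ("{\n", qq("a":"b",\n), qq("c":1\n), "}\n");
    return wantarray ? @lines : join '', @lines;
}

# No prototype: the argument expression is evaluated in list context.
sub first_arg_plain { return $_[0] }

# ($) prototype: the single argument is evaluated in scalar context.
sub first_arg_scalar ($) { return $_[0] }

my $plain  = first_arg_plain(read_lines_or_text());          # first line only
my $forced = first_arg_scalar(read_lines_or_text());         # whole text
my $manual = first_arg_plain(scalar read_lines_or_text());   # whole text

printf "plain:  %d bytes\n", length $plain;
printf "forced: %d bytes\n", length $forced;
printf "manual: %d bytes\n", length $manual;
```

The `$manual` line corresponds to the resolution of this ticket: the caller imposes scalar context, so neither function's interface has to change.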

