This queue is for tickets about the IO-Compress CPAN distribution.

Report information
The Basics
Id: 119184
Status: resolved
Priority: 0/
Queue: IO-Compress

People
Owner: Nobody in particular
Requestors: bottomsc [...] missouri.edu
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: "Givan, Scott A." <givans [...] missouri.edu>, "Spollen, William G." <spollenw [...] missouri.edu>
Subject: Failure to read second "original" gzipped file inside a "concatenated" gzipped file
Date: Thu, 8 Dec 2016 18:08:21 +0000
To: "bug-IO-Compress [...] rt.cpan.org" <bug-IO-Compress [...] rt.cpan.org>
From: "Bottoms, Christopher A" <BottomsC [...] missouri.edu>
(This is mostly copied from my post on Stack Overflow <http://stackoverflow.com/questions/41045834>.)

In bash, you can concatenate gzipped files and the result is a valid gzipped file. As far as I recall, I have always been able to treat these "concatenated" gzipped files as normal gzipped files:

    echo 'Hello world!' > hello.txt
    echo 'Howdy world!' > howdy.txt
    gzip hello.txt
    gzip howdy.txt
    cat hello.txt.gz howdy.txt.gz > greetings.txt.gz
    gunzip greetings.txt.gz
    cat greetings.txt

Which outputs

    Hello world!
    Howdy world!

However, when trying to read this same file using Perl's core IO::Uncompress::Gunzip module <https://metacpan.org/pod/IO::Uncompress::Gunzip>, it doesn't get past the first original file. Here is the result:

    ./my_zcat greetings.txt.gz
    Hello world!

Here is the code for my_zcat:

    #!/bin/env perl
    use strict;
    use warnings;
    use v5.10;
    use IO::Uncompress::Gunzip qw($GunzipError);

    my $file_name = shift;
    my $fh = IO::Uncompress::Gunzip->new($file_name) or die $GunzipError;

    while (defined(my $line = readline $fh)) {
        print $line;
    }

If I totally decompress the files before creating a new gzipped file, I don't have this problem:

    zcat hello.txt.gz howdy.txt.gz | gzip > greetings_via_zcat.txt.gz
    ./my_zcat greetings_via_zcat.txt.gz
    Hello world!
    Howdy world!

So, what is the difference between greetings.txt.gz and greetings_via_zcat.txt.gz, and why might IO::Uncompress::Gunzip work correctly with greetings_via_zcat.txt.gz but not with greetings.txt.gz?

I'm guessing that IO::Uncompress::Gunzip messes up because of the metadata between the files. But, since greetings.txt.gz is a valid gzip file, I would expect IO::Uncompress::Gunzip to work.

My workaround for now will be piping from zcat (which of course doesn't help Windows users much):

    #!/bin/env perl
    use strict;
    use warnings;
    use v5.10;

    my $file_name = shift;

    # List-form open avoids passing the file name through a shell.
    open(my $fh, '-|', 'zcat', $file_name) or die "Cannot run zcat: $!";

    while (defined(my $line = readline $fh)) {
        print $line;
    }
As I already mentioned on Stack Overflow, this is covered in the FAQ, in the section "Dealing with concatenated gzip files". Marking this as resolved.
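
For anyone landing on this ticket: as far as I know, the approach that FAQ section describes is the MultiStream option to IO::Uncompress::Gunzip, which asks the module to continue into later gzip members instead of stopping after the first. The following is only a minimal sketch of that idea, not code from the ticket or the FAQ. It builds the two-member file in memory so it runs on its own, and the read_lines helper is a hypothetical name introduced for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Build a two-member gzip buffer in memory, roughly equivalent to
# `cat hello.txt.gz howdy.txt.gz > greetings.txt.gz`.
my @texts = ("Hello world!\n", "Howdy world!\n");
my $concatenated = '';
for my $text (@texts) {
    my $member;
    gzip(\$text => \$member) or die "gzip failed: $GzipError";
    $concatenated .= $member;
}

# Hypothetical helper: read all lines from a gzip buffer.
# Any extra options are passed straight through to new().
sub read_lines {
    my ($buffer_ref, %opts) = @_;
    my $fh = IO::Uncompress::Gunzip->new($buffer_ref, %opts)
        or die "gunzip failed: $GunzipError";
    my @lines;
    while (defined(my $line = $fh->getline)) {
        push @lines, $line;
    }
    $fh->close;
    return @lines;
}

# Default behaviour: stops at the end of the first gzip member.
my @default = read_lines(\$concatenated);

# MultiStream => 1: keeps reading across member boundaries.
my @multi = read_lines(\$concatenated, MultiStream => 1);

print "default:     ", scalar(@default), " line(s)\n";
print "MultiStream: ", scalar(@multi),   " line(s)\n";
print @multi;
```

Applied to the my_zcat script in the report, the same change would presumably be a single extra argument: IO::Uncompress::Gunzip->new($file_name, MultiStream => 1).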

