This queue is for tickets about the IO-Compress CPAN distribution.

Report information
The Basics
Id: 92368
Status: open
Priority: 0/
Queue: IO-Compress

People
Owner: Nobody in particular
Requestors: espie [...] nerim.net
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: feature wanted: position in multi-part archives
Date: Wed, 22 Jan 2014 16:29:58 +0100
To: bug-IO-Compress [...] rt.cpan.org
From: Marc Espie <espie [...] nerim.net>
Use case: I'm constructing a multi-part gzip archive:

    partA.gz partB.gz

I need to be able to go *back*, replace the first stream with something
else, and then copy the subsequent streams verbatim, so that I end up with

    partA'.gz partB.gz

Performance is really an issue.

I want to create a new file, with a tweaked partA, then an untweaked partB.

I can do that by looking at IO::Uncompress::Gunzip internals.

E.g., pseudo code would look like:

    my $fh = IO::Uncompress::Gunzip->new('file.gz');
    #(<read first stream>)
    my $out = IO::Compress::Gzip->new('out.gz');
    #(<write modified first stream>)
    # XXX note location
    $length = *$fh->{CompSize}->get64bit + *$fh->{Info}{HeaderLength} + *$fh->{Info}{TrailerLength};
    $fh->close;
    $out->close;
    # copy verbatim file:
    open($fh, '<', 'file.gz');
    $fh->seek($length, 0);
    open($out, '>>', 'out.gz');
    File::Copy::copy($fh, $out);
    close($fh);
    close($out);

I'd like to have something clean to do that. At the end of a stream, I want
to know which position I need to seek to in the *compressed file* to be able
to copy the next stream, e.g., the XXX line.
Hey Marc,

here is a way to achieve what you want using the published API. Doubt it will be any quicker than your version, but there is less opening and closing of files needed.

cheers
Paul

    #!/usr/bin/perl

    use warnings;
    use strict;

    use IO::Compress::Gzip qw(:all);
    use IO::Uncompress::Gunzip qw(:all);


    my $infilename = "/tmp/in.gz";
    my $outfilename = "/tmp/out.gz";

    my $text = <<EOM;
    Mary had a little lamb
    It's fleece was white as snow
    EOM

    # Create a gzip sequence to manipulate
    gzip \$text => $infilename;
    gzip \$text => $infilename, Append => 1;


    # From here on is the code that replicates what you are doing.

    # Start by opening the input & output gzip files using the standard
    # Perl file I/O
    open my $OUT, ">$outfilename"
        or die "Cannot open $outfilename: $!\n";

    open my $IN, "<$infilename"
        or die "Cannot open $infilename: $!\n";

    # Now create the gzip & gunzip objects to read & write the first gzip data stream
    # using the filehandles just created.
    my $gzin = IO::Uncompress::Gunzip->new($IN)
        or die "gunzip failed: $GunzipError\n";
    my $gzout = IO::Compress::Gzip->new($OUT)
        or die "gzip failed: $GzipError\n";

    while (<$gzin>)
    {
        # modify the first gzip stream
        my $x = uc $_ ;
        # and output it
        print $gzout $x
    }

    # Finished with the output gzip file, so close it
    close $gzout;

    # Very likely the gunzipping of the first stream will
    # have read a few bytes past the end of the stream, so
    # write them to the output file
    print $OUT $gzin->trailingData();

    # now done with uncompressing the first gzip stream
    close $gzin;

    # From here on it's just standard file I/O.
    # Use "read" rather than line based I/O for speed.
    while (read($IN, my $buffer, 1024))
    {
        print $OUT $buffer;
    }

    close $IN;
    close $OUT;
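The script above relies on one property: once the first stream has been fully read, the underlying filehandle sits just past whatever the decompressor buffered, and trailingData() returns the buffered bytes that belong to the next stream. Under that assumption, the offset asked for in the original request (the XXX line) could probably be derived as tell() on the filehandle minus the length of the trailing data. The following is a minimal sketch, not a confirmed recipe; the temporary file and the variable names are hypothetical and exist only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Temp             qw(tempfile);
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

my ($first, $second) = ("first stream\n", "second stream\n");

# Hypothetical scratch file holding a two-stream gzip archive.
my ($tmpfh, $infilename) = tempfile(SUFFIX => '.gz', UNLINK => 1);
close $tmpfh;

gzip(\$first => $infilename)
    or die "gzip failed: $GzipError\n";
my $first_size = -s $infilename;    # known size of stream 1, for comparison
gzip(\$second => $infilename, Append => 1)
    or die "gzip failed: $GzipError\n";

open my $IN, '<', $infilename
    or die "Cannot open $infilename: $!\n";
binmode $IN;

my $gzin = IO::Uncompress::Gunzip->new($IN)
    or die "gunzip failed: $GunzipError\n";

# Drain the first stream; read() returns 0 at the end of the stream
# and a negative value on error.
my ($buf, $status);
do { $status = $gzin->read($buf) } while $status > 0;
die "read failed: $GunzipError\n" if $status < 0;

# Assumption: bytes read from $IN but not consumed by stream 1 are
# exactly what trailingData() returns.
my $trailing = $gzin->trailingData() // '';
my $offset   = tell($IN) - length($trailing);

print "next stream starts at byte $offset (stream 1 is $first_size bytes)\n";

close $gzin;
close $IN;
```

If the assumption holds, $offset equals the compressed size of the first stream, and a plain seek to $offset followed by a verbatim copy would pick up the second stream.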
Subject: Re: [rt.cpan.org #92368] feature wanted: position in multi-part archives
Date: Sun, 16 Feb 2014 20:57:00 +0100
To: Paul Marquess via RT <bug-IO-Compress [...] rt.cpan.org>
From: Marc Espie <espie [...] nerim.net>
On Sun, Feb 16, 2014 at 02:42:49PM -0500, Paul Marquess via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=92368 >
>
> Hey Marc,
>
> here is a way to achieve what you want using the published API. Doubt it will be any quicker than your version, but there is less opening and closing of files needed.
>
> cheers
> Paul
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
> use IO::Compress::Gzip qw(:all);
> use IO::Uncompress::Gunzip qw(:all);
>
>
> my $infilename = "/tmp/in.gz";
> my $outfilename = "/tmp/out.gz";
>
> my $text = <<EOM;
> Mary had a little lamb
> It's fleece was white as snow
> EOM
>
> # Create a gzip sequence to manipulate
> gzip \$text => $infilename;
> gzip \$text => $infilename, Append => 1;
>
>
> # From here on is the code that replicates what you are doing.
>
> # Start by opening the input & output gzip files using the standard
> # Perl file I/O
> open my $OUT, ">$outfilename"
>     or die "Cannot open $outfilename: $!\n";
>
> open my $IN, "<$infilename"
>     or die "Cannot open $infilename: $!\n";
>
> # Now create the gzip & gunzip objects to read & write the first gzip data stream
> # using the filehandles just created.
> my $gzin = IO::Uncompress::Gunzip->new($IN)
>     or die "gunzip failed: $GunzipError\n";
> my $gzout = IO::Compress::Gzip->new($OUT)
>     or die "gzip failed: $GzipError\n";
>
> while (<$gzin>)
> {
>     # modify the first gzip stream
>     my $x = uc $_ ;
>     # and output it
>     print $gzout $x
> }
>
> # Finished with the output gzip file, so close it
> close $gzout;
>
> # Very likely the gunzipping of the first stream will
> # have read a few bytes past the end of the stream, so
> # write them to the output file
> print $OUT $gzin->trailingData();
>
> # now done with uncompressing the first gzip stream
> close $gzin;
>
> # From here on it's just standard file I/O.
> # Use "read" rather than line based I/O for speed.
> while (read($IN, my $buffer, 1024))
> {
>     print $OUT $buffer;
> }
>
> close $IN;
> close $OUT;
I assume the same interface will work with bzip2 ? I expect to move to
IO::Uncompress::Any soon.

Looks good.

I'll try that soon enough (not for this release as we're pressed for time)
but I think I see how to deal with it.

Thanks a bunch, I'll let you know if I hit any snag.
> I assume the same interface will work with bzip2 ? I expect to move to
> IO::Uncompress::Any soon.

Yes, exactly the same.

> Looks good.
>
> I'll try that soon enough (not for this release as we're pressed for
> time)
> but I think I see how to deal with it.
>
> Thanks a bunch, I'll let you know if I hit any snag.

Sure

Paul
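
As an illustration of the bzip2 case discussed above, here is a minimal sketch of the same rewrite-first-stream, copy-the-rest approach using IO::Compress::Bzip2 and IO::Uncompress::Bunzip2. It assumes, as stated in the reply, that trailingData() behaves the same way for bzip2 input. The temporary file names and sample text are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Temp              qw(tempfile);
use IO::Compress::Bzip2     qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

my $text = "Mary had a little lamb\n";

# Hypothetical scratch files for the input and output archives.
my ($infh,  $infilename)  = tempfile(SUFFIX => '.bz2', UNLINK => 1);
my ($outfh, $outfilename) = tempfile(SUFFIX => '.bz2', UNLINK => 1);
close $infh;
close $outfh;

# Create a two-stream bzip2 file to manipulate.
bzip2(\$text => $infilename)
    or die "bzip2 failed: $Bzip2Error\n";
bzip2(\$text => $infilename, Append => 1)
    or die "bzip2 failed: $Bzip2Error\n";

open my $OUT, '>', $outfilename
    or die "Cannot open $outfilename: $!\n";
open my $IN, '<', $infilename
    or die "Cannot open $infilename: $!\n";
binmode $_ for $IN, $OUT;

my $bzin = IO::Uncompress::Bunzip2->new($IN)
    or die "bunzip2 failed: $Bunzip2Error\n";
my $bzout = IO::Compress::Bzip2->new($OUT)
    or die "bzip2 failed: $Bzip2Error\n";

# Rewrite the first stream (upper-cased here as a stand-in for the tweak).
while (my $line = <$bzin>) {
    print {$bzout} uc $line;
}
close $bzout;

# Bytes read past the end of stream 1 belong to stream 2: copy them as-is.
print {$OUT} $bzin->trailingData() // '';
close $bzin;

# Copy the remaining compressed data verbatim.
while (read($IN, my $buffer, 65536)) {
    print {$OUT} $buffer;
}
close $IN;
close $OUT or die "Cannot close $outfilename: $!\n";

# Read back both streams to check the result.
my $result;
bunzip2($outfilename => \$result, MultiStream => 1)
    or die "bunzip2 failed: $Bunzip2Error\n";
print $result;
```

Only the module names and error variables differ from the gzip version; the filehandle handling is unchanged.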

