This queue is for tickets about the IO-Compress CPAN distribution.

Report information
The Basics
Id: 92368
Status: open
Priority: 0/
Queue: IO-Compress

People
Owner: Nobody in particular
Requestors: espie [...] nerim.net
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



From: Marc Espie <espie [...] nerim.net>
To: bug-IO-Compress [...] rt.cpan.org
Date: Wed, 22 Jan 2014 16:29:58 +0100
Subject: feature wanted: position in multi-part archives

Use case: I'm constructing a multi-part gzip archive:

    partA.gz partB.gz

I need to be able to go *back*, replace the first stream with something else, and then copy the subsequent streams verbatim, so that I end up with

    partA'.gz partB.gz

Performance is really an issue. I want to recreate a new file, with a tweaked partA, then an untweaked partB.

I can do that by looking at IO::Uncompress::Gunzip internals, e.g., pseudo code would look like:

    my $fh = IO::Uncompress::Gunzip->new('file.gz');
    # (<read first stream>)
    my $out = IO::Compress::Gzip->new('out.gz');
    # (<write modified first stream>)
    # XXX note location
    my $length = *$fh->{CompSize}->get64bit
               + *$fh->{Info}{HeaderLength}
               + *$fh->{Info}{TrailerLength};
    $fh->close;
    $out->close;
    # copy verbatim file:
    open($fh, '<', 'file.gz');
    $fh->seek($length, 0);
    open($out, '>>', 'out.gz');
    File::Copy::copy($fh, $out);
    close($fh);
    close($out);

I'd like to have something clean to do that. At the end of a stream, I want to know which position I need to seek to in the *compressed file* to be able to copy the next stream, e.g., the XXX line.
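For illustration, the position asked for at the XXX line can probably be derived without touching internals: once the first stream has been read to its end, the underlying filehandle's position minus the length of `trailingData()` should be the offset at which the next stream starts. The sketch below assumes the input is a real, seekable file opened with plain Perl I/O; `next_stream_offset` is a hypothetical helper name, not part of IO::Compress.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper (not part of IO::Compress): read the first gzip
# stream of $path and return the byte offset, in the compressed file,
# at which the next stream starts.
sub next_stream_offset {
    my ($path) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!\n";
    binmode $in;

    my $gz = IO::Uncompress::Gunzip->new($in)
        or die "gunzip failed: $GunzipError\n";

    # Drain the first stream. MultiStream is off by default, so reading
    # stops at the end of the first gzip member.
    my ($status, $buf);
    while (($status = $gz->read($buf)) > 0) { }
    die "read failed: $GunzipError\n" if $status < 0;

    # The decompressor usually reads past the end of the stream; what it
    # over-read is reported by trailingData().
    my $offset = tell($in) - length($gz->trailingData());

    $gz->close;
    close $in;
    return $offset;
}

# Build a two-stream test file, then locate the second stream.
my ($tmpfh, $file) = tempfile(SUFFIX => '.gz', UNLINK => 1);
close $tmpfh;

my ($part_a, $part_b) = ("part A\n", "part B\n");
gzip \$part_a => $file
    or die "gzip failed: $GzipError\n";
gzip \$part_b => $file, Append => 1
    or die "gzip failed: $GzipError\n";

my $offset = next_stream_offset($file);
printf "next stream starts at byte %d of %d\n", $offset, -s $file;
```

Seeking to that offset and copying to the end of the file would then reproduce the remaining streams byte for byte.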

From: Paul Marquess via RT
Date: Sun, 16 Feb 2014 14:42:49 -0500

Hey Marc,

here is a way to achieve what you want using the published API. I doubt it will be any quicker than your version, but there is less opening and closing of files needed.

cheers
Paul

#!/usr/bin/perl

use warnings;
use strict;

use IO::Compress::Gzip qw(:all);
use IO::Uncompress::Gunzip qw(:all);

my $infilename  = "/tmp/in.gz";
my $outfilename = "/tmp/out.gz";

my $text = <<EOM;
Mary had a little lamb
Its fleece was white as snow
EOM

# Create a gzip sequence to manipulate
gzip \$text => $infilename
    or die "gzip failed: $GzipError\n";
gzip \$text => $infilename, Append => 1
    or die "gzip failed: $GzipError\n";

# From here on is the code that replicates what you are doing.

# Start by opening the input & output gzip files using the standard
# Perl file I/O
open my $OUT, '>', $outfilename
    or die "Cannot open $outfilename: $!\n";
binmode $OUT;

open my $IN, '<', $infilename
    or die "Cannot open $infilename: $!\n";
binmode $IN;

# Now create the gzip & gunzip objects to read & write the first gzip data stream
# using the filehandles just created.
my $gzin = IO::Uncompress::Gunzip->new($IN)
    or die "gunzip failed: $GunzipError\n";
my $gzout = IO::Compress::Gzip->new($OUT)
    or die "gzip failed: $GzipError\n";

while (<$gzin>)
{
    # modify the first gzip stream
    my $x = uc $_ ;
    # and output it
    print $gzout $x
}

# Finished with the output gzip stream, so close it
# (this does not close the underlying $OUT filehandle)
close $gzout;

# Very likely the gunzipping of the first stream will
# have read a few bytes past the end of the stream, so
# write them to the output file
print $OUT $gzin->trailingData();

# now done with uncompressing the first gzip stream
close $gzin;

# From here on it's just standard file I/O.
# Use "read" rather than line based I/O for speed.
while (read($IN, my $buffer, 1024))
{
    print $OUT $buffer;
}

close $IN;
close $OUT;
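One way to check a file produced by a script like the one above is to decompress it twice: once with the default settings, which should stop after the first stream, and once with `MultiStream => 1`, which should return all streams concatenated. The sketch below builds its own two-stream file in a temporary location rather than reading /tmp/out.gz; the variable names and sample text are illustrative assumptions only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Stand-in for the rewritten archive: a modified first stream followed
# by an untouched second stream.
my ($tmpfh, $file) = tempfile(SUFFIX => '.gz', UNLINK => 1);
close $tmpfh;

my $part_a = "MARY HAD A LITTLE LAMB\n";
my $part_b = "Mary had a little lamb\n";

gzip \$part_a => $file
    or die "gzip failed: $GzipError\n";
gzip \$part_b => $file, Append => 1
    or die "gzip failed: $GzipError\n";

# Default behaviour: only the first gzip stream is decompressed.
my $first = '';
gunzip $file => \$first
    or die "gunzip failed: $GunzipError\n";

# MultiStream => 1: every stream in the file is decompressed in turn.
my $all = '';
gunzip $file => \$all, MultiStream => 1
    or die "gunzip failed: $GunzipError\n";

print "first stream:\n$first";
print "all streams:\n$all";
```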

From: Marc Espie <espie [...] nerim.net>
To: Paul Marquess via RT <bug-IO-Compress [...] rt.cpan.org>
Date: Sun, 16 Feb 2014 20:57:00 +0100
Subject: Re: [rt.cpan.org #92368] feature wanted: position in multi-part archives

On Sun, Feb 16, 2014 at 02:42:49PM -0500, Paul Marquess via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=92368 >
>
> Hey Marc,
>
> here is a way to achieve what you want using the published API. Doubt it will be any quicker than your version, but there is less opening and closing of files needed.
>
> [quoted script snipped]

I assume the same interface will work with bzip2? I expect to move to IO::Uncompress::Any soon.

Looks good.

I'll try that soon enough (not for this release as we're pressed for time), but I think I see how to deal with it.

Thanks a bunch, I'll let you know if I hit any snag.

From: Paul Marquess via RT

> I assume the same interface will work with bzip2? I expect to move to
> IO::Uncompress::Any soon.

Yes, exactly the same.

> Looks good.
>
> I'll try that soon enough (not for this release as we're pressed for
> time) but I think I see how to deal with it.
>
> Thanks a bunch, I'll let you know if I hit any snag.

Sure

Paul
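
As a rough sketch of the bzip2 case discussed above, the same pattern can be written with IO::Compress::Bzip2 and IO::Uncompress::Bunzip2. This assumes, as stated in the thread, that `trailingData()` behaves the same way for bzip2 input; `rewrite_first_stream`, the temporary file names, and the sample text are hypothetical and chosen only for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Temp qw(tempdir);
use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

# Hypothetical helper: pass each line of the first bzip2 stream of $src
# through $transform, recompress it into $dst, then copy the remaining
# streams verbatim.
sub rewrite_first_stream {
    my ($src, $dst, $transform) = @_;

    open my $in, '<', $src or die "Cannot open $src: $!\n";
    binmode $in;
    open my $out, '>', $dst or die "Cannot open $dst: $!\n";
    binmode $out;

    my $bzin = IO::Uncompress::Bunzip2->new($in)
        or die "bunzip2 failed: $Bunzip2Error\n";
    my $bzout = IO::Compress::Bzip2->new($out)
        or die "bzip2 failed: $Bzip2Error\n";

    while (defined(my $line = $bzin->getline)) {
        $bzout->print($transform->($line));
    }
    $bzout->close;    # ends the new first stream; $out stays open

    # Bytes the decompressor read past the end of the first stream.
    my $trailing = $bzin->trailingData();
    print {$out} $trailing if defined $trailing;
    $bzin->close;

    # The remaining streams are copied without being decompressed.
    while (read($in, my $buffer, 64 * 1024)) {
        print {$out} $buffer;
    }

    close $in;
    close $out or die "Cannot close $dst: $!\n";
}

# Demo: two identical streams, of which only the first is upper-cased.
my $dir = tempdir(CLEANUP => 1);
my ($src, $dst) = ("$dir/in.bz2", "$dir/out.bz2");
my $text = "Mary had a little lamb\n";

bzip2 \$text => $src
    or die "bzip2 failed: $Bzip2Error\n";
bzip2 \$text => $src, Append => 1
    or die "bzip2 failed: $Bzip2Error\n";

rewrite_first_stream($src, $dst, sub { uc $_[0] });

my $result = '';
bunzip2 $dst => \$result, MultiStream => 1
    or die "bunzip2 failed: $Bunzip2Error\n";
print $result;
```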

