
This queue is for tickets about the PDF-API2 CPAN distribution.

Report information
The Basics
Id: 48683
Status: open
Priority: 0/
Queue: PDF-API2

People
Owner: Nobody in particular
Requestors: cherdt [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.73
Fixed in: (no value)



Subject: PDF-API2 throws error "Malformed xref in PDF file" for newer PDFs (PDF v1.5+)
Although PDF-API2 works well for older PDF formats (1.4), newer PDF formats (1.5, 1.6) cause it to throw an error:

    Malformed xref in PDF file at [path to File.pm] line 1198

I can reproduce the error with the following Perl script:

    use PDF::API2;
    $pdf = PDF::API2->open($ARGV[0]);

The PDFs I am using were produced by Adobe Acrobat Pro 9 (the attached file is an example). I saw the possibly related bug report submitted by abhinavk, but his fix did not work for me.

Environment: Perl v5.10.0 (ActiveState) on WinXP Pro 2002 SP3.
Subject: HowToArgueEffectively.pdf
Download HowToArgueEffectively.pdf
application/pdf 14.9k

Message body not shown because it is not plain text.

Subject: [rt.cpan.org #48683]
Date: Fri, 14 Aug 2009 11:41:47 -0500
To: bug-PDF-API2 [...] rt.cpan.org
From: Chris Herdt <cherdt [...] gmail.com>
I have since successfully used PDF::API2 to modify version 1.6 PDFs, so I'm not certain why the particular PDF in question produced the error. Perhaps that file in particular has an unfriendly xref value, which is apparently modified or removed when saved as an earlier PDF version.
On Fri Aug 14 12:42:10 2009, cherdt wrote:
> I have since successfully used PDF::API2 to modify version 1.6 PDFs,
> so I'm not certain why the particular PDF in question produced the
> error. Perhaps that file in particular has an unfriendly xref value,
> which is apparently modified or removed when saved as an earlier PDF
> version.
As of PDF 1.5, the spec allows xref information to be stored in streams instead of tables. This isn't supported by PDF::API2 (though I'll be very happy if someone beats me to fixing that and sends a patch!).

Acrobat 9 started using cross-reference streams by default, so this error is more common with newer files. PDF::API2 will work fine if you generate a PDF in Acrobat 9 without using cross-reference streams, however. The easiest way to do this is to make it compatible with Acrobat 5.0 and later when you save.
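To check which kind of cross-reference section a given file uses before handing it to PDF::API2, something like the following may help. It is a minimal sketch using only core Perl, not part of PDF::API2: the `xref_kind` name is hypothetical, and it assumes the last `startxref` keyword lies within the final 1024 bytes of the file and points directly at either an `xref` table or an `N G obj` header (a cross-reference stream object).

```perl
use strict;
use warnings;

# Hypothetical helper (not a PDF::API2 function): classify the
# cross-reference section that the last "startxref" points at.
# Returns 'table', 'stream', or 'unknown'.
sub xref_kind {
    my ($path) = @_;

    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $size = -s $fh;

    # Assumption: the startxref keyword sits in the last 1024 bytes.
    my $tail_len = $size < 1024 ? $size : 1024;
    seek($fh, $size - $tail_len, 0);
    read($fh, my $tail, $tail_len);

    # The greedy .* makes this pick the last startxref in the tail.
    my ($offset) = $tail =~ /.*startxref\s+(\d+)/s;
    unless (defined $offset) {
        close($fh);
        return 'unknown';
    }

    # Look at the first few bytes at the recorded offset.
    seek($fh, $offset, 0);
    read($fh, my $head, 32);
    close($fh);
    $head = '' unless defined $head;

    return 'table'  if $head =~ /^\s*xref\b/;                 # classic table
    return 'stream' if $head =~ /^\s*\d+\s+\d+\s+obj\b/;      # stream object
    return 'unknown';
}

# Usage: perl xref_kind.pl some.pdf
print xref_kind($ARGV[0]), "\n" if @ARGV;
```

A result of `stream` suggests the file needs to be re-saved for Acrobat 5.0 compatibility (or read with a version that supports cross-reference streams) before PDF::API2 can open it.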
From: pwomack [...] papermule.co.uk
> Acrobat 9 started using cross-reference streams by default, so this
> error is more common with newer files. PDF::API2 will work fine if you
> generate a PDF in Acrobat 9 without using cross-reference streams,
> however. The easiest way to do this is to make it compatible with
> Acrobat 5.0 and later when you save.
With the passage of time, these files are becoming more common. I looked into adding cross-reference stream support myself, but it is too complex to be a "patch"; it's a significant piece of implementation. So can I just "+1" the importance of this?

BugBear
Version 2.020, just released, contains two updates relevant to this issue:

1) If PDF::API2 encounters a cross-reference stream, it will now give a more appropriate error message rather than saying that the cross-reference table is malformed.

2) The Known Issues section of the POD contains pointers to the PDF specification, which describes how both the old cross-reference table works and how the new cross-reference streams work.
From: don.huettl [...] grantstreet.com
I have attached three patches to implement read-only support for cross-reference streams and compressed objects. Saving the results will still write a v1.4 document.

Patches should be applied in the following order:

PDF-API2-2.023-XRefStm.patch
PDF-API2-2.023-Predictor.patch
PDF-API2-2.023-XRef-test.patch

The unit test that I added needs the example document attached to this ticket to be saved as t/resources/HowToArgueEffectively.pdf.

If this is applied, please credit my employer, Grant Street Group <gsg@cpan.org>, in addition to myself.
Subject: PDF-API2-2.023-XRefStm.patch

Message body is not shown because it is too large.

Subject: PDF-API2-2.023-XRef-test.patch
commit c198a9745c7a
Author: Don Huettl <don.huettl@grantstreet.com>
Date:   Thu Mar 19 15:28:53 2015 -0400

    tests to validate cross-reference stream logic

    Adds a reference PDF document containing XRef streams, and the associated unit tests.

diff --git a/PDF-API2/t/resources/HowToArgueEffectively.pdf b/PDF-API2/t/resources/HowToArgueEffectively.pdf
new file mode 100644
index 000000000000..8bfd9482b940
Binary files /dev/null and b/PDF-API2/t/resources/HowToArgueEffectively.pdf differ
diff --git a/PDF-API2/t/xref.t b/PDF-API2/t/xref.t
new file mode 100644
index 000000000000..0280251f76b3
--- /dev/null
+++ b/PDF-API2/t/xref.t
@@ -0,0 +1,27 @@
+use Test::More tests => 2;
+
+use warnings;
+use strict;
+
+use PDF::API2;
+
+my $pdf = eval {
+    PDF::API2->open('t/resources/HowToArgueEffectively.pdf');
+};
+
+isa_ok($pdf, 'PDF::API2', q{doc containing an XRef stream});
+
+my $file = $pdf->{pdf};
+my $pass = 1;
+
+while (my($id, $xref) = each %{$file->{' xref'}}) {
+    my $obj = $file->read_objnum($id, $xref->[1]);
+
+    unless (ref($obj)) {
+        $pass = 0;
+        last;
+    }
+}
+
+ok($pass, 'all XRef entries point to an object');
+
Subject: PDF-API2-2.023-Predictor.patch

Message body is not shown because it is too large.
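Since the Predictor patch body is not shown here, the following is not its code. It is a minimal, independent sketch of what undoing a PNG-style predictor involves for the common "Up" case (PNG row filter type 2, typically signalled as /Predictor 12 in a stream's decode parameters). The `depredict_png_up` name is hypothetical, the input is assumed to be already inflated, and the other PNG filter types (Sub, Average, Paeth) are deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: undo PNG row prediction on inflated stream data.
# $columns is the number of data bytes per row (the /Columns value).
# Only filter types 0 (None) and 2 (Up) are handled in this sketch.
sub depredict_png_up {
    my ($data, $columns) = @_;

    my $row_len = $columns + 1;    # one filter-type byte precedes each row
    die "Data length is not a multiple of the row length"
        if length($data) % $row_len;

    my @prev = (0) x $columns;     # the row above the first row is all zeros
    my $out  = '';

    for (my $pos = 0; $pos < length($data); $pos += $row_len) {
        my ($type, @row) = unpack('C*', substr($data, $pos, $row_len));

        if ($type == 2) {
            # Up: each byte was stored as a difference from the byte above.
            $row[$_] = ($row[$_] + $prev[$_]) & 0xFF for 0 .. $#row;
        }
        elsif ($type != 0) {
            die "Unsupported PNG filter type: $type";
        }

        $out .= pack('C*', @row);
        @prev = @row;
    }

    return $out;
}
```

Dying on unsupported filter types, rather than warning and returning partial data, matches the direction of the clean-up patch in this ticket, which replaces `warn` with `die` in the predictor code.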

From: don.huettl [...] grantstreet.com
I have one more patch that does a little clean-up, attached.
Subject: PDF-API2-Predictor-pt2.patch
diff --git PDF-API2/lib/PDF/API2/Basic/PDF/Filter/Predictor.pm PDF-API2/lib/PDF/API2/Basic/PDF/Filter/Predictor.pm
index 7d2c388dcfc0..813951d0f6fc 100644
--- PDF-API2/lib/PDF/API2/Basic/PDF/Filter/Predictor.pm
+++ PDF-API2/lib/PDF/API2/Basic/PDF/Filter/Predictor.pm
@@ -22,8 +22,7 @@ sub new {
 sub outfilt {
     my ($self) = @_;
 
-    warn 'The "outfilt" method is not implemented';
-    return;
+    die 'The "outfilt" method is not implemented';
 }
 
 sub infilt {
@@ -44,7 +43,7 @@ sub infilt {
     } elsif ($predictor >= 10 && $predictor <= 15) {
         $self->_depredict_png;
     } else {
-        warn "Invalid predictor: $predictor";
+        die "Invalid predictor: $predictor";
     }
 
     return $obj->{' stream'};
@@ -133,7 +132,7 @@ sub _depredict_png {
 sub _depredict_tiff {
     my ($self) = @_;
 
-    warn "The TIFF predictor logic has not been implemented";
+    die "The TIFF predictor logic has not been implemented";
 }
 
 1;
diff --git PDF-API2/lib/PDF/API2/Resource/XObject/Image/PNG.pm PDF-API2/lib/PDF/API2/Resource/XObject/Image/PNG.pm
index bdf3356a9f8d..3fd5832cb675 100644
--- PDF-API2/lib/PDF/API2/Resource/XObject/Image/PNG.pm
+++ PDF-API2/lib/PDF/API2/Resource/XObject/Image/PNG.pm
@@ -8,6 +8,7 @@ use POSIX qw(ceil);
 
 use IO::File;
 use PDF::API2::Util;
+use PDF::API2::Basic::PDF::Filter::Predictor;
 use PDF::API2::Basic::PDF::Utils;
 
 no warnings qw[ deprecated recursion uninitialized ];
@@ -31,7 +32,10 @@ sub new {
     open($fh,$file);
     binmode($fh);
     seek($fh,8,0);
+    $self->{Length}=PDFNum(-s $file);
     $self->{' stream'}='';
+    $self->{' streamloc'}=0;
+    $self->{' streamsrc'}=$fh;
     $self->{' nofilt'}=1;
     while(!eof($fh)) {
         read($fh,$buf,4);

