MIME-Version: | 1.0 |
X-Spam-Status: | No, score=-1.998 tagged_above=-99.9 required=10 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001] autolearn=ham |
X-Spam-Flag: | NO |
Content-Type: | multipart/alternative; boundary="94eb2c062a383d02d9053f3635a1" |
Message-ID: | <CAP19m1f--NyRfB9v_Af3zydRYOm2mRvAor_Tqsn9tj=681EtaQ [...] mail.gmail.com> |
X-Received: | by 10.31.238.74 with SMTP id m71mr4656085vkh.27.1476876937526; Wed, 19 Oct 2016 04:35:37 -0700 (PDT) |
X-Virus-Scanned: | Debian amavisd-new at bestpractical.com |
X-Spam-Score: | -1.998 |
Received: | from localhost (localhost [127.0.0.1]) by hipster.bestpractical.com (Postfix) with ESMTP id B8C3A240279 for <cpan-bug+Apache-Tika [...] hipster.bestpractical.com>; Wed, 19 Oct 2016 07:35:50 -0400 (EDT) |
Received: | from hipster.bestpractical.com ([127.0.0.1]) by localhost (hipster.bestpractical.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tn5B1qQ4J9tb for <cpan-bug+Apache-Tika [...] hipster.bestpractical.com>; Wed, 19 Oct 2016 07:35:47 -0400 (EDT) |
Received: | from la.mx.develooper.com (x1.develooper.com [207.171.7.70]) by hipster.bestpractical.com (Postfix) with SMTP id B1637240028 for <bug-Apache-Tika [...] rt.cpan.org>; Wed, 19 Oct 2016 07:35:45 -0400 (EDT) |
Received: | (qmail 26712 invoked by alias); 19 Oct 2016 11:35:44 -0000 |
Received: | from mail-vk0-f50.google.com (HELO mail-vk0-f50.google.com) (209.85.213.50) by la.mx.develooper.com (qpsmtpd/0.28) with ESMTP; Wed, 19 Oct 2016 04:35:42 -0700 |
Received: | by mail-vk0-f50.google.com with SMTP id b186so23645623vkb.1 for <bug-Apache-Tika [...] rt.cpan.org>; Wed, 19 Oct 2016 04:35:41 -0700 (PDT) |
Received: | by 10.176.69.141 with HTTP; Wed, 19 Oct 2016 04:35:37 -0700 (PDT) |
Authentication-Results: | hipster.bestpractical.com (amavisd-new); dkim=pass header.i= [...] gmail.com |
Delivered-To: | cpan-bug+Apache-Tika [...] hipster.bestpractical.com |
Subject: | CommonsDigester calculates wrong hashes on large files |
Return-Path: | <yahavamsi [...] gmail.com> |
X-RT-Mail-Extension: | apache-tika |
X-Original-To: | cpan-bug+Apache-Tika [...] hipster.bestpractical.com |
X-Spam-Check-BY: | la.mx.develooper.com |
Dkim-Signature: | v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=AsC5Fiq1P9SlmWSXNm8KDR1Awr++wLdvV/aJgECCgl8=; b=xn34aXK7V7rMQMdRkjj9X6rOGDUBhXXVT4FzHOeZraesd1CbZF0oR5AOEZKNnACzdE aISGtMORUsABj3Rm/IFeLRKIlV+H7zYXm7niyb7r5hBgxGwOFPAzAF9XKqSVDLQT5+2/ I9bVTZEMbuSj7IKNyqcvF+MX5w9ADpnjrHHpGa0FmOIJEATqF65ta2ar5r+tNj1F+eyX 9xqThO5oCRDucUDRU3DaAVe51YjYDAjIUGLQskSHg1dCe7Lbv/5OgnFRls1RgmhL4XZg q5RvTqkw2sEDG+i/rARhi9w6KusxzDqWuexxdA/A6nQs/sVHQITdGt6q60TdokCKhvG8 OX/g== |
X-Google-Dkim-Signature: | v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=AsC5Fiq1P9SlmWSXNm8KDR1Awr++wLdvV/aJgECCgl8=; b=FvF6UG24K1UCUjccYNVPcnq2lVNLxcaWkoXLfNcPef/KZp48JyUYJk9bCKyjRlT0ro M2eEQwA000NjPR4yEaLZynWiQj9vTncbkzUVuOdf/uJrRDY/5nINuBONMRf1PkHfeGzf XEU3hAXfX21gQhxKfgNBWCqyhVmZwysm7oSTiivantRkm+4+d/CZOWM1FMIvUpsjgiou Mqft9lWyOS1hv1nP1f9ZIVEyE9cdSQ4ItzTQgsZdXjOG+KEYxeg4dtIe+z22qt7yScWB pA14UAo/NJDFfYQJAQXZNGvnUfUlf9OZ/Ewc3o93G/Pfj4I39CGvPNqGsys3cjWVEJm1 ALiw== |
Date: | Wed, 19 Oct 2016 14:35:37 +0300 |
X-Spam-Level: | |
To: | bug-Apache-Tika [...] rt.cpan.org |
From: | Yahav Amsalem <yahavamsi [...] gmail.com> |
X-GM-Message-State: | AA6/9RkxkRUfC3KZaePja8ECbu1/O18ej4UM+rqbCdFcUXqmfdBlkTikmqDJXfj0i2hYayLdx7luS1yiVudXjw== |
X-RT-Interface: | |
Content-Length: | 0 |
content-type: | text/plain; charset="utf-8" |
X-RT-Original-Encoding: | utf-8 |
Content-Length: | 1200 |
Hi,
I would like to report the following bug:
When more than one algorithm is passed to the CommonsDigester constructor
and a file larger than 7.5 MB is then digested, the calculated hashes are
wrong for every algorithm except the first.
The following code reproduces the bug:
// The file that was used was a simple plain text file with size > 7.5 MB
File file = new File("c:\\testLargeFile.txt");
BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(file));
Metadata metadata = new Metadata();
CommonsDigester digester = new CommonsDigester(20000000,
        CommonsDigester.DigestAlgorithm.MD5,
        CommonsDigester.DigestAlgorithm.SHA1,
        CommonsDigester.DigestAlgorithm.SHA256);
digester.digest(bufferedInputStream, metadata, null);
// Will print correct MD5 but wrong SHA1 and wrong SHA256
System.out.println(metadata);
Initial direction: after a little research, it seems that the internal
buffered stream is not reset to position 0 after the first algorithm has
been computed.
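For comparison, here is a minimal sketch of one way to obtain several hashes without rewinding the stream at all: read the input once and feed every chunk to each digest. It uses only the JDK's java.security.MessageDigest and is not how CommonsDigester is implemented; the class name MultiDigestSketch and the method digestAll are hypothetical names chosen for illustration. Its output may also serve as a reference when checking the values Tika reports.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Apache Tika.
public class MultiDigestSketch {

    // Reads the stream exactly once and updates every digest with each chunk,
    // so no mark/reset on the underlying stream is needed.
    static Map<String, String> digestAll(InputStream in, String... algorithms)
            throws IOException, NoSuchAlgorithmException {
        Map<String, MessageDigest> digests = new LinkedHashMap<>();
        for (String algorithm : algorithms) {
            digests.put(algorithm, MessageDigest.getInstance(algorithm));
        }

        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            for (MessageDigest digest : digests.values()) {
                digest.update(buffer, 0, read);
            }
        }

        // Convert each finished digest to a lowercase hex string.
        Map<String, String> hex = new LinkedHashMap<>();
        for (Map.Entry<String, MessageDigest> entry : digests.entrySet()) {
            StringBuilder sb = new StringBuilder();
            for (byte b : entry.getValue().digest()) {
                sb.append(String.format("%02x", b));
            }
            hex.put(entry.getKey(), sb.toString());
        }
        return hex;
    }

    public static void main(String[] args) throws Exception {
        // 10 MB of zero bytes stands in for the large test file (an assumption).
        byte[] data = new byte[10 * 1024 * 1024];
        Map<String, String> result =
                digestAll(new ByteArrayInputStream(data), "MD5", "SHA-1", "SHA-256");
        result.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Replacing the ByteArrayInputStream in main with a FileInputStream for the test file should give values that can be compared against the SHA1 and SHA256 entries printed from the Metadata object.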
If there are any further questions, I would be happy to provide more details.
Thanks,
Yahav Amsalem