This queue is for tickets about the Scalar-List-Utils CPAN distribution.

Report information
The Basics
Id: 95902
Status: open
Priority: 0/
Queue: Scalar-List-Utils

People
Owner: Nobody in particular
Requestors: DANAJ [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.38
Fixed in: (no value)



Subject: sum, min, and max use NV, truncating integers on 64-bit machines
The sum, min, and max routines use an NV type for operations on non-objects. For platforms with 64-bit UV and 64-bit NV (very common), this results in lost data, e.g.:

    use List::Util "sum";
    my $n = int(2**53);
    say $n+13;
    say int(sum($n,13));

There are more examples, e.g.

    use List::Util "sum";
    my $n = ~0 - 3000;
    say $n;
    say $n+1000;
    say int(sum($n,1000));
    say int(sum($n,1000,1000,1000,1000,1000));
    say int(sum($n, (1000) x 20000 ));

noting that in this example we can add 1000 as many times as we like and the result is unchanged.

Notes:

- With 32-bit UV and 64-bit NV we can't get in trouble.
- With Perl 5.6.2 we don't see it because Perl 5.6.2 is a horrible mess with 64-bit: it internally converts things to NVs all over the place. Thankfully fixed in 5.8+.
- Compiling with long double on gcc and x86_64 avoids the problem, as long doubles on this platform have 64-bit mantissas. Other compilers and architectures differ, and most people don't compile with long doubles.
- Don't sum large numbers or make big sums, and things are fine. I suspect this covers most people's use.
- Using bigint makes it work fine, at the expense of using bigint (super slow math).

A similar issue was brought up a couple years ago in https://rt.cpan.org/Ticket/Display.html?id=77457. Kevin and I both do some number theory, which is where getting inexact integer results causes havoc.

The same issue was seen in List::MoreUtils in https://rt.cpan.org/Ticket/Display.html?id=93207. The latter has an example of List::Util's min and max getting things wrong. Note that it was fixed in List::MoreUtils, so perhaps this will contain useful ideas.

Opinion: min and max aren't so bad to fix, since we return the SV* from the stack. We need to do more precise comparisons. sum is more problematic since we have to compute the running sum and change types if either we get a new type in input (e.g. sum(10,20,1.6)) or if we overflow. Also watch out for UV vs. IV.
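The precision loss described above can be reproduced outside Perl. The following is a minimal C sketch, assuming `double` is an IEEE 754 binary64 (53-bit mantissa) as on typical x86_64 builds; the function names `sum_as_double` and `sum_as_u64` are hypothetical and are not taken from the List::Util source. It only contrasts a floating-point accumulator with an integer one.

```c
#include <stddef.h>
#include <stdint.h>

/* Accumulate in double precision, mimicking an NV accumulator.
 * Each input is converted to double before being added. */
static double sum_as_double(const uint64_t *v, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double)v[i];
    return acc;
}

/* Accumulate in 64-bit unsigned integers.
 * Overflow wraps modulo 2^64 and is deliberately not checked here. */
static uint64_t sum_as_u64(const uint64_t *v, size_t n) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}
```

Calling both functions on `{ 1ULL << 53, 13 }` should give different answers, because the odd integer 2^53 + 13 has no exact binary64 representation. Calling `sum_as_double` on `UINT64_MAX - 3000` followed by any number of 1000s should keep returning the same value, since adjacent doubles near 2^64 are 2048 apart and each addition of 1000 rounds back down.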
Yes; all sounds a bit problematic. I think it should still be possible to build some better logic that keeps a UV, IV and NV and tracks which one is in use, only falling back to the NV if the UV or IV overflows.

I'll see what I can come up with

-- 
Paul Evans
On Sat May 31 13:27:05 2014, PEVANS wrote:
> Yes; all sounds a bit problematic. I think it should still be possible
> to build some better logic that keeps a UV, IV and NV and tracks which
> one is in use, only falling back to the NV if the UV or IV overflows.
>
> I'll see what I can come up with
As a first hack; see what you think to this:

https://github.com/Scalar-List-Utils/Scalar-List-Utils/commit/6d873d393d0fd6ab6e9a8bb1a5c47b5d2767be82

For now it only does IVs or NVs, as shortcuts to the full SV. The moment it encounters a UV it'll upgrade that to NV. I think it could be further improved to work in UVs as well, though this makes it trickier to handle negative values.

I'll have a go at putting similar in max/min as well.

-- 
Paul Evans
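The general idea of "stay in integers, fall back to floating point only on overflow" can be sketched in plain C as below. This is not the code from the linked commit, which is not reproduced in this ticket; the type `sum_result` and the function `sum_iv_then_nv` are hypothetical names, and `__builtin_add_overflow` is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: exactly one of iv/nv is meaningful. */
typedef struct {
    int     is_int; /* 1 if the sum stayed exact in iv, 0 if it fell back to nv */
    int64_t iv;     /* exact integer sum, valid while is_int == 1 */
    double  nv;     /* floating-point sum, valid once is_int == 0 */
} sum_result;

static sum_result sum_iv_then_nv(const int64_t *v, size_t n) {
    sum_result r = { 1, 0, 0.0 };
    for (size_t i = 0; i < n; i++) {
        if (r.is_int) {
            int64_t next;
            /* GCC/Clang builtin: returns nonzero if the signed add overflowed. */
            if (!__builtin_add_overflow(r.iv, v[i], &next)) {
                r.iv = next;
                continue;
            }
            /* Overflow: carry the exact partial sum over to floating point
             * and stay in floating point for the remaining elements. */
            r.is_int = 0;
            r.nv = (double)r.iv;
        }
        r.nv += (double)v[i];
    }
    return r;
}
```

With this shape, `{ 1LL << 53, 13 }` stays in the integer branch and sums exactly, while a list whose running total exceeds `INT64_MAX` ends in the floating-point branch. Unsigned inputs are out of scope for this sketch, which matches the "IVs or NVs" limitation mentioned above.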
For what it's worth, I also just ran into this bug. I've "fixed" it in our code base by using reduce { $a + $b } instead of sum().

  DB<31> x sum(644288760976000000, 645589752196000000, 10213291038976000000, 618072702976000000, 645589752196000000)
0  '-5.67991206638955162e+18'
  DB<39> x reduce { $a + $b } (644288760976000000, 645589752196000000, 10213291038976000000, 618072702976000000, 645589752196000000)
0  12766832007320000000

----------

An especially funky result is that you can get different results on different orderings of input:

  DB<43> x sum @a;
0  '-5.67991206638955162e+18'
  DB<44> x sum sort @a;
0  '1.276683200732e+19'

Note that the latter gives the correct result (1.276683200732e+19 == 12766832007320000000).

Super weird. I can't comment on the proposed fix I'm afraid - the test case looks good though :-)
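One observation about the numbers in this report: 10213291038976000000 is larger than the signed 64-bit maximum (9223372036854775807) but still fits in an unsigned 64-bit integer, as does the total. The C sketch below checks that with an overflow-guarded unsigned accumulator. It is only an illustration of why unsigned handling matters for these inputs; `sum_u64_checked` and `ticket_values` are hypothetical names, and nothing here is a claim about what caused the negative result in the debugger session.

```c
#include <stddef.h>
#include <stdint.h>

/* The five values from the debugger session in this ticket. */
static const uint64_t ticket_values[5] = {
    644288760976000000ULL,
    645589752196000000ULL,
    10213291038976000000ULL, /* exceeds INT64_MAX, fits in uint64_t */
    618072702976000000ULL,
    645589752196000000ULL,
};

/* Returns 1 and stores the exact sum in *out if it fits in 64 unsigned bits.
 * Returns 0 without touching *out if any partial sum would overflow; a caller
 * might then fall back to floating point. */
static int sum_u64_checked(const uint64_t *v, size_t n, uint64_t *out) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        if (acc > UINT64_MAX - v[i])
            return 0; /* acc + v[i] would wrap past 2^64 - 1 */
        acc += v[i];
    }
    *out = acc;
    return 1;
}
```

Running `sum_u64_checked(ticket_values, 5, &out)` should succeed and leave the same integer in `out` that the reduce { $a + $b } workaround printed.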

