Re: Problems with sum in textutils
From: Jim Meyering
Subject: Re: Problems with sum in textutils
Date: Sat, 27 Oct 2001 18:39:22 +0200
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.0.107
Thanks for the report, but that's not a bug in the newer version.
In 2.0f, I applied this patch:
2000-06-22 Bruno Haible <address@hidden>
* src/sum.c (sysv_sum_file): Avoid overflowing 32-bit accumulator
on files larger than 256 MB.
The only problem is that the above comment is inaccurate.
(I've just fixed it.)
In reality, the overflow in the old version could happen
with files as `small' as 16843010 bytes.
That's floor ((2^32) / 255 + 1), the smallest number of 0xff bytes
whose sum no longer fits in 32 bits.
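The threshold is easy to check mechanically. Here is a minimal sketch in C, assuming the worst case of a file made up entirely of 0xff bytes and a 32-bit accumulator; the function name is hypothetical and does not come from sum.c.

```c
#include <stdint.h>

/* Smallest number of 0xff bytes whose plain sum no longer fits in a
   32-bit accumulator.  Hypothetical helper, not part of sum.c.  */
uint64_t
min_overflowing_length (void)
{
  const uint64_t limit = UINT64_C (1) << 32;    /* 2^32 */
  return limit / 255 + 1;                       /* floor (2^32 / 255) + 1 */
}
```

Calling min_overflowing_length () returns 16843010; one byte fewer gives a sum of exactly 2^32 - 1, which still fits.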
To demonstrate, remember that the first number in the output of
`sum -s' is the sum of all bytes modulo 0xffff (aka 65535).
So, consider a file consisting of 16843009 bytes (one less than
that magic number), each with value 0xff. We can compute the first
number in `sum -s' output using bc:
$ echo '(16843009 * 255) % 65535' |bc
0
Do the same, but with one more byte:
$ echo '(16843010 * 255) % 65535' |bc
255
Looks fine, right?
But see what happens when we simulate 32-bit two's complement
arithmetic, which makes us reduce the product modulo 2^32:
$ echo '((16843010 * 255) % (2^32)) % 65535' |bc
254
You see we have a different number.
And that is the bug in the old version of GNU sum.
Depending on the width of a long, it would output different results.
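The same comparison can be written in C. This is only a sketch: it assumes the old accumulator behaves like a uint32_t (wrapping modulo 2^32) on the affected systems, and both function names are made up for illustration rather than taken from sum.c.

```c
#include <stdint.h>

/* First field of `sum -s' for N bytes of value 0xff when the byte sum
   is kept in a wide (64-bit) accumulator.  Hypothetical helper.  */
uint32_t
first_field_wide (uint64_t n)
{
  uint64_t sum = n * 255;
  return (uint32_t) (sum % 0xffff);
}

/* The same, but with the byte sum truncated to 32 bits first, as
   happens when a 32-bit accumulator wraps.  Hypothetical helper.  */
uint32_t
first_field_32bit (uint64_t n)
{
  uint32_t wrapped = (uint32_t) (n * 255);      /* keep the low 32 bits */
  return wrapped % 0xffff;
}
```

For n = 16843009 both functions return 0. For n = 16843010, first_field_wide returns 255 and first_field_32bit returns 254, the same discrepancy bc shows.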
The new version uses the code you include below to reduce the
sum modulo 0xffff, so the problem with overflow cannot arise.
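Why the fold is safe: 2^16 is congruent to 1 modulo 0xffff, so adding the high half of the accumulator to its low half leaves the residue unchanged while making the value small again. The following is a rough sketch of that idea, not the actual loop in sum.c; the chunk size, the function name, and the choice to feed it N copies of one byte value are all assumptions made for illustration.

```c
#include <stdint.h>

enum { CHUNK = 8192 };  /* hypothetical read size, not taken from sum.c */

/* Return (N * BYTE) mod 0xffff using only a 32-bit accumulator,
   folding after every chunk so the accumulator never wraps.  */
uint32_t
folded_sum_mod_ffff (unsigned char byte, uint64_t n)
{
  uint32_t checksum = 0;

  while (n > 0)
    {
      uint64_t chunk = n < CHUNK ? n : CHUNK;

      for (uint64_t i = 0; i < chunk; i++)
        checksum += byte;

      /* 2^16 == 1 (mod 0xffff), so this preserves the residue while
         bringing the value back below 2^17.  */
      checksum = (checksum & 0xffff) + (checksum >> 16);
      n -= chunk;
    }

  return checksum % 0xffff;
}
```

With this, folded_sum_mod_ffff (0xff, 16843010) returns 255, in agreement with the first bc computation, even though the accumulator is only 32 bits wide.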
To demonstrate that the new sum works as described above:
$ perl -e 'while (1) {print chr(255) x 300}' |head --bytes=16843010 |sum -s
255 32897
"nick lawes" <address@hidden> wrote:
> I've been looking into a problem that has surfaced on our systems, and
> it turns out that the problem is in the gnu 'sum' utility as shipped
> with RedHat 7.1.
>
> I realise that they have annoyingly shipped an alpha version of
> textutils, but as the problem will become official when this version
> gets released, I felt I should point it out.
>
> The problem is the addition of the line:
>
> /* Reduce checksum mod 0xffff, to avoid overflow. */
> checksum = (checksum & 0xffff) + (checksum >> 16);
>
> Adding (checksum >> 16) makes the number returned for large files (e.g.
> 38MB) incompatible with earlier gnu sums and with system V sum that it
> claims to be compatible with.
>
> I can get around the problem for now by using an older version of sum,
> but this problem will no doubt bite many people when it's released...