[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#7357: csplit: memory exhausted when using stdout / pipe instead of a
From: |
Jim Meyering |
Subject: |
bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file |
Date: |
Thu, 11 Nov 2010 10:24:18 +0100 |
Paul Eggert wrote:
> On 11/10/10 10:49, Eric Blake wrote:
>> Actually, the next version of POSIX will require EOVERFLOW if printf is
>> directed to print more than INT_MAX bytes:
>
> Thanks, I didn't know that. But this issue is slightly different.
> For example, glibc (RHEL 5 x86-64) treats printf ("%4294967297d", 0)
> as if it were printf ("%1d", 0), and returns 1 without setting
> EOVERFLOW. I expect that this bug is common, and our code should be
> robust when it happens.
>
> There are other printf problems in this area too, I'm afraid. max_out
> does not check for size_t overflow and can mess up big-time when it
> happens. And get_format_flags does not reject flag combinations that
> lead to undefined behavior. There are other issues with size_t overflow.
> It's no doubt possible to get a buffer overflow on some platforms.
>
> I scouted through the code looking for undefined behavior with printf
> formats in csplit and fixed everything I found. I propose the
> following further patch to address them.
>
>>From 2b1f077d355b56b49c82f4f5223ee76a4f1b3396 Mon Sep 17 00:00:00 2001
> From: Paul Eggert <address@hidden>
> Date: Wed, 10 Nov 2010 20:34:52 -0800
> Subject: [PATCH] csplit: do not rely on undefined behavior in printf formats
>
> * doc/coreutils.texi (csplit invocation): Say that %d and %i are
> aliases for %u.
> * src/csplit.c (FLAG_THOUSANDS, FLAG_ALTERNATIVE): New constants.
> (get_format_flags): Now take char const * and int * and return
> size_t. It now stores info about the flags instead of merely
> scanning them. Also, it handles '0' correctly. Drop support for
> the undocumented '+' and ' ' flags since the value is unsigned.
> Add support for the (undocumented) "'" flag. All uses changed.
> (get_format_width, get_format_prec): Remove.
> (check_format_conv_type): Renamed from get_format_conv_type, with
> a different signature. It now converts the format to one that is
> compatible with unsigned int, and checks flags. All uses changed.
> (max_out): Have snprintf compute the number of bytes needed rather
> than attempting to do it ourselves (which doesn't work portably
> with outlandish formats such as %4294967296d).
> (check_format_conv_type, main): Check for overflow in size
> calculations. Don't assume size_t fits in unsigned int.
> * tests/misc/csplit: Check for proper handling of flags, with
> %0#6.3x. Coreutils 8.6 mishandles this somewhat-weird example.
...
Thanks for the nice patch.
I will push it shortly.
Note however, that using glibc's snprintf results in
potentially large heap allocation, up to 2GiB:
$ seq 1000 | valgrind src/csplit -b %2294967296d - '/./' '{*}'
==10741== Memcheck, a memory error detector
==10741== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==10741== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==10741== Command: src/csplit -b %2294967296d - /./ {*}
==10741==
==10741== Warning: silly arg (-1999999968) to malloc()
src/csplit: memory exhausted
That happens here, where format is "%2294967296d":
1292 int maxlen = snprintf (NULL, 0, format, UINT_MAX);
1293 if (! (0 <= maxlen && maxlen <= SIZE_MAX))
1294 xalloc_die ();
snprintf tries to allocate space in which to format the result
While testing this, I ran this:
echo a|src/csplit -b %02147483647d - '/./' '{*}' > k 2> err
and it took a long time. Understandable when you realize it is
generating a 4GiB-long diagnostic, all on one line:
src/csplit: <2GiB of spaces>xx<2GiB of 0's>: File name too long
This was on a Fedora 14 x86_64 system.
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, (continued)
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Pádraig Brady, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Jim Meyering, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Pádraig Brady, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Jim Meyering, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Pádraig Brady, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Bob Proulx, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Jim Meyering, 2010/11/11
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Paul Eggert, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Eric Blake, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Paul Eggert, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file,
Jim Meyering <=
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Voelker, Bernhard, 2010/11/11
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Jim Meyering, 2010/11/11
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Jim Meyering, 2010/11/10
- bug#7357: csplit: memory exhausted when using stdout / pipe instead of a file, Jim Meyering, 2010/11/10