lzip-bug
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Lzip-bug] Version 0.1 of plzip released


From: John Reiser
Subject: Re: [Lzip-bug] Version 0.1 of plzip released
Date: Wed, 09 Dec 2009 12:53:21 -0800
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4pre) Gecko/20091014 Fedora/3.0-2.8.b4.fc11 Thunderbird/3.0b4

On 12/08/2009 04:56 PM, Ersek Laszlo wrote:
On 07 Dec 2009, John Reiser wrote:

On 12/07/2009 12:07 AM, Tino Lange wrote:
plzip.cc:601: warning: comparison is always false due to limited
range of data type
main.cc:1101: warning: comparison is always false due to limited
range of data type
Are they serious?

Yes, they are true bugs.
Using "(size_t)-1" instead of INT32_MAX or INT64_MAX (from<stdint.h>)
is an outright error.

Hi, I wrote llzip. I'm not trying to hijack this thread, but since the
code in question is from llzip-0.03, and consequently, lbzip2, I'd like
to add a few notes.


main.cc:1101: warning: comparison is always false due to limited
range of data type

This is not a bug. ...
   [Long detailed argument snipped for brevity]


Version 0.1 of plzip uses a ./configure script to generate ./Makefile.
The configure could pre-compute the various sizeof(<type>) that are involved
in the overflow checks [actually, guarding against resource exhaustion]
(such as determining target class LP64 ['long' and pointer are 64-bit,
'int' is 32-bit]), or such checks could be made explicitly in the *.cc
files.  Either way, any test which deals with size-based overflow, and whose
results are pre-determined solely on the basis of the types involved, should
take advantage of those facts. If constant values that result from analysis
of types and type ranges are not acknowledged and/or not propagated, then
any resulting compiler warnings are maintenance bugs.  It is a bug that
someone who compiles the software should have to consider, "Is the compiler
warning valid?  Should I ignore the compiler warning?"  The absolute minimum
requirement is a section in a README or RELEASE-NOTES file which lists
the compiler warnings and explains them.  It is much better to avoid the
warnings entirely.

Turning to the actual code main.cc [with line numbers of plzip-0.1]:
   1091   long max_worker = sysconf(_SC_THREAD_THREADS_MAX);
   1092   if (-1L == max_worker) {
   1093     max_worker = LONG_MAX;
   1094   }
   1095   if (UINT_MAX < (long unsigned)max_worker) {
   1096     max_worker = UINT_MAX;
   1097   }
   1098   if (UINT_MAX / blf < (unsigned)max_worker) {
   1099     max_worker = UINT_MAX / blf;
   1100   }
   1101   if ((size_t)-1 / sizeof(pthread_t) < (unsigned)max_worker) {
   1102     max_worker = (size_t)-1 / sizeof(pthread_t);
   1103   }

This part of code establishes an absolute upper limit on the number of
worker threads.  ...   The value of max_worker can
only decrease, starting with line 1095, and the safety of any
conditional itself is provided for by the earlier parts.

On the usual x86_64, which has 64-bit 'long' and 32-bit 'unsigned' and
64-bit 'size_t', then the test on line 1101 always is false, as warned
by gcc-4.4.1 and others.  The left-hand side "(size_t)-1 / sizeof(pthread_t)"
has the compile-time constant value 0x1ffffffffffffffful, and the maximum value
of any 'unsigned' on the right-hand side is 0xffffffffu.  The test never
can succeed, regardless of the runtime value of max_worker, and this fact
is known at compile time and at programming time.  Therefore lines 1101-1103
should be removed by conditional compilation in the usual x86_64 environment.
Not doing so is a maintenance bug because common compilers warn,
and dealing with the warning wastes people time.

I would welcome an explicit initialized constant such as
   size_t const SIZE_T_MAX = ~(size_t)0;
   size_t const SIZE_T_MAX(~(size_t)0);   // perhaps better for C++
and then use SIZE_T_MAX as appropriate instead of "(size_t)-1".
I have programmed for some one's-complement machines, where
-1 ==> 0xff...ffE  which is not the same as
~0 ==> 0xff...ffF .


About the same is going on around plzip.cc:601.

No, this is worse.  Included below are explicit numerical counterexamples
which demonstrate bugs in programming logic.

The source line is
  601  SIZES(S2w_blk, plain, 2u * compr_lev[clidx].dict_size, split);
and the expanded macro begins ["gcc -E plzip.cc", pretty-printed]
   X1  do {
   X2    unsigned tmp; tmp = 2u * compr_lev[clidx].dict_size;
   X3    if ((size_t)-1 < tmp) {
   X4      show_error( "size_t overflow in sizeof_" "plain" "." ); fatal()
   X5    }

With 64-bit 'size_t' and with 32-bit 'unsigned', then the test on line X3
is essentially the same as discussed above.  Thus it should be removed
by conditional compilation in order to avoid a maintenance bug.

On a machine with 32-bit 'size-t' and 32-bit 'unsigned' (such as usual i686),
then things are worse. The types of both operands to the comparison on line
X3 are effectively the same (namely: 'unsigned'), and it is impossible for
the runtime value any variable such as 'tmp', to exceed the maximum value
for that type.  This is known at compile time and at programming time,
so again conditional compilation should remove the test.
That's the first problem.

Here's the second problem.  If (0x80000000u <= compr_lev[clidx].dict_size)
then the computation of
   unsigned tmp = 2u * compr_lev[clidx].dict_size;
suffers an overflow.  The value is *undefined*, and all subsequent
computation with that value is meaningless.  On most such machines
the overflow is ignored and the value happens to be the low-order 32-bits
of the infinite-width result.  In particular if .dict_size is
slightly more than 0x80000000u, then twice the value will be a small
positive [even] integer, and the comparison on line X3 will not
correspond to the message on line X4.  That's a logic bug.  Testing for
size-based resource exhaustion must avoid arithmetic overflow.  One of
the operands to the comparison should be a variable by itself; all the
constants should be folded into the other operand, the one which contains
the maximum value for the type.  So a better test is
   if (SIZE_T_MAX/2 < compr_lev[clidx].dict_size)
Notice that encapsulating this into a general macro often is tedious.
That is a good warning!  Avoid such macros; they tend to cause problems.

Careful readers may suggest that it is impossible for
  (0x80000000u <= compr_lev[clidx].dict_size)
because of resource restrictions.  However, a machine with 64-bit
'long' and pointers, but only 32-bit 'unsigned' and size_t, is legal;
and on such a machine the bad condition can arise easily.
(Similar to some environments for 8086 with LP32, I16, 'size_t' 16.)
Also, it is possible that previous failure of malloc() or 'new' or 'new []'
might have been ignored, leading to inconsistencies where otherwise-
redundant checks are useful.

There are more problems.
  607  SIZES(W2m_blk, compr, (4u + 1u + 1u)
  608      + ((unsigned)split_arg.sizeof_plain * 9u + 7u) / 8u + (4u + 8u + 8u),
  609      work);
Again, testing for size-based resource exhaustion must avoid arithmetic
overflow.  In this case "((unsigned)split_arg.sizeof_plain * 9u + 7u) / 8u"
is particularly suspect.  If (.sizeof_plain > 0x1c71c71c) then the 
multiplication
by 9u suffers an overflow, and the subsequent division by 8u gives total 
garbage.
It not immediately obvious that resource constraints can prevent
(.sizeof_plain > 0x1c71c71c).  The comparison of line X3 should be
between the uncombined member split_arg.sizeof_plain and some compile-time
constant expression.

One new problem is the "bare" constants such as in "* 9u + 7u) / 8u",
"(4u + 1u + 1u)", and "(4u + 8u + 8u)".  What is the source
of all those numerical values?  Is 'sizeof' involved in any of them?
In the definition of the SIZES macro itself:
  590          < arg ## _arg . sizeof_ ## arr - (size_t)1) { \

  595          + (arg ## _arg . sizeof_ ## arr - (size_t)1); \
the "(size_t)1" is really "sizeof(S2w_blk.plain)" or
"sizeof(W2m_blk.compr)".  The commonality of the '1' in SIZES with the
'1' in "S2w_blk.plain[1]" or in "W2m_blk.compr[1]" has been lost.
The definition of the SIZES macro has created a bad situation for maintenance.


In this case, careful analysis of these warnings from gcc reveals true bugs,
particularly the arithmetic overflows in testing for resource exhaustion.
Having to write out "UINT32_MAX" or "UINT64_MAX" [or "SIZE_T_MAX" after
   size_t const SIZE_T_MAX = ~(size_t)0;
] can provide a hint during programming that likely there are problems
to be avoided.

--




reply via email to

[Prev in Thread] Current Thread [Next in Thread]