
strange problem


From: John W. Eaton
Subject: strange problem
Date: Fri, 28 Nov 2003 14:06:47 -0600

On 27-Nov-2003, Gerald Ebberink <address@hidden> wrote:

| I have a strange problem with a simple matrix multiplication
| and maybe you could help out (and maybe it was asked a dozen times 
| before, but I could not find it in the archives)
| 
| when I give the following command:
|   octave:1> [1,100;0,1]*[1,0;(-1/100),1]
| 
| I get this result
| 
| ans =
| 
|     -2.0817e-17    1.0000e+02
|     -1.0000e-02    1.0000e+00
| 
| which as far as I know is correct except for the upper left value.

If you look at the details of the floating point arithmetic
operations, then I think that each piece of the calculation is being
done correctly, but it doesn't produce the result you expect.

| If you do it by hand you would get something like
| 1 + (100*-1/100) = 0
| 
| octave agrees with me on that part
| 
| octave:2> 1 + (100*-1/100)
| ans = 0

This calculation is done differently than your original example.

| so how does this strange number get up there?

There has been a lot of guessing in this thread, which has probably
not helped too much.  Since the source code of Octave is available,
there is no need to guess -- we can look and see precisely what is
happening.

| and even better, is there a solution?  Because this gives me an
| unworkable result in the end.

As others have mentioned, if your algorithms fail due to the
inaccuracies of floating point storage and arithmetic, then you
probably need to modify the algorithm.
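
(As one concrete illustration of that, and not something from the
original thread: if a later step fails because an entry that should
be zero is merely tiny, you can compare against a tolerance instead
of testing for exact zero, e.g.

  octave:1> x = [1,100;0,1]*[1,0;(-1/100),1];
  octave:2> x(abs (x) < eps) = 0;   # treat anything smaller than eps as zero

though the right tolerance depends on the scale of your problem.)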

| I've used
| GNU Octave, version 2.1.49 (i686-pc-linux-gnu).
                                   ^^
I think this detail turns out to be important: on these systems,
floating point arithmetic can be done in 80-bit registers, and the
quick answer is that this, combined with smart compilers, is what is
causing the problem you see.

First, an aside about floating point representation of decimal
numbers:

It is not possible to represent -1/100 exactly in binary floating point:

  octave:1> format bit
  octave:2> -1/100
  ans = 1011111110000100011110101110000101000111101011100001010001111011
        seeeeeeeeeeeffffffffffffffffffffffffffffffffffffffffffffffffffff

In the binary representation above, s = sign bit, e = exponent bits,
f = fraction bits.  This is equivalent to the decimal number

  (-1)^s * 2^(E-1023) * (1 + M/2^52)

with

  E = bin2dec ("eeeeeeeeeee")

and

  M = bin2dec ("ffffffffffffffffffffffffffffffffffffffffffffffffffff")

or, for the specific example above

  (-1)^1 * 2^(bin2dec("01111111000")-1023)
  * (1+bin2dec("0100011110101110000101000111101011100001010001111011")/2^52)

which is approximately (but not exactly) -0.01.

OTOH, it is possible to represent something like -1/2 or -1/64 exactly:

  octave:3> -1/64
  ans = 1011111110010000000000000000000000000000000000000000000000000000
        seeeeeeeeeeeffffffffffffffffffffffffffffffffffffffffffffffffffff

In this case, we have

  (-1)^1*2^(bin2dec("01111111001")-1023)*(1+bin2dec("0")/2^52)

or -0.015625 (which is also exactly 1/64).
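
A quick way to check the formula for yourself (the variable names
here are mine) is to let Octave decode the bit string printed above:

  octave:4> format short
  octave:5> bits = "1011111110000100011110101110000101000111101011100001010001111011";
  octave:6> s = bin2dec (bits(1));       # sign bit
  octave:7> E = bin2dec (bits(2:12));    # 11 exponent bits
  octave:8> M = bin2dec (bits(13:64));   # 52 fraction bits
  octave:9> (-1)^s * 2^(E-1023) * (1 + M/2^52)
  ans = -0.010000

which is the (inexact) stored value displayed back in decimal.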

OK, now some more details about the matrix multiplication problem.

Octave uses dgemm from the BLAS to compute matrix multiplications.  No
LU decomposition (I'm still trying to figure out how that would be
helpful for a matrix multiplication) or other magic.

I can verify the result that you received with Octave using the
following small Fortran program and the Fortran version of dgemm:

      program foo
      double precision a(2,2), b(2,2), c(2,2)
      a(1,1) = 1.0d0
      a(1,2) = 100.0d0
      a(2,1) = 0.0d0
      a(2,2) = 1.0d0
      b(1,1) = 1.0d0
      b(1,2) = 0.0d0
      b(2,1) = -1.0d0/100.0d0
      b(2,2) = 1.0d0
      call dgemm ('n', 'n', 2, 2, 2, 1.0d0, a, 2, b, 2, 0.0d0, c, 2)
      print *, c(1,1), c(1,2)
      print *, c(2,1), c(2,2)
      end

When I run this on an x86 system, I see the following results:

  -2.08166817E-17  100.
  -0.01  1.

so the problem (if there is one) is really in dgemm, not Octave.

Looking at dgemm, the code for multiplying two matrices without
transposing them is essentially

            DO 90, J = 1, N
               DO 50, I = 1, M
                  C( I, J ) = ZERO
   50          CONTINUE
               DO 80, L = 1, K
                  DO 70, I = 1, M
                     C( I, J ) = C( I, J ) + B( L, J )*A( I, L )
   70             CONTINUE
   80          CONTINUE
   90       CONTINUE

So instead of calling dgemm, we should be able to examine the
following program to see what is going on

      program foo
      double precision a(2,2), b(2,2), c(2,2)
      double precision zero, temp
      integer i, j, k, l, m, n
      n = 2
      m = 2
      k = 2
      zero = 0.0d0
      a(1,1) = 1.0d0
      a(1,2) = 100.0d0
      a(2,1) = 0.0d0
      a(2,2) = 1.0d0
      b(1,1) = 1.0d0
      b(1,2) = 0.0d0
      b(2,1) = -1.0d0/100.0d0
      b(2,2) = 1.0d0
      DO 90, J = 1, N
         DO 50, I = 1, M
            C( I, J ) = ZERO
   50    CONTINUE
         DO 80, L = 1, K
            DO 70, I = 1, M
               print *, 'c(', i, ',', j, ') = ',
     $             c(i,j), ' + ', b(l,j), ' * ', a(i,l)
               C( I, J ) = C( I, J ) + B( L, J )*A( I, L )
               print *, '   = ', c(i,j)
   70       CONTINUE
   80    CONTINUE
   90 CONTINUE
      print *, c(1,1), c(1,2)
      print *, c(2,1), c(2,2)
      end

(I added the intermediate print statements so we could look at what is
happening at each step.)

Now the final results are the same as before, but we also see the
following strange intermediate output:

 c( 1, 1) =   0. +   1. *   1.
    =   1.
 c( 2, 1) =   0. +   1. *   0.
    =   0.
 c( 1, 1) =   1. +  -0.01 *   100.
    =  -2.08166817E-17
 ...

How can this happen?

Some hardware like the 8087 and numerical processors that followed
have 80-bit extended precision registers and can do calculations using
this 80-bit format.  The extended precision is usually helpful, but
sometimes it can cause trouble.

When you do a calculation like

  x = a + b*c

on hardware with extended precision registers, all the calculations on
the RHS may be done in extended precision.  Only when the result is
stored is it truncated to a 64-bit representation.  However, this is
not true for Octave, because all temporaries are stored.  So when you
calculate

  result = 1 + (100*-1/100)

this is the equivalent of

  t1 = -1;
  t2 = t1 / 100;
  t3 = 100 * t2;
  result = 1 + t3;

at each step, the intermediate results will be stored in a 64-bit
representation.

However, when you do a calculation like

  [1,100;0,1]*[1,0;(-1/100),1]

Octave calls dgemm from the BLAS, which does all the computations in
Fortran (or perhaps C or assembly if you are using ATLAS or some other
vendor-supplied version of the BLAS) and which has the opportunity to
use extended precision.
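
Incidentally, you don't need the full 2x2 product to see this: the
troublesome entry c(1,1) is just the inner product of the first row
of the first matrix with the first column of the second, and that
also goes through the BLAS.  On the same sort of x86 setup I would
expect

  octave:1> [1,100] * [1;(-1/100)]
  ans = -2.0817e-17

to show the same residual.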

So, in the calculation that resulted in

 c( 1, 1) =   1. +  -0.01 *   100.
    =  -2.08166817E-17

the value -0.01 is only displayed this way; its value is really not
precisely -0.01, and when it is converted to extended precision and
multiplied by 100, the result is not precisely -1.  So adding it to 1
(again, in extended precision mode) produces a result that is not zero,
and it happens that the difference shows up not only in the extra
bits, but also in one of the mantissa bits that remain after
converting back to the 64-bit representation that is eventually
returned to Octave.
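
If you want exact numbers for this (my arithmetic, but easy to
verify): the double closest to -1/100 is -5764607523034235/2^59.
Multiplying by 100 gives exactly -(1 + 12/2^59).  Rounded to 64 bits
that is exactly -1, so adding 1 gives 0; kept in an 80-bit register,
the extra bits survive, and adding 1 leaves -12/2^59:

  octave:1> printf ("%.0f\n", (-1/100) * 2^59)  # exact; scaling by 2^59 only shifts the exponent
  -5764607523034235
  octave:2> -12 * 2^-59
  ans = -2.0817e-17

which matches the -2.08166817E-17 printed by the Fortran program
above.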

Sometimes a compiler can be smart and eliminate temporaries, and get
extended precision even if you explicitly write a series of
calculations like the above.

GCC (including g77 and g++) has an option, -ffloat-store, to force
intermediate values to be stored in memory rather than in extended
precision registers, but this only affects values that are explicitly
assigned to variables.  So if you write

  r = a + b*c

the intermediate result of b*c may still be stored in an extended
precision register.

Unfortunately, -ffloat-store can significantly reduce the speed of
your computations because it forces data to be moved from on-chip
registers to memory.  (It would be nice if you could simply request
that the hardware not use the extended precision, but as far as I
know, that is not possible on most if not all 8087-like processors).

Compiling the Fortran code above on an x86 system with -O
-ffloat-store doesn't change things.  But rearranging the innermost
loop slightly to be

            DO 70, I = 1, M
               temp = B( L, J )*A( I, L )
               C( I, J ) = C( I, J ) + temp
   70       CONTINUE

*and* compiling with -O -ffloat-store (or no optimization at all, but
that is probably not what you really want to do), produces the results
you expect:

   0.  100.
  -0.01  1.

But we probably can't get everyone to agree that this is a good
solution, and even if you do this, you will probably still find some
other calculations that produce results that don't match precisely
what you would get with exact decimal arithmetic.

jwe


