[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: float precision

From: Paul Kienzle
Subject: Re: float precision
Date: Thu, 16 Jan 2003 01:15:53 -0500
User-agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.2.1) Gecko/20021130

Takayuki Ishida wrote:

How can I calculate everything in float precision?

You can't without a lot of effort.

One approach would be to convert all the code which currently
uses double precision to use single precision.  Unless you can
do this mechanically, I would not recommend it since octave is
changing so rapidly that maintaining such a patch would be a

Another approach would be to replace double in the octave code base with Real as appropriate, and have a typedef which defines Real as either double or float (or long double?) depending on a compile time option. This patch has a better chance of being accepted into octave, so reduces your work in the long run.

Some code (e.g., FFTW) needs to be compiled with the correct precision. You will need to use statically linked versions
of this code since it cannot simultaneously support different
precisions.  This may complicate the build process.

Other code needs to call different function names for different precisions. E.g., SGEMM and CGEMM rather than DGEMM and ZGEMM in LAPACK. You could define a set of include files e.g., lapack.h, to handle these:
   inline int xgemm(..., const double* A, ...) {
        return F77_XFCN (dgemm, DGEMM, (..., A, ...);
   inline int xgemm(..., const complex<double>* A, ...) {
        return F77_XFCN (zgemm, ZGEMM, (..., A, ...);
   inline int xgemm(..., const float* A, ...) {
        return F77_XFCN (sgemm, SGEMM, (..., A, ...);
   inline int xgemm(..., const complex<float>* A, ...) {
       return F77_XFCN (cgemm, CGEMM, (..., A, ...);
Then in {d,Cplx} you just need to call xgemm
rather than the specific gemm function for that type. I
hope the compiler is clever enough that you do not need
to link against the single precision blas if you are only
using the double precision blas.

Hmmm..., if you define enough of these sorts of things
then the code in CplxMatrix and dMatrix will be
identical --- maybe we want a template class Matrix<T> with
        typedef Matrix<Real> RealMatrix
        typedef Matrix<Complex> ComplexMatrix

Still other code may be hard-coded for double precision.  These
cases will need a little research to resolve.  You may be able
to use compiler flags to force single precision.  You may be
able to mechanically transform the fortran (e.g., real*8 -> real*4,
double precision -> real*4).  If you need to do the translation
by hand, I would suggest using a naming scheme like lapack's to
separate the versions with different precision.

Or you may want to leave the calculations in double precision. For example, the gamma function which is defined in libcruft/ slatec-fn/dgamma.f is currently exposed in liboctave/ as double xgamma(double). You could add the following to liboctave/lo-specfun.h:
  inline float xgamma(float x) { return float(xgamma(double(x))); }
and then you would be able to call it with Real defined as float
or double.

The third approach is to add single matrix and single complex
matrix types as user defined types.  This won't require any
changes to octave, and you can implement only as much of it as you need. Using dispatch from octave-forge, you can overload
the builtin functions like fft so that they call your own
version of fft which supports float rather than complex numbers.

If you find yourself duplicating much of dMatrix and CplxMatrix for this work, it would be in your best interests to patch octave
so that Matrix is defined as a template.

For compatibility, approach #3 is the correct approach.  For
controlling complexity, I think approach #2 is better.  The
question is how important it is to be able to do some parts of
the calculation using double precision and the rest using
single precision?  Is it rare enough that those who really
need it could define a DLD-FUNCTION which implements the
double portion while the rest of octave operates in single

Paul Kienzle

Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:
How to fund new projects:
Subscription information:

reply via email to

[Prev in Thread] Current Thread [Next in Thread]