Re: pb with -ffast-math
From: Alain Baeckeroot
Subject: Re: pb with -ffast-math
Date: Fri, 17 Apr 2009 09:04:47 +0200
User-agent: KMail/1.9.9
On 17/04/2009 at 07:07, Jaroslav Hajek wrote:
>
> On Thu, Apr 16, 2009 at 9:51 PM, Alain Baeckeroot
> <address@hidden> wrote:
...
> > Everything is vectorised, except one for-loop (iteration over time) which
> > takes 90% of the time!
> > We are going to write this short part in C in a .oct file.
> >
>
> Could you post the relevant piece of code? Maybe there's a vectorized
> way you didn't see, or one that only works with the development version,
I hope so :-). I'll try 3.1.55 (packaged in Debian experimental).
> or maybe it will be something worth a new function.
I cannot send the code, but I'll write a similar example as soon as
possible (within a few days).
Each line needs the result of the previous one.
There are no function calls.
Only arithmetic operations, and 5 tests (one max, one min, one >, one <)
and one 'if' at the beginning.
The loop looks like this:
N = 10000;
X = zeros(N,1);    % and likewise for Y, T1, ...
% x0, y0 and params are given
for k = 1:N
  if (k == 1)
    Y(k) = some arithmetic (x0, y0, params);
  else
    Y(k) = some arithmetic (X(k-1), Y(k-1), params);
  endif
  T1(k) = some arithmetic (X(k-1), Y(k), params);
  T2(k) = some arithmetic (T1(k), X(k-1), Y(k), params);
  T3(k) = (T2(k) > 0) * T2(k) + (T1(k) < 0) * T1(k);
  ...
  T20(k) = max (T19(k) * T19(k), T18(k) * T18(k));
  X(k) = arithmetic (T20(k), Y(k));
endfor
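Since this part is going to be rewritten in C, here is a minimal sketch of how the same loop pattern might look there. The arithmetic is invented purely for illustration (the real expressions are not shown here), and `params_t`, `run_recurrence` and the coefficients `a` and `b` are hypothetical names. It also assumes that x0 stands in for X(k-1) on the first iteration, which removes the 'if'.

```c
#include <stddef.h>

/* Hypothetical parameters: the real model's parameters are not given. */
typedef struct {
    double a;  /* placeholder coefficient */
    double b;  /* placeholder coefficient */
} params_t;

/*
 * Runs the sequential recurrence for n steps and fills X[0..n-1], Y[0..n-1].
 * Every step reads the X and Y values of the previous step, so the
 * iterations cannot be computed independently of each other.
 */
void run_recurrence(size_t n, double x0, double y0, params_t p,
                    double *X, double *Y)
{
    /* Assumption: x0 and y0 play the role of X(k-1), Y(k-1) at the first step. */
    double x_prev = x0;
    double y_prev = y0;

    for (size_t k = 0; k < n; ++k) {
        /* Placeholder for "some arithmetic": NOT the real formulas. */
        double y  = p.a * x_prev + p.b * y_prev;
        double t1 = y - x_prev;
        double t2 = t1 + p.b * x_prev;

        /* Same form as the T3 line: positive part of t2 plus negative part of t1. */
        double t3 = (t2 > 0.0 ? t2 : 0.0) + (t1 < 0.0 ? t1 : 0.0);

        /* Same form as the T20 line: max of two squares (t3 and t1 as stand-ins). */
        double s1 = t3 * t3;
        double s2 = t1 * t1;
        double t20 = (s1 > s2) ? s1 : s2;

        /* Placeholder for the final update of X. */
        double x = y + p.a * t20;

        X[k] = x;
        Y[k] = y;

        /* Carry the state over to the next iteration. */
        x_prev = x;
        y_prev = y;
    }
}
```

The caller allocates X and Y with at least n elements each. Keeping the previous X and Y in local variables means the intermediate T values never need to be stored as arrays, unless they are wanted for post-processing.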
I put some tic/toc calls inside the loop (I don't know how to profile
Octave code); no single place takes all the time.
Very rough order of magnitude: the computation runs at about 1 Mflops,
where we expect at least 10 Mflops on a Core 2 Duo.
(The vectorised pre- and post-processing are approximately 40 times faster.)
Regards.
Alain