help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: strange FFT timing


From: Sergei Steshenko
Subject: Re: strange FFT timing
Date: Sun, 20 Apr 2008 15:50:27 -0700 (PDT)

--- David Bateman <address@hidden> wrote:

> Przemek Klosowski wrote:
> >  I would expect that FFT would be fasterst for datasets of the size
> > that is a power of 2, but a quick test doesn't bear this out. I am
> > using Octave Version 3.0.0 on Fedora 8 (2.6.24) runing on a 2.1 GHz
> > Athlon XP 2800+ with 1GB of memory (no swap activity during the
> > calculation). The 1,000,000 element FFT is almost twice as fast as
> > 2^20 element FFT (see below for excerpt of the Octave session).
> > 
> > Anyone would venture what is going on? It could be cache aliasing
> > (this Athlon has a 512kB cache). In any case, FFTW seems to be very
> > good handling the non-power-of-2 sizes, and seems
> > to be optimized for power-of-ten sizes.
> > 
> > v=rand(2^20,1);  tic;fft(v);toc
> > Elapsed time is 0.339144 seconds.
> > Elapsed time is 0.290032 seconds.
> > Elapsed time is 0.283474 seconds.
> > Elapsed time is 0.291376 seconds.
> > Elapsed time is 0.285946 seconds.
> > v=rand(1e6,1);   tic;fft(v);toc
> > Elapsed time is 0.223884 seconds.
> > Elapsed time is 0.179272 seconds.
> > Elapsed time is 0.179442 seconds.
> > Elapsed time is 0.184591 seconds.
> > Elapsed time is 0.179012 seconds.
> > 
> > Similar results obtained with  v=rand(1e6,1);x=cputime;fft(v);cputime-x
> 
> At a guess I'd say SIMD memory alignment. With 16byte data alignment
> FFTW can use the SSE2/SSE3 instructions to speed things up. Maybe your
> 1e6 case just got lucky. I see the reverse behavior..
> 
> D.
> 
> _______________________________________________
> Help-octave mailing list
> address@hidden
> https://www.cae.wisc.edu/mailman/listinfo/help-octave
> 

On my machine - Athlon AM2 X2 4600+:

"
octave:1> v=rand(2^20,1);  tic;fft(v);toc
Elapsed time is 0.244031 seconds.
octave:2> tic;fft(v);toc
Elapsed time is 0.199732 seconds.
octave:3> tic;fft(v);toc
Elapsed time is 0.199396 seconds.
octave:4> tic;fft(v);toc
Elapsed time is 0.197449 seconds.
octave:5> tic;fft(v);toc
Elapsed time is 0.198605 seconds.
octave:6> v=rand(1e6,1);   tic;fft(v);toc
Elapsed time is 0.184027 seconds.
octave:7> tic;fft(v);toc
Elapsed time is 0.15657 seconds.
octave:8> tic;fft(v);toc
Elapsed time is 0.144034 seconds.
octave:9> tic;fft(v);toc
Elapsed time is 0.152095 seconds.
octave:10> tic;fft(v);toc
Elapsed time is 0.155661 seconds.
octave:11>
"

1e6 is faster than 2^20.

FFTW3 is with SSE2 support.

Regards,
  Sergei.

Applications From Scratch: http://appsfromscratch.berlios.de/


      
____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  
http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ


reply via email to

[Prev in Thread] Current Thread [Next in Thread]