
Re: Resampling radio data


From: Marcus Müller
Subject: Re: Resampling radio data
Date: Wed, 17 Feb 2021 18:02:50 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0

Hi Brian,

On 17.02.21 16:55, Brian Padalino wrote:
> On Wed, Feb 17, 2021 at 10:05 AM Marcus Müller <mueller@kit.edu 
> <mailto:mueller@kit.edu>>
> wrote:
> 
>     Oh, sorry, didn't mean to imply that! FFT interpolation might work well 
> (if you can live
>     with the sinc sidelobes).
> 
> 
> If the bandwidth is already constrained to 20MHz/23MHz, then there would be 
> no sidelobes -
> correct?

What the FFT resampler does is, in the end, interpolation with 23 sincs, each
at the "original" spectral position. These sincs' sidelobes of course extend
left and right beyond the 23 original "carriers" (thinking of OFDM here)!

The question here is whether that matters: if the bandwidth of interest is -10
to +10 MHz, what do I care about what happens beyond that? The answer to that
question lies in Mark's application :)
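To make the mechanics concrete, here's a minimal NumPy sketch of what
frequency-domain rational resampling does (the function and the 23 → 20
example are my own illustration, not Mark's flowgraph): trim or zero-pad the
spectrum, then transform back.

```python
import numpy as np

def fft_resample(x, p, q):
    """Rational p/q resampling via the frequency domain (sketch only).

    Assumes len(x) is a multiple of q and the signal is band-limited to
    the narrower of the two rates; an even-length Nyquist bin is not
    split here, which a production resampler would handle.
    """
    n = len(x)
    m = n * p // q
    X = np.fft.fftshift(np.fft.fft(x))        # spectrum, DC in the middle
    if m < n:                                 # downsample: keep central m bins
        X = X[(n - m) // 2 : (n + m) // 2]
    else:                                     # upsample: zero-pad the edges
        X = np.pad(X, ((m - n) // 2, (m - n + 1) // 2))
    # m/n rescales so a unit-amplitude tone stays unit amplitude
    return np.fft.ifft(np.fft.ifftshift(X)) * (m / n)

# 23 MS/s -> 20 MS/s: a 1 MHz tone, 46 samples in, 40 samples out
x = np.exp(2j * np.pi * 1e6 * np.arange(46) / 23e6)
y = fft_resample(x, 20, 23)
print(len(y))          # 40
```

This is exactly "interpolation with sincs" in disguise: each kept bin is a
sinc in the other domain, sidelobes included.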

>     I do have a question, though: Why do you go for a 4600-FFT? why not 
> simply a 23 FFT?
> 
> Larger n yields greater efficiencies, right?  Doing lots of small calls isn't
> necessarily as efficient as doing fewer large calls, so long as you can
> handle the latency and the processor can stay fed with data.  I figure ~4000
> samples is a good compromise, but maybe larger ones work out better, too.

Well, a quick calculation, acting as if O-notation were actually meaningful on
hardware (see David's answer, he's got a point about caches: on my CPU, a
32×32 bit multiplication instruction (non-SIMD) takes about 1/400 of a fetch
from a random position in RAM (a cache miss), on average, IIRC).

If we've got M samples to process, and we're doing transforms of length N_1 or
N_2 = 100·N_1, the complexity of the first is

a = O( (M/N_1) · N_1·log(N_1) ) = O( M · log(N_1) )

and of the second it'd be

b = O( (M/(100·N_1)) · 100·N_1·log(100·N_1) ) = O( M · log(100·N_1) )
  = O( M · log(N_1) + M · log(100) )

which is worse, but:

> 
> For reference, here's some performance benchmarks for FFTW (randomly chosen 
> processor):
> 
>   http://www.fftw.org/speed/Pentium4-3.06GHz/ 
> <http://www.fftw.org/speed/Pentium4-3.06GHz/>
> 
> Single precision, non powers of 2, 1d transforms are some good benchmarks to 
> look at for
> something like this.  Obviously some numbers are better than others, but the 
> general trend
> seems to skew towards the mid-thousands as being a decent number to target 
> for maximizing
> throughput.

Real-world data, and I fully agree, always trumps O-notation (which is usually
a poor model for real-world performance anyway).
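If you want to reproduce that kind of size comparison without FFTW's own
benchmark suite, a crude NumPy timing sketch (my own, illustrative only; real
numbers depend on planner choices, SIMD, and cache behaviour) looks like this:

```python
import time
import numpy as np

def throughput(fft_len, total=2**22):
    """Rough samples/second for back-to-back FFTs of length fft_len.

    Wall-clock only; includes no FFTW-style planning, so treat the
    absolute numbers with suspicion and compare relative values.
    """
    n_transforms = max(total // fft_len, 1)
    x = (np.random.randn(n_transforms, fft_len)
         + 1j * np.random.randn(n_transforms, fft_len)).astype(np.complex64)
    t0 = time.perf_counter()
    np.fft.fft(x, axis=1)          # one batched call; NumPy loops internally
    dt = time.perf_counter() - t0
    return n_transforms * fft_len / dt

# a "tiny" 46-point FFT vs. the mid-thousands regime Brian suggests
for n in (46, 4600):
    print(f"N = {n:>5}: {throughput(n) / 1e6:6.1f} MS/s")
```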

Luckily, I've had this same problem before, so... See attachment!

You should be able to call it like

fft_rate -l 46

to set the FFT length to 46; add a -n 10 to make the FFT use 10 threads (which
usually hurts more than it helps). (You need to multiply the reported rate by
the FFT length; it counts transforms, not samples.)

The single-threaded 2²⁴-FFT I do runs at ca. 34.5 MS/s throughput. If I run

fft_rate -l $(( 23*2*100 ))

I get roughly 87.4 MS/s.
If I run

fft_rate -l $(( 23*2 ))

I get ca. 92 MS/s.

I'd call this a clear "meh" result!
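For anyone without the attached flowgraph handy, a roughly analogous
thread-count experiment can be run with SciPy's pocketfft backend, whose
`workers` parameter parallelizes across the batch dimension (this is my own
sketch, not the attached fft_rate script, so expect different numbers):

```python
import time
import numpy as np
from scipy import fft as sfft

def rate(fft_len, workers, total=2**21):
    """MS/s for a batch of FFTs, split over `workers` threads by pocketfft."""
    batch = max(total // fft_len, 1)
    x = (np.random.randn(batch, fft_len)
         + 1j * np.random.randn(batch, fft_len)).astype(np.complex64)
    t0 = time.perf_counter()
    sfft.fft(x, axis=1, workers=workers)
    dt = time.perf_counter() - t0
    return batch * fft_len / dt / 1e6

# whether 10 threads help or hurt depends heavily on FFT length and CPU
for w in (1, 10):
    print(f"N = 4600, workers = {w:2d}: {rate(4600, w):6.1f} MS/s")
```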

Best regards,
Marcus


Attachment: fft_rate.grc
Description: application/gnuradio-grc

Attachment: fft_rate.png
Description: PNG image

Attachment: fft_rate.py
Description: Text Data

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature

