From: Marcus Müller
Subject: Re: Resampling radio data
Date: Wed, 17 Feb 2021 18:02:50 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0
Hi Brian,

On 17.02.21 16:55, Brian Padalino wrote:
> On Wed, Feb 17, 2021 at 10:05 AM Marcus Müller <mueller@kit.edu
> <mailto:mueller@kit.edu>> wrote:
>
>     Oh, sorry, didn't mean to imply that! FFT interpolation might work well
>     (if you can live with the sinc sidelobes).
>
> If the bandwidth is already constrained to 20MHz/23MHz, then there would
> be no sidelobes - correct?

What the FFT resampler does is, in the end, interpolation with 23 sincs,
each at the "original" spectral position. These sincs' sidelobes of course
extend left and right beyond the 23 original "carriers" (thinking of OFDM
here)! The question here is whether that matters: if the bandwidth of
interest is -10 to +10 MHz, what do I care about what happens beyond that?
The answer to that question lies in Mark's application :)

>     I do have a question, though: Why do you go for a 4600-FFT? Why not
>     simply a 23-FFT?
>
> Larger N yields greater efficiencies, right? Doing lots of small calls
> isn't necessarily as efficient as doing fewer large calls, as long as you
> can handle the latency and the processor can stay fed with data. I figure
> ~4000 samples is a good compromise, but maybe larger ones work out better,
> too.

Well, quick calc, and acting as if O-notation were actually meaningful on
hardware (see David's answer, he's got a point about caches): on my CPU, a
32×32-bit multiplication instruction (non-SIMD) takes about 1/400 of the
time of fetching memory from random positions in RAM (cache misses), on
average, IIRC.
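(To make the "interpolation with 23 sincs" concrete, here's a minimal
sketch of FFT-domain resampling by 20/23 -- my own illustration, not the
attached flowgraph: forward-FFT a block whose length is a multiple of 23,
keep only the 20/23 lowest-|frequency| bins, and inverse-FFT the shorter
spectrum. The function name and block length are made up for the example.)

```python
import numpy as np

def fft_resample_23_to_20(x, n1=4600):
    """Resample complex baseband x by 20/23 via frequency-domain truncation.

    len(x) must be a multiple of n1, and n1 a multiple of 23.
    """
    assert n1 % 23 == 0 and len(x) % n1 == 0
    n2 = n1 // 23 * 20                    # output block length, multiple of 20
    out = []
    for blk in x.reshape(-1, n1):
        X = np.fft.fft(blk)
        # keep n2//2 positive-frequency and n2 - n2//2 negative-frequency bins
        Y = np.concatenate((X[:n2 // 2], X[-(n2 - n2 // 2):]))
        out.append(np.fft.ifft(Y) * (n2 / n1))   # rescale for the length change
    return np.concatenate(out)

# a pure tone inside the kept band passes through with unit magnitude
t = np.arange(4600)
y = fft_resample_23_to_20(np.exp(2j * np.pi * 5 / 23 * t))
```

A tone that straddles the cut-off bins is where those sinc sidelobes show
up in practice; everything well inside the kept band is untouched.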
If we've got M samples to process, and we're doing N_1- or N_2 = 100·N_1-
length transforms, the complexity of the first is

  a = O( (M/N_1) · N_1·log(N_1) ) = O( M·log(N_1) );

of the second, it'd be

  O( (M/(100·N_1)) · 100·N_1·log(100·N_1) ) = O( M·log(100·N_1) )
    = O( M·log(N_1) + M·log(100) ),

which is worse, but:

> For reference, here's some performance benchmarks for FFTW (randomly
> chosen processor):
>
> http://www.fftw.org/speed/Pentium4-3.06GHz/
>
> Single-precision, non-power-of-2, 1d transforms are some good benchmarks
> to look at for something like this. Obviously some numbers are better than
> others, but the general trend seems to skew towards the mid-thousands as
> being a decent number to target for maximizing throughput.

Real-world data, and I fully agree, always trumps O-notation (which simply
is a bad model for any real-world problem, usually).

Luckily, I've had this same problem before, so... see attachment! You
should be able to call it like

  fft_rate -l 46

to set the FFT length to 46, and add a -n 10 to make the FFT use 10
threads (which usually hurts more than it helps). (You need to multiply
the reported rate by the FFT length; it counts transforms, not samples.)

The single-threaded 2²⁴-FFT I do runs at ca. 34.5 MS/s throughput. If I run
fft_rate -l $(( 23*2*100 )), I get roughly 87.4 MS/s. If I run
fft_rate -l $(( 23*2 )), I get ca. 92 MS/s. I'd call this a clear "meh"
result!

Best regards,
Marcus
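(For readers without the attachment: a rough, hypothetical stand-in for the
fft_rate benchmark, timing batched complex FFTs of two lengths and reporting
throughput in MS/s. It uses numpy's built-in FFT rather than GNU Radio/FFTW,
so absolute numbers will differ from Marcus's; the function name and sample
counts are invented for the sketch.)

```python
import time
import numpy as np

def fft_msps(fft_len, total_samples=2**22):
    """Return throughput in megasamples/second for repeated FFTs of fft_len."""
    n_transforms = total_samples // fft_len
    x = (np.random.randn(n_transforms, fft_len)
         + 1j * np.random.randn(n_transforms, fft_len)).astype(np.complex64)
    t0 = time.perf_counter()
    np.fft.fft(x, axis=1)          # one batched call doing all the transforms
    dt = time.perf_counter() - t0
    return n_transforms * fft_len / dt / 1e6

for n in (23 * 2, 23 * 2 * 100):   # the two lengths compared in the thread
    print(f"FFT length {n:5d}: {fft_msps(n):7.1f} MS/s")
```

As in the thread, the per-sample rate counts samples, not transforms, so the
two lengths are directly comparable.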
fft_rate.grc
Description: application/gnuradio-grc
fft_rate.png
Description: PNG image
fft_rate.py
Description: Text Data
smime.p7s
Description: S/MIME Cryptographic Signature