discuss-gnuradio

Re: How to tell whether gnuradio is using NEON or not?


From: Amr Bekhit
Subject: Re: How to tell whether gnuradio is using NEON or not?
Date: Sun, 3 Nov 2019 23:36:53 +0300

Hi Gregory,

I've just managed to make some progress - I ended up doing the following:
- Running Ubuntu Server 64-bit on the Raspberry Pi 4.
- Compiling gnuradio from source using the built-in gcc. There's a
minor issue here where cmake can't find liborc even though it's
installed, and so carries on without it. I ended up creating a symlink
in /usr/local/include pointing to /usr/include/orc-xxx/orc. This
allows cmake to detect it and build volk with orc optimizations.
- Running volk_profile once, which lets volk benchmark its kernels and
record which implementation of each runs fastest on this hardware.
- Applying the temporary workaround for a USB bug in Ubuntu Server on
the Raspberry Pi until it is fixed
(https://bugs.launchpad.net/ubuntu/+source/linux-raspi2/+bug/1848790).
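
As a rough way to see what volk_profile actually selected, the results
file can be summarised with a short script. This is only a sketch: it
assumes the results live in ~/.volk/volk_config as whitespace-separated
"kernel aligned_impl unaligned_impl" lines with '#' comment lines, and
the function name and the sample lines are illustrative, not real
profiler output.

```python
from pathlib import Path


def neon_kernel_summary(config_text):
    """Split profiled kernels into those that picked a NEON implementation and the rest."""
    neon, other = [], []
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and the comment header (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "kernel aligned unaligned" line
        kernel, aligned, unaligned = fields[:3]
        if "neon" in aligned.lower() or "neon" in unaligned.lower():
            neon.append(kernel)
        else:
            other.append(kernel)
    return neon, other


# Illustrative sample only; the implementation names are assumptions.
SAMPLE = """\
# hypothetical excerpt of a volk_config file
volk_32fc_32f_dot_prod_32fc neon neon
volk_32f_x2_add_32f generic generic
"""

if __name__ == "__main__":
    path = Path.home() / ".volk" / "volk_config"  # assumed default location
    text = path.read_text() if path.exists() else SAMPLE
    neon, other = neon_kernel_summary(text)
    print(f"{len(neon)} kernels use a NEON implementation, {len(other)} do not")
```

If no kernel ends up in the first list after profiling, the build is
most likely falling back to generic code regardless of what the
compiler flags say.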

My test program runs faster: 60% CPU usage (compared to 80% previously
- I had mentioned 70%, but on retesting it's more like 80%), so
there's definitely an improvement. For comparison, if you install
gnuradio from the standard Ubuntu repos, the test program runs at 65%
CPU on average, so compiling from source does seem to provide some
benefit, probably due to the availability of NEON instructions (the
default libvolk in the repos only supports the generic_orc machine).
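
The machine check can be scripted along these lines. This is a minimal
sketch: it relies only on the `volk-config-info --avail-machines`
command and its ';'-separated output as quoted further down in this
thread; the helper names are made up for illustration.

```python
import shutil
import subprocess


def neon_machines(avail_machines):
    """Return the VOLK machine names that mention NEON from a ';'-separated list."""
    names = [m.strip() for m in avail_machines.split(";") if m.strip()]
    return [m for m in names if "neon" in m.lower()]


def query_volk_machines():
    """Run `volk-config-info --avail-machines` if it is on PATH, else return None."""
    exe = shutil.which("volk-config-info")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--avail-machines"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    out = query_volk_machines()
    if out is None:
        print("volk-config-info not found on PATH")
    else:
        print("NEON machines:", neon_machines(out) or "none")
```

Note that this only shows which machines the library was built with;
`volk-config-info --machine` reports the one selected at run time.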

With these improvements my main application now runs at 75% CPU, which
gives enough headroom for the USB samples to be streamed to the
LimeSDR without problems. It's not a lot of headroom, to be honest:
for example, if I log in via SSH, the SDR output lags for a short
while until the login completes. But at this point, assuming that my
method of separating out the channels using bandpass filters (as
demonstrated in the example flow) is the most efficient way of doing
it, we're probably hitting the processing limits of the Raspberry Pi
or the optimisation limits of gnuradio (I compiled v3.7.13.5 as
LimeSDR is not supported yet in 3.8 - I'm not sure whether the newer
version performs better).
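
To get a feel for where the CPU time goes, the filtering load can be
estimated with some back-of-the-envelope arithmetic. The sketch below
assumes plain time-domain (direct-form) FIR filters, where every output
sample costs one multiplication per tap. The 15 channels and 1 MHz
input rate come from this thread; the tap count and the decimation
factor are made-up values for illustration, and whether decimation
applies at all depends on the flow.

```python
def fir_tap_rate(channels, taps, sample_rate_hz, decimation=1):
    """Tap multiplications per second for `channels` direct-form FIR filters.

    Each filter produces sample_rate_hz / decimation output samples per
    second, and each output sample costs `taps` multiplications.
    """
    if decimation < 1:
        raise ValueError("decimation must be >= 1")
    return channels * taps * sample_rate_hz / decimation


if __name__ == "__main__":
    # 200 taps and decimation by 50 are hypothetical figures, not measured ones.
    full_rate = fir_tap_rate(15, 200, 1_000_000)
    decimated = fir_tap_rate(15, 200, 1_000_000, decimation=50)
    print(f"no decimation:  {full_rate:.2e} tap multiplications/s")
    print(f"decimate by 50: {decimated:.2e} tap multiplications/s")
```

The point is only that the cost scales linearly with the number of
channels, the filter length and the rate at which outputs are computed,
so any of the three is a lever if the Pi runs out of headroom.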

On Sun, 3 Nov 2019 at 19:40, Gregory Ratcliff <address@hidden> wrote:
>
> Please keep us updated on your progress. This is something I was thinking
> of doing with aviation monitoring: stream each channel at the airport, time
> shift to fill in silence, giving priority to the tower and approach streams.
>
> Greg
>
> > On Nov 3, 2019, at 2:32 AM, Philip Balister <address@hidden> wrote:
> >
> > Raspbian is built for the original Pi, whose CPU does not have a NEON
> > coprocessor. Basically, use a different distro that supports modern Pi
> > hardware.
> >
> > Philip
> >
> >> On 11/3/19 8:10 AM, Amr Bekhit wrote:
> >> Hello all,
> >>
> > >> I'm working on a project that involves selecting and filtering 10-15
> > >> narrow channels (10 kHz bandwidth) from a relatively broadband input
> > >> (1 MHz). I've been trying to implement this as efficiently as
> > >> possible using GNU Radio Companion (see this email thread
> > >> https://lists.gnu.org/archive/html/discuss-gnuradio/2019-10/msg00192.html).
> > >> I tried a couple of things (using FIR bandpass filters, and mixing
> > >> each channel down to 0 Hz then low-pass filtering, both in one step
> > >> and in stages) and found that simply using FIR bandpass filters for
> > >> each channel seemed to provide the best performance CPU-wise (20%
> > >> CPU usage on my i7-920 desktop PC). However, the aim is to run this
> > >> system on a Raspberry Pi 4 and, unfortunately, the same flow runs at
> > >> approximately 90% CPU there and seems to cause lags when sending the
> > >> data to the SDR (LimeSDR-USB).
> >>
> >> I see the problem as potentially one of the following:
> >> - The flow is *still* not as efficient as it could be.
> >> - The RPi4 is just not powerful enough to run something like this and
> >> I need to use something more powerful (perhaps like the x86 Lattepanda
> >> boards?)
> >> - GNURadio is not compiled to use NEON optimisations.
> >>
> >> I've been exploring the last point recently and wanted to check
> >> whether NEON optimisations are indeed being utilised. So here's what I
> >> did:
> >> - I set up a Raspberry Pi 4 (4GB) using Raspbian Buster.
> >> - I installed GNURadio from the standard apt repository. This installs
> >> GNU Radio v3.7.13.4 and Volk 1.4
> >> - I ran volk_profile to tune the library.
> > >> - I then ran the bpf-test flow (attached to this email). The CPU usage
> > >> is 70%.
> >>
> >> Some info about the gnuradio and volk versions:
> >> gnuradio-config-info --cflags:
> >> /usr/bin/cc::: -g -O2
> >> -fdebug-prefix-map=/build/gnuradio-FK7QfY/gnuradio-3.7.13.4=.
> >> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> >> -D_FORTIFY_SOURCE=2 -std=gnu99 -fvisibility=hidden -Wsign-compare
> >> -Wall -Wno-uninitialized
> >> /usr/bin/c++::: -g -O2
> >> -fdebug-prefix-map=/build/gnuradio-FK7QfY/gnuradio-3.7.13.4=.
> >> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> >> -D_FORTIFY_SOURCE=2 -fvisibility=hidden -Wsign-compare -Wall
> >> -Wno-uninitialized
> >>
> >> volk-config-info --cflags:
> >> /usr/bin/cc::: -g -O2 -fdebug-prefix-map=/build/volk-zBrTqH/volk-1.4=.
> >> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> >> -D_FORTIFY_SOURCE=2 -Wall
> >> /usr/bin/c++::: -g -O2
> >> -fdebug-prefix-map=/build/volk-zBrTqH/volk-1.4=.
> >> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> >> -D_FORTIFY_SOURCE=2 -Wall
> >> generic_orc:::GNU::: -g -O2
> >> -fdebug-prefix-map=/build/volk-zBrTqH/volk-1.4=.
> >> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> >> -D_FORTIFY_SOURCE=2 -Wall
> >>
> >> volk-config-info --avail-machines
> >> generic_orc;
> >>
> >> So based on that, it appears that the gnuradio and volk packages on
> >> Raspbian are not built with NEON support. I then set about compiling
> >> gnuradio and volk from source to ensure that NEON support is included.
> >> I compiled *both* volk and gnuradio using the
> >> arm_cortex_a72_hardfp_native.cmake toolchain file that is included in
> >> the cmake/Toolchains folder in the volk source. I compiled volk
> >> separately and then when compiling gnuradio set
> >> ENABLE_INTERNAL_VOLK=OFF. In this case, I ended up compiling Volk v2.0
> >> and gnuradio v3.9.0.0 (master from git). Here are the compiler flags:
> >>
> >> gnuradio-config-info --cflags
> >> /usr/bin/gcc:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> >> -mfpu=neon-fp-armv8 -mfloat-abi=hard -fvisibility=hidden
> >> -Wsign-compare -Wall -Wno-uninitialized
> >> /usr/bin/g++:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> >> -mfpu=neon-fp-armv8 -mfloat-abi=hard -fvisibility=hidden
> >> -Wsign-compare -Wall -Wno-uninitialized
> >>
> >> volk-config-info --cflags
> >> /usr/bin/gcc::: -march=armv8-a -mtune=cortex-a72 -mfpu=neon-fp-armv8
> >> -mfloat-abi=hard -Wall
> >> /usr/bin/g++::: -march=armv8-a -mtune=cortex-a72 -mfpu=neon-fp-armv8
> >> -mfloat-abi=hard -Wall
> >> generic_orc:::GNU:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> >> -mfpu=neon-fp-armv8 -mfloat-abi=hard -Wall
> >> neon_orc:::GNU:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> >> -mfpu=neon-fp-armv8 -mfloat-abi=hard -Wall -funsafe-math-optimizations
> >> neonv7_hardfp_orc:::GNU:::-O3 -DNDEBUG -march=armv8-a
> >> -mtune=cortex-a72 -mfpu=neon-fp-armv8 -mfloat-abi=hard -Wall
> >> -funsafe-math-optimizations -mfpu=neon -funsafe-math-optimizations
> >> -mfloat-abi=hard
> >>
> >> volk-config-info --avail-machines
> >> generic_orc;neon_orc;neonv7_hardfp_orc;
> >>
> >> volk-config-info --machine
> >> neonv7_hardfp_orc
> >>
> >> After running volk_profile to tune the library, I then ran the same
> >> flow, hoping that I'd get improved performance. Unfortunately, the
> >> performance was *exactly* the same, with CPU usage also at around 70%.
> >>
> >> I suspect one of the following:
> >> - The flow that I created is not using blocks written using
> >> Volk/optimised for NEON and as such enabling NEON support would make
> >> no difference (doubt it).
> >> - The gnuradio present in the Raspbian repositories *is actually*
> >> compiled using NEON support (despite the cflags showing otherwise) and
> >> I'm just simply running into the limitations of the CPU.
> >> - The gnuradio I compiled myself is actually *not using* NEON support
> >> (despite the cflags showing otherwise) and I need to figure out how to
> >> enable it.
> >>
> >> Any thoughts?
> >>
> >> Thanks,
> >>
> >> Amr
> >>
> >
>
>


