discuss-gnuradio

Re: Weird Python Sinks Crash with GNU Radio 3.8.1


From: Gilad Beeri (ApolloShield)
Subject: Re: Weird Python Sinks Crash with GNU Radio 3.8.1
Date: Fri, 1 May 2020 12:35:15 +0300

Christoph,
you're spot on! 
I confirm that on my 3.8.1.0 build your suggestion makes the flowgraph behave properly.
So simple...

I guess it's because the flowgraph only stores the reference in its C++ core, and nothing on the Python side keeps a reference to the Python block, so it gets garbage-collected (this is natural with the Python binding but unintuitive for Python developers, who are used to expecting strong references).
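
The lifetime issue can be illustrated without GNU Radio. The following is only a minimal sketch: `Block`, `Core`, `LocalsOnly` and `KeepsReferences` are hypothetical stand-ins, not GNU Radio classes, and `weakref.ref` merely imitates a core that remembers a block without keeping the Python object alive. It assumes CPython's reference counting and says nothing about what the real runtime does internally.

```python
import gc
import weakref


class Block:
    """Hypothetical stand-in for a Python block (not a GNU Radio class)."""

    def __init__(self, name):
        self.name = name


class Core:
    """Hypothetical stand-in for the flowgraph core.

    It remembers connected blocks through weak references only, so it
    does not keep the Python objects alive by itself.
    """

    def __init__(self):
        self._blocks = []

    def connect(self, src, dst):
        self._blocks.append(weakref.ref(src))
        self._blocks.append(weakref.ref(dst))

    def alive(self):
        # True for every block whose Python object still exists.
        return [ref() is not None for ref in self._blocks]


class LocalsOnly(Core):
    """Blocks are locals of __init__, as in the crashing flowgraph."""

    def __init__(self):
        super().__init__()
        source = Block("source")
        sink = Block("sink")
        self.connect(source, sink)
        # source and sink lose their last strong reference here.


class KeepsReferences(Core):
    """Blocks are stored on self, as in the suggested fix."""

    def __init__(self):
        super().__init__()
        self.source = Block("source")
        self.sink = Block("sink")
        self.connect(self.source, self.sink)


def demo():
    broken = LocalsOnly()
    fixed = KeepsReferences()
    gc.collect()  # not strictly needed on CPython, but makes intent clear
    return broken.alive(), fixed.alive()


if __name__ == "__main__":
    broken_alive, fixed_alive = demo()
    print("locals only:    ", broken_alive)
    print("stored on self: ", fixed_alive)
```

In the first class the blocks are gone as soon as `__init__` returns. In the second they live as long as the top-level object does, which is the effect of writing `self.source1 = ...` in the flowgraph.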

I still have no idea why the commits I mentioned break and fix the issue, or why it only reproduces when there are at least 2 Python blocks, but that's less important.

Marcus, what do you think?

Regards,
Gilad Beeri


On Fri, May 1, 2020 at 11:18 AM Christoph Mayer <address@hidden> wrote:
Hi all,

have you noticed that source1, source2, my_null_sink1, and my_null_sink2 are local variables in the constructor of python_sink, so they get destroyed/garbage collected at the exit of the __init__ method?

When using
   self.source1 = ...
   self.my_null_sink1 = ...
   # etc...
there is no crash.

Christoph

BTW: it can be useful to build gnuradio with address sanitizer, for example:

CC=clang CXX=clang++ cmake ../ -DENABLE_DEFAULT=OFF -DENABLE_GNURADIO_RUNTIME=ON -DENABLE_PYTHON=ON -DENABLE_POSTINSTALL=OFF -DENABLE_DOXYGEN=OFF -DCMAKE_CXX_FLAGS="-fsanitize=address -fno-omit-frame-pointer" -DCMAKE_C_FLAGS="-fsanitize=address -fno-omit-frame-pointer" -DCMAKE_INSTALL_PREFIX=/opt/gnuradio-test -DPYTHON_EXECUTABLE=/usr/bin/python3 -DCMAKE_MODULE_PATH=/opt/gnuradio-test

LD_PRELOAD=$(clang -print-file-name=libclang_rt.asan-x86_64.so) python3 python_block_bug.py



class python_sink(gr.top_block):
    def __init__(self):
        gr.top_block.__init__(self, "Testing Python Sinks")
        # source1 = blocks.null_source(gr.sizeof_gr_complex)
        # source2 = blocks.null_source(gr.sizeof_gr_complex)
        source1 = my_null_source(item=numpy.complex64)
        source2 = my_null_source(item=numpy.complex64)
        # null_sink1 = blocks.null_sink(gr.sizeof_gr_complex)
        my_null_sink1 = my_null_sink(item=numpy.complex64)
        my_null_sink2 = my_null_sink(item=numpy.complex64)

        # self.connect(source1, null_sink1)
        self.connect(source1, my_null_sink1)
        self.connect(source2, my_null_sink2)



On Fri, May 1, 2020 at 9:37 AM Gilad Beeri (ApolloShield) <address@hidden> wrote:
More details.

I built GR with "cmake ../ -DPYTHON_EXECUTABLE=/usr/bin/python3 -DENABLE_DEFAULT=OFF -DENABLE_GNURADIO_RUNTIME=ON -DENABLE_PYTHON=ON -DENABLE_POSTINSTALL=OFF -DENABLE_DOXYGEN=OFF" in order to have the minimal build possible that reproduces the issue, and used the attached flowgraph. It's the same one as before, but I replaced Null Source with my_null_source (a Python implementation) in order to get rid of gnuradio-blocks (for speedier builds).

I don't understand why, but v3.7.8.2 still seems to be ok. The bug starts at git commit 1206251231696359270a260508551e044f3af33a and breaks all versions from v3.7.9 to 3.7.11.0. Git commit 713629cce8d571570bc5f0f0db67c5a96d5ee071 then seems to have fixed it in 3.7.12.0 and the rest of the 3.7.* releases. The breaking commit is in the 3.8 history because the merge from next that came from 3.7.12.0 broke it again.

As a first step, I would be happy if someone could confirm my findings:

  1. a06420691493534ca268ce52e1f16504c216828d is ok, but the next commit 1206251231696359270a260508551e044f3af33a is broken (confirming the issue in old 3.7 releases and 3.8).
  2. e635ae442132a7e3bab75796d2ac0b66bd289bdb is broken, but the next commit 713629cce8d571570bc5f0f0db67c5a96d5ee071 is ok (confirming the fix in the 3.7 history).

As a second step, can someone work out why the breaking commit broke it and why the fixing commit fixed it, and then apply the fixing logic to master?
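
For anyone who wants to repeat the search on a linear stretch of history, the idea behind it is a plain binary search. This is only a sketch under stated assumptions: `first_bad` and the `is_bad` callback are hypothetical names, the first commit in the range is assumed good, the last is assumed bad, and there is assumed to be exactly one good-to-bad transition in between. It does not handle merges, which is exactly what makes the 3.8 history awkward. In practice `is_bad` would check out the commit, build it, and run the test flowgraph.

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit in an ordered list.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad (a single transition).
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # transition is at mid or earlier
        else:
            lo = mid  # transition is after mid
    return hi


if __name__ == "__main__":
    # Hypothetical labels standing in for real commit hashes.
    history = ["c%d" % i for i in range(10)]
    # Hypothetical predicate: pretend everything from c6 onwards crashes.
    index = first_bad(history, lambda c: int(c[1:]) >= 6)
    print("first bad commit:", history[index])
```

To look for the fixing commit instead, the same function can be reused with the predicate inverted, treating "no longer crashes" as the property being searched for.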

I opened a GitHub issue with most of the information detailed here, even though we don't have all the details yet. https://github.com/gnuradio/gnuradio/issues/3435.


Analyzing a non-linear history is some dirty work :)
This is what searching for the breaking commit in 3.8's history looks like:
[attached image: image.png]

Regards,

Gilad Beeri

On Wed, Apr 29, 2020 at 2:05 PM Volker Schroer <address@hidden> wrote:
Tested on Ubuntu 19.10 with python 3.7.5

Crashing.

Backtrace gives

Thread 3 "my_null_sink4" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe8939700 (LWP 7033)]
0x00007ffff265bed1 in gr::py_feval_ll::calleval(long) () from /usr/local/gnuradio/lib/python3/dist-packages/gnuradio/gr/_runtime_swig.so
(gdb) bt
#0  0x00007ffff265bed1 in gr::py_feval_ll::calleval(long) () from /usr/local/gnuradio/lib/python3/dist-packages/gnuradio/gr/_runtime_swig.so
#1  0x00007ffff6caf2ea in gr::block_gateway_impl::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) () from /usr/local/gnuradio/lib/libgnuradio-runtime.so.3.9.0
#2  0x00007ffff6caf423 in gr::block_gateway_impl::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) () from /usr/local/gnuradio/lib/libgnuradio-runtime.so.3.9.0
#3  0x00007ffff6cad1a3 in gr::block_executor::run_one_iteration() () from /usr/local/gnuradio/lib/libgnuradio-runtime.so.3.9.0
#4  0x00007ffff6d03093 in gr::tpb_thread_body::tpb_thread_body(boost::shared_ptr<gr::block>, boost::shared_ptr<boost::barrier>, int) () from /usr/local/gnuradio/lib/libgnuradio-runtime.so.3.9.0
#5  0x00007ffff6cf498e in boost::detail::function::void_function_obj_invoker0<gr::thread::thread_body_wrapper<gr::tpb_container>, void>::invoke(boost::detail::function::function_buffer&) () from /usr/local/gnuradio/lib/libgnuradio-runtime.so.3.9.0
#6  0x00007ffff6d105c6 in boost::detail::thread_data<boost::function0<void> >::run() () from /usr/local/gnuradio/lib/libgnuradio-runtime.so.3.9.0
#7  0x00007ffff71f61b5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.67.0
#8  0x00007ffff7d9c669 in start_thread (arg=<optimized out>) at pthread_create.c:479
#9  0x00007ffff7ed8323 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


By the way: trying the gnuradio version with pybind instead of swig
crashes too.

-- Volker


