From: | Criss Swaim |
Subject: | Re: Maximum Number of Bins |
Date: | Thu, 29 Oct 2020 11:17:07 -0600 |
User-agent: | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0 |
I have attached a png of the flow graph and the error msgs from the system log are below. These error messages are the only messages.
Oct 29 10:45:26 tf abrt-hook-ccpp[378]: /var/spool/abrt is 23611049718 bytes (more than 1279MiB), deleting 'ccpp-2020-10-27-15:30:43-28474'
Oct 29 10:45:07 tf abrt-hook-ccpp[378]: Process 329 (python2.7) of user 1000 killed by SIGSEGV - dumping core
Oct 29 10:45:07 tf audit[370]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=8656 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=370 comm="copy11" exe="/usr/bin/
Oct 29 10:45:07 tf audit[369]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=8656 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=369 comm="analysis_sink_1" exe="
Oct 29 10:45:07 tf kernel: traps: copy11[370] general protection ip:7f9e0acfdee0 sp:7f9c5a7fb590 error:0 in libpthread-2.22.so[7f9e0acf1000+18000]
Oct 29 10:45:07 tf kernel: analysis_sink_1[369]: segfault at 7f9c5a7fd000 ip 00007f9dd9361d43 sp 00007f9c5a48a638 error 6 in libgnuradio-vandevender.so[7f9dd9336000+4d000]
Flow is USRP -> stream to vector -> fft -> complex to mag -> bin_sub_avg -> analysis_sinkf
bin_sub_avg (Python) and analysis_sinkf (C/C++) are custom blocks.
The function of bin_sub_avg, which is written in Python, is to start a background task that periodically (in this case hourly) samples the input signal, calculates the background noise, and subtracts it from the signal that is passed to the analysis_sinkf module.
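As a rough illustration of the background subtraction described above, here is a minimal NumPy sketch. It is not the actual bin_sub_avg code: the function names are hypothetical, and using the per-bin mean of the sampled frames as the background estimate is an assumption.

```python
import numpy as np

def estimate_background(sampled_frames):
    """Per-bin background estimate (assumption: mean over the sampled frames)."""
    frames = np.asarray(sampled_frames, dtype=np.float32)
    return frames.mean(axis=0)

def subtract_background(frame, background):
    """Subtract the per-bin background from one magnitude frame."""
    return np.asarray(frame, dtype=np.float32) - background

# Hypothetical data: two sampled frames of three bins each.
sampled = np.array([[1.0, 2.0, 3.0],
                    [3.0, 2.0, 1.0]], dtype=np.float32)
background = estimate_background(sampled)

# A later frame with the background removed before it goes downstream.
cleaned = subtract_background([5.0, 1.0, 2.0], background)
```

In a real block the background would be refreshed by the hourly task, and each incoming vector would be passed through the subtraction before being handed to the sink.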
analysis_sinkf monitors each bin, and only when specific thresholds for the bin are met (i.e., duration and strength) is the signal written out to a signal file. Signals not passing the criteria are dropped.
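The per-bin strength and duration test can be sketched as follows. This is only an illustration in Python, not the C/C++ analysis_sinkf implementation: the function name, the parameter names, and the rule "a bin qualifies after staying above the threshold for a minimum number of consecutive frames" are all assumptions.

```python
import numpy as np

def update_detections(frame, run_lengths, strength_threshold, min_frames):
    """Track, per bin, how many consecutive frames exceeded the threshold.

    Returns the updated run lengths and a boolean mask of bins that have
    met both the strength and the duration criterion.
    """
    above = np.asarray(frame) >= strength_threshold
    # Extend the run where the bin is still strong, reset it elsewhere.
    run_lengths = np.where(above, run_lengths + 1, 0)
    return run_lengths, run_lengths >= min_frames

# Hypothetical data: three frames of three bins each.
frames = np.array([[1.0, 5.0, 5.0],
                   [1.0, 5.0, 0.0],
                   [1.0, 5.0, 5.0]])
run_lengths = np.zeros(frames.shape[1], dtype=np.int64)
for frame in frames:
    run_lengths, detected = update_detections(
        frame, run_lengths, strength_threshold=4.0, min_frames=2)
```

Only the middle bin stays above the threshold long enough here; the last bin is reset by the weak second frame, so it would be dropped.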
This code base has been running for over 3 years, with the original system implementation about 8/9 years ago.
I have traced the problem to the input signal into bin_sub_avg when the number of FFT bins is 3 million (2 million works). At 3 million bins, any reference to the result of the delete_head() function in the Python code causes a failure. The Python code just fails without a traceback, then the invalid data stream is passed to the analysis_sinkf module, which is C/C++, and it causes the segmentation fault.
Thus my suspicion is that there is a limit in the fft block on the number of bins it can handle and some variable is overflowing, but this is a guess at this point. There may be a restriction in the gr.signature_io module, but that seems unlikely.
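To put numbers on the overflow suspicion, the per-vector item sizes can be computed directly. This sketch only does the arithmetic, assuming complex64 items out of the FFT and float32 items after complex to mag; it does not identify where the failure occurs.

```python
import numpy as np

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

def vector_bytes(num_bins, dtype):
    """Size in bytes of one vector item holding num_bins samples."""
    return num_bins * np.dtype(dtype).itemsize

for bins in (2000000, 3000000):
    complex_size = vector_bytes(bins, np.complex64)  # FFT output item
    float_size = vector_bytes(bins, np.float32)      # magnitude item
    print(bins, complex_size, float_size, complex_size < INT32_MAX)
```

At 3 million bins one magnitude vector is 12,000,000 bytes and one complex vector is 24,000,000 bytes. Both are far below the signed 32-bit limit, so these sizes alone would not explain a plain 32-bit byte-count overflow; any limit involved would have to be a smaller one.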
Criss Swaim cswaim@tpginc.net cell: 505.301.5701
Sharing your flow-graph, the exact error messages and more context would be good. Presumably you're talking about FFT bins, but it's not clear. Also, why are your samples being conveyed as strings? That's wildly inefficient.

Sent from my iPhone

On Oct 28, 2020, at 7:24 PM, Criss Swaim <cswaim@tpginc.net> wrote:

I am working on a new application of gnuradio that pushes the limits--satellite-based detection of RF from rotating magnetized-quark-nugget dark matter transiting through the magnetosphere--and need as many bins as possible to reduce the background noise per frequency channel.

I have successfully run with 2 million bins, but when I jump to 3 million bins, the application abends with a segmentation fault.

I have deconstructed the following Python line in a custom Python block that is failing:

    raw_samps = numpy.fromstring(self._msgq.delete_head().to_string(), numpy.float32)

and the failure is occurring while trying to convert the result of delete_head() to a string (to_string()). Any reference to the result of the delete_head() function results in an error.

The _msgq is defined as _msgq = gr.msg_queue(MSGQ_LIMIT), where MSGQ_LIMIT = 2.

Here is the refactored code:

    # refactor raw_samps line
    test_str = self._msgq.delete_head()
    print("bin_sub_avg::got msg ")
    sys.stdout.flush()
    print(test_str)
    sys.stdout.flush()
    test_string = test_str.to_string()
    print("bin_sub_avg::converted msg to string")
    sys.stdout.flush()
    raw_samps = numpy.fromstring(test_string, numpy.float32)
    print("bin_sub_avg::converted from string to numpy array")
    sys.stdout.flush()

The output is:

    bin_sub_avg::got msg
    "the object for the shared pointer - test_str" (I did not save the exact message)

Then the application aborts.

Is there a limit on the number of bins gnuradio can handle? Any thoughts on how to find the cause or limit?

--
Criss Swaim
cswaim@tpginc.net
cell: 505.301.5701
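On the string conversion in the quoted message: in current NumPy, binary-mode numpy.fromstring is deprecated in favor of numpy.frombuffer, which interprets an existing bytes object without copying it. The sketch below shows only that NumPy step. The payload is a stand-in built locally, not data taken from a gr.msg_queue.

```python
import numpy as np

# Stand-in for the raw message body (assumption: packed float32 samples).
payload = np.arange(4, dtype=np.float32).tobytes()

# Interpret the bytes as float32 samples without copying them.
raw_samps = np.frombuffer(payload, dtype=np.float32)

# frombuffer returns a read-only view over immutable bytes; copy it
# if the samples need to be modified in place.
writable_samps = raw_samps.copy()
writable_samps -= 1.0
```

This does not remove the cost of producing the string in the first place, which is the inefficiency the reply points out.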
fd_analysis_flow.png
Description: PNG image