Re: [Discuss-gnuradio] A lot confused by volk_32f_cos_32f()
From: Dennis Glatting
Subject: Re: [Discuss-gnuradio] A lot confused by volk_32f_cos_32f()
Date: Fri, 15 Jan 2016 16:29:04 -0800
On Fri, 2016-01-15 at 15:04 -0800, Ron Economos wrote:
> Version v1.1-22-g99594b12 is too old. It is from before the 2fa96d9
> commit.
>
> You need Volk version 1.2 or 1.1.2.
>
Yep. git wasn't pulling the submodule. Thanks.
> Ron
>
> On 01/15/2016 02:52 PM, Dennis Glatting wrote:
> > On Fri, 2016-01-15 at 14:28 -0800, Ron Economos wrote:
> > > This issue has been fixed just recently.
> > >
> > > https://github.com/gnuradio/volk/issues/52
> > >
> > If I understand
> > https://github.com/n-west/volk/commit/2fa96d970dcb582ac8a6c65ec2088df2a79747d5
> > (dated 1Dec2015) correctly, I am running the updated code installed on
> > 12Jan2016:
> >
> >
> > address@hidden:$ volk-config-info -v
> > v1.1-22-g99594b12
> >
> >
> > address@hidden:$ ldd a.out
> > linux-vdso.so.1 => (0x00007fffc23ea000)
> > libvolk.so.1.1git => /usr/local/lib/libvolk.so.1.1git (0x00007f59b0370000)
> > libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f59affe8000)
> > libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f59afce0000)
> > libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f59afac8000)
> > libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f59af6f8000)
> > liborc-0.4.so.0 => /usr/lib/x86_64-linux-gnu/liborc-0.4.so.0 (0x00007f59af470000)
> > /lib64/ld-linux-x86-64.so.2 (0x00007f59b07f8000)
> > libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f59af250000)
> >
> >
> > address@hidden:# ls -lh /usr/local/lib/libvolk.so.1.1git /usr/local/lib/libvolk.so
> > lrwxrwxrwx 1 root root 17 Jan 12 10:19 /usr/local/lib/libvolk.so -> libvolk.so.1.1git
> > -rw-r--r-- 1 root root 2.7M Jan 12 10:41 /usr/local/lib/libvolk.so.1.1git
> >
> >
> > There are no stray libvolk* libraries on my system.
> >
> > BTW, I am running:
> >
> >
> > address@hidden:# cat /proc/cpuinfo
> > processor : 0
> > vendor_id : AuthenticAMD
> > cpu family : 21
> > model : 2
> > model name : AMD FX(tm)-9590 Eight-Core Processor
> > stepping : 0
> > microcode : 0x6000822
> > cpu MHz : 4721.634
> > cache size : 2048 KB
> > physical id : 0
> > siblings : 8
> > core id : 0
> > cpu cores : 4
> > apicid : 16
> > initial apicid : 0
> > fpu : yes
> > fpu_exception : yes
> > cpuid level : 13
> > wp : yes
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
> > bugs : fxsave_leak
> > bogomips : 9443.26
> > TLB size : 1536 4K pages
> > clflush size : 64
> > cache_alignment : 64
> > address sizes : 48 bits physical, 48 bits virtual
> > power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro
> >
> > [processors 1 through 7: identical to processor 0 apart from processor and core id (1 to 7), apicid (17 to 23), and initial apicid (1 to 7)]
> >
> >
> >
> >
> >
> >
> > > Ron
> > >
> > > On 01/15/2016 02:20 PM, Dennis Glatting wrote:
> > > > I am confused by this function because its output DOES NOT match
> > > > ::cos(), even though I believe the web page demonstrates that it
> > > > should:
> > > >
> > > > http://libvolk.org/doxygen/volk_32f_cos_32f.html
> > > >
> > > > I /think/ I am doing the same thing as the web page. Regardless, my
> > > > code (below) produces the following output, with sign mismatches at
> > > > i=4 and i=14:
> > > >
> > > >
> > > > address@hidden:~/gr-acars-code/3.7.5/lib$ ./a.out
> > > > Using Volk machine: avx_64_mmx_orc
> > > > i=0, input=0, volk=1, cos()=1
> > > > i=1, input=0.314159, volk=0.951056, cos()=0.951057
> > > > i=2, input=0.628319, volk=0.809017, cos()=0.809017
> > > > i=3, input=0.942478, volk=0.587785, cos()=0.587785
> > > > i=4, input=1.25664, volk=-0.309017, cos()=0.309017
> > > > i=5, input=1.5708, volk=-1.19209e-07, cos()=-4.37114e-08
> > > > i=6, input=1.88496, volk=-0.309017, cos()=-0.309017
> > > > i=7, input=2.19911, volk=-0.587785, cos()=-0.587785
> > > > i=8, input=2.51327, volk=-0.809017, cos()=-0.809017
> > > > i=9, input=2.82743, volk=-0.951057, cos()=-0.951057
> > > > i=10, input=3.14159, volk=-1, cos()=-1
> > > > i=11, input=3.45575, volk=-0.951056, cos()=-0.951057
> > > > i=12, input=3.76991, volk=-0.809017, cos()=-0.809017
> > > > i=13, input=4.08407, volk=-0.587785, cos()=-0.587785
> > > > i=14, input=4.39823, volk=0.309017, cos()=-0.309017
> > > > i=15, input=4.71239, volk=2.38419e-07, cos()=1.19249e-08
> > > > i=16, input=5.02655, volk=0.309017, cos()=0.309017
> > > > i=17, input=5.34071, volk=0.587785, cos()=0.587785
> > > > i=18, input=5.65487, volk=0.809017, cos()=0.809017
> > > > i=19, input=5.96903, volk=0.951057, cos()=0.951057
> > > >
> > > >
> > > > I am using:
> > > >
> > > > address@hidden:~/gr-acars-code/3.7.5/lib$ gnuradio-companion --version
> > > > GNU Radio Companion 3.7.10git-31-gb17bcb88
> > > >
> > > > This program is part of GNU Radio
> > > > GRC comes with ABSOLUTELY NO WARRANTY.
> > > > This is free software, and you are welcome to redistribute it.
> > > >
> > > >
> > > > On:
> > > >
> > > > address@hidden:~/gr-acars-code/3.7.5/lib$ lsb_release -a
> > > > No LSB modules are available.
> > > > Distributor ID: Ubuntu
> > > > Description: Ubuntu 15.10
> > > > Release: 15.10
> > > > Codename: wily
> > > >
> > > >
> > > > With compiler:
> > > >
> > > > gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2)
> > > >
> > > >
> > > > Here's how I compile my code:
> > > >
> > > > c++ -std=c++11 -O volk_test.cc -lvolk
> > > >
> > > >
> > > > And, of course, the code:
> > > >
> > > > #include <cassert>
> > > > #include <cmath>       // cos(), M_PI
> > > > #include <functional>  // std::function
> > > > #include <iostream>
> > > > #include <memory>
> > > >
> > > > #include <volk/volk.h>
> > > >
> > > >
> > > > #define PPP 20
> > > >
> > > > int
> > > > main( void ) {
> > > >
> > > >   // Aligned input buffer, released through volk_free().
> > > >   std::unique_ptr<float,std::function<void(float*)>>
> > > >     points2400((float*)volk_malloc( sizeof(float)*( PPP + 1 ),
> > > >                                     volk_get_alignment()),
> > > >                [](float* p) {
> > > >                  if( p )
> > > >                    volk_free((void*)p );
> > > >                });
> > > >   assert( points2400.get());
> > > >
> > > >   // Aligned output buffer, same size and deleter.
> > > >   std::unique_ptr<float,std::function<void(float*)>>
> > > >     oBuf((float*)volk_malloc( sizeof(float)*( PPP + 1 ),
> > > >                               volk_get_alignment()),
> > > >          [](float* p) {
> > > >            if( p )
> > > >              volk_free((void*)p );
> > > >          });
> > > >   assert( oBuf.get());
> > > >
> > > >   // One cycle of a 2400 Hz tone sampled at 48 kHz, as phase in radians.
> > > >   for( int i = 0; i < PPP; ++i ) {
> > > >
> > > >     const float ii = i;
> > > >
> > > >     points2400.get()[i] = ii * ( 2400.0 / 48000.0 ) * ( 2.0 * M_PI );
> > > >
> > > >   }
> > > >
> > > >   volk_32f_cos_32f( oBuf.get(), points2400.get(), PPP );
> > > >
> > > >   // Print the VOLK result next to the libm result for each input.
> > > >   for( int i = 0; i < PPP; ++i )
> > > >     std::cout << "i=" << i
> > > >               << ", input=" << points2400.get()[i]
> > > >               << ", volk=" << oBuf.get()[i]
> > > >               << ", cos()=" << ::cos(points2400.get()[i])
> > > >               << std::endl;
> > > >
> > > >
> > > >   return 0;
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Discuss-gnuradio mailing list
> > > > address@hidden
> > > > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
> > > >