[Rapp-dev] Full SIMD implementation of integral
From: Willie Betschart
Subject: [Rapp-dev] Full SIMD implementation of integral
Date: Fri, 14 Dec 2012 16:28:55 +0100
Hello RAPP-Dev!
Here's a patch containing a full SIMD implementation of integral.
I also added SWAR macros for the type conversions, mainly because the
benchmarking build didn't get through the SWAR build. I'm not sure why that was built.
I added two new macros, SPLAT_U16 and SPLAT_U32. I have tested them separately
but had difficulty adding unit tests, so I will wait with that. The splat is used
to add the previous state when the next buffers are processed.
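To illustrate the idea of splatting the previous state, here is a minimal sketch in portable C. It is not the code from the patch: plain arrays stand in for vector registers, and the names LANES, splat_u32, block_prefix_sum and prefix_sum_row are hypothetical, chosen only for this example. Each block of pixels gets a running sum within the block, and the last sum of the previous block is copied into every lane and added on top.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector width: four 32-bit lanes */

/* Emulates a splat: copy one scalar into every lane of a vector. */
static void splat_u32(uint32_t out[LANES], uint32_t value)
{
    for (int i = 0; i < LANES; i++)
        out[i] = value;
}

/* Inclusive running sum inside one vector, using shift-and-add steps
 * with shifts of 1, 2, ... lanes (zeros are shifted in). */
static void block_prefix_sum(uint32_t v[LANES])
{
    for (int shift = 1; shift < LANES; shift *= 2) {
        uint32_t shifted[LANES];
        for (int i = 0; i < LANES; i++)
            shifted[i] = (i >= shift) ? v[i - shift] : 0;
        for (int i = 0; i < LANES; i++)
            v[i] += shifted[i];
    }
}

/* Running sum over one row of 8-bit pixels.
 * Assumes len is a multiple of LANES to keep the sketch short. */
static void prefix_sum_row(const uint8_t *src, uint32_t *dst, size_t len)
{
    uint32_t carry = 0; /* previous state: last sum of the previous block */

    for (size_t x = 0; x < len; x += LANES) {
        uint32_t v[LANES];
        uint32_t c[LANES];

        for (int i = 0; i < LANES; i++)
            v[i] = src[x + i];      /* widen 8-bit pixels to 32 bits */

        block_prefix_sum(v);        /* sums within this block only */
        splat_u32(c, carry);        /* previous state in every lane */

        for (int i = 0; i < LANES; i++)
            dst[x + i] = v[i] + c[i];

        carry = dst[x + LANES - 1]; /* state for the next block */
    }
}
```

A full integral image would additionally add the result of the row above to each row; that step is left out here.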
The integral was faster than the generic implementation but still not as efficient
as the hybrid of hardcoded SWAR and SIMD. Also, I needed SSSE3's align; SSE2 wasn't that fast.
I added a description of how it works in the source, /compute/vector/rc_integral.c.
Best wishes
Willie
Attachment: full_simd_integral.patch