From: Aaron Krister Johnson
Subject: Re: [gforth] Raspberry Pi/ARM performance
Date: Tue, 01 May 2018 23:30:28 +0000
On Tue, May 01, 2018 at 06:49:51PM +0000, Aaron Krister Johnson wrote:
> 4) using gforth as "glue", and dropping to C or Assembler for relief from
> performance bottlenecks. (Even though interpreted Forth is second only to C
> for speed)
That can certainly help, if much of the time is spent in a few inner loops.
> 5) Optimising for the ARM's VFP (Neon) architecture, which I believe would
> have to be done via one of these ways (or a combination):
> a) compiling gforth with certain GCC flags for optimising against ARM
> b) making direct assembly calls to Neon instructions (is this possible in
> gforth currently?)
> c) somehow linking in C object code that is floating-point optimized.
Actually I have been working on a vector wordset for Forth
<http://www.complang.tuwien.ac.at/anton/euroforth/ef17/genproceedings/papers/ertl.pdf>,
with source code at <https://github.com/AntonErtl/vectors>.
The Gforth implementation of the wordset generates GCC functions for
the inner loops of the vector words, and these functions use the GCC
vector extensions
<https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html> to
generate code with SIMD instructions.
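
To give an idea of what such a generated inner loop might look like, here
is a minimal sketch in C using the GCC vector extensions. It is not taken
from the vectors repository; the names v4sf and vec_add_f32 are
hypothetical, and whether the compiler actually emits SIMD instructions
depends on the target and the compiler flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four floats packed into one 16-byte vector (GCC vector extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* dst[i] = a[i] + b[i] for 0 <= i < n.
   Hypothetical helper for illustration, not part of Gforth. */
void vec_add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        v4sf va, vb, vr;
        memcpy(&va, a + i, sizeof va);  /* load without alignment assumptions */
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;                   /* element-wise add on the whole vector */
        memcpy(dst + i, &vr, sizeof vr);
    }

    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The memcpy calls avoid assuming that the buffers are 16-byte aligned;
the compiler normally turns them into plain vector loads and stores.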
The implementation is not yet complete, but you could try it out; if
you need more words implemented, or need words that are not yet
planned for the wordset, let me know. I also plan to improve
performance further over the next 1.5 weeks.
- anton