Re: [gforth] Performance anomality with dynamic superinstructions on MIP
From: Bernd Paysan
Subject: Re: [gforth] Performance anomality with dynamic superinstructions on MIPSel
Date: Mon, 24 Mar 2014 03:17:41 +0100
User-agent: KMail/4.11.5 (Linux/3.11.10-7-desktop; KDE/4.11.5; x86_64; ; )
On Sunday, 23 March 2014, 19:46:24, Bernd Paysan wrote:
> On Sunday, 23 March 2014, 18:38:58, David Kuehling wrote:
> > Replying to myself, quick update (before I have to shut down my computer
> > for today):
> >
> > The instruction in question is 'rdhwr v1,$29' which is mips32r2, i.e.
> > not supported on Loongson2f. GCC outputs it via a sequence like:
> >
> > .set push
> > .set mips32r2
> > rdhwr $3,$29
> > .set pop
> >
> > I guess on MIPS the GCC runtime nowadays uses model specific register
> > $29 (which is not CPU reg $29 !) for addressing thread local storage.
> > To support older mipses this is implemented in kernel via an invalid
> > opcode interrupt emulation. I.e. very slow. How can we prevent writes
> > to thread local storage from creeping into goto*?
>
> This stuff is copied from the first NEXT, i.e. the thing between
> before_goto: and after_goto:
>
> #define FIRST_NEXT_P2 NEXT_P1_5; GOTO_ALIGN; \
> before_goto: goto *real_ca; after_goto:
>
> Suggestion: Add a "asm volatile("": : :"memory")" before "before_goto:"
>
> That should scare GCC away from moving stuff behind it.
I've looked at what GCC does on ARM and x86_64, and it moves some stuff in
there as well: less on x86_64, more on ARM. It's not as bad as your case (with
an emulated instruction), but it's still stuff. asm __volatile__ ("": : :"memory")
doesn't prevent it. Neither does calling a dummy function.
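
For readers who want to see where the barrier sits, here is a minimal sketch of
a threaded-code dispatch with the empty asm placed before the indirect goto. It
is a hypothetical miniature, not gforth's engine.c: the opcodes, the dispatch
table and the names run_with_barrier and NEXT_WITH_BARRIER are made up for
illustration, and it needs GCC or Clang for labels as values.

```c
/* Hypothetical miniature of a threaded-code engine, not gforth's engine.c.
   Needs the GNU C "labels as values" extension (GCC or Clang). */
enum { OP_INC, OP_DOUBLE, OP_HALT };

long run_with_barrier(const int *program, long acc)
{
    /* One code address per opcode, indexed by the opcode value. */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };
    const int *ip = program;
    void *real_ca;

/* Fetch the next code address, emit the barrier, then jump.
   The "memory" clobber only tells the compiler that memory may be read or
   written here; it places no constraint on register-only computations. */
#define NEXT_WITH_BARRIER                            \
    do {                                             \
        real_ca = dispatch[*ip++];                   \
        __asm__ __volatile__("" : : : "memory");     \
        goto *real_ca;                               \
    } while (0)

    NEXT_WITH_BARRIER;
do_inc:
    acc += 1;
    NEXT_WITH_BARRIER;
do_double:
    acc *= 2;
    NEXT_WITH_BARRIER;
do_halt:
    return acc;
}
```

The sketch only shows the placement. Whether the compiler still schedules
other instructions between the barrier and the jump has to be checked in the
generated assembly (for example with gcc -S) on each target.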
What did the trick? Actually using FIRST_NEXT in after_last:. This label is a
dummy for getting the tail of the last address, so we can put anything we like
there. Doing FIRST_NEXT there makes it a noop, and since there's nothing left
to move into the goto, it stays as small as it should.
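
A minimal sketch of that arrangement, under the assumption that the trailing
label is never dispatched to by a valid program and only exists so that its
address can be taken. Again this is a hypothetical miniature rather than
gforth's engine.c: the T_* opcodes, the labels table, TAIL_FIRST_NEXT and
run_tail_dispatch are illustrative names.

```c
/* Hypothetical miniature, not gforth's engine.c.
   Needs the GNU C "labels as values" extension (GCC or Clang). */
enum { T_INC, T_DOUBLE, T_HALT };

long run_tail_dispatch(const int *program, long acc)
{
    /* after_last is listed so that its address is taken, like the end
       marker of the last primitive; no valid opcode selects index 3. */
    static void *const labels[] = {
        &&do_inc, &&do_double, &&do_halt, &&after_last
    };
    const int *ip = program;
    void *real_ca;

/* Plain dispatch: fetch the next code address and jump to it. */
#define TAIL_FIRST_NEXT \
    do { real_ca = labels[*ip++]; goto *real_ca; } while (0)

    TAIL_FIRST_NEXT;
do_inc:
    acc += 1;
    TAIL_FIRST_NEXT;
do_double:
    acc *= 2;
    TAIL_FIRST_NEXT;
do_halt:
    return acc;
after_last:
    /* Dummy tail: it ends in the same dispatch as every primitive, so there
       is no different follow-on code here for the compiler to hoist. */
    TAIL_FIRST_NEXT;
}
```

The sketch shows the layout only; the effect on the size of the copied
dispatch sequence is target- and compiler-specific and is best confirmed in
the generated assembly.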
On the Core i7, I see no difference (the two leas and the one write are
swallowed by the sheer power of the Core i7), but on my Galaxy Note II, this
gives a very clear and significant speedup:
0.575 0.710 0.365 0.750 0.390 2014-03-24; Exynos 4 Quad 1.6GHz; gcc-4.8.x (Android 4.3)
0.735 0.920 0.900 1.110 0.690 2012-10-31; Exynos 4 Quad 1.6GHz; gcc-4.6.x (Android 4.1.1)
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/