qemu-ppc

Re: TCG performance on PPC64


From: Richard Henderson
Subject: Re: TCG performance on PPC64
Date: Wed, 18 May 2022 09:09:43 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

On 5/18/22 07:11, Mark Cave-Ayland wrote:
> Finally another comment from Richard about vector instruction use from [2]:
> "As an aside, this does suggest to me that target/ppc might be well served
> in moving the ppc_vsr_t members of CPUPPCState earlier, so that this offset
> is smaller". Presumably this is because calculating smaller offsets can be
> done using fewer instructions? However I suppose this would only have an
> effect on vector-heavy workloads.

Yes, the offsets, quoting from [2],

 ld_vec v128,e8,tmp2,env,$0xd6b0
 st_vec v128,e8,tmp2,env,$0xd4c0

being larger than 0x7fff, require two insns to load.
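
To make the 0x7fff limit concrete, here is a minimal C sketch. It assumes the host encodes a load/store displacement as a signed 16-bit immediate, so any offset outside [-0x8000, 0x7fff] has to be built in a register first. The names fits_simm16 and check_quoted_offsets are hypothetical and are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): true if 'off' can be
 * encoded as a signed 16-bit immediate, i.e. lies in [-0x8000, 0x7fff].
 */
static inline bool fits_simm16(int64_t off)
{
    return off >= -0x8000 && off <= 0x7fff;
}

/* Self-check against the two env offsets quoted from [2]. */
static inline void check_quoted_offsets(void)
{
    assert(!fits_simm16(0xd6b0));  /* ld_vec offset: too large */
    assert(!fits_simm16(0xd4c0));  /* st_vec offset: too large */
    assert(fits_simm16(0x7fff));   /* largest offset that still fits */
}
```

Both quoted offsets fail the check, which is why the offset cannot be materialised with a single immediate-form insn.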

It's not just vectors, but fp, since the register space is shared.
I think just moving the two spr arrays toward the end of CPUArchState
would do that job.
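
As a rough illustration of the reordering, here is a toy layout in C. This is NOT the real CPUPPCState: the struct names, the members big_a and big_b (stand-ins for the two spr arrays), and every array size are made up so that the effect on offsetof() is easy to see.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for ppc_vsr_t: one 128-bit register. */
typedef struct { uint64_t u64[2]; } ToyVsr;

/* Large arrays placed ahead of the vector/fp registers. */
struct ToyEnvBefore {
    uint64_t gpr[32];
    uint64_t big_a[2048];   /* hypothetical size */
    uint64_t big_b[2048];   /* hypothetical size */
    ToyVsr   vsr[64];
};

/* Same members, with the large arrays moved toward the end. */
struct ToyEnvAfter {
    uint64_t gpr[32];
    ToyVsr   vsr[64];
    uint64_t big_a[2048];
    uint64_t big_b[2048];
};

/* Before: the register file starts beyond the signed 16-bit range. */
static_assert(offsetof(struct ToyEnvBefore, vsr) > 0x7fff,
              "vsr starts past 0x7fff in the 'before' layout");

/* After: every vsr element ends at or below 0x8000. */
static_assert(offsetof(struct ToyEnvAfter, vsr) + sizeof(ToyVsr[64]) <= 0x8000,
              "all of vsr is reachable with a 16-bit offset");
```

With these made-up sizes, vsr sits at offset 0x8100 in the first layout and at 0x100 in the second. A similar static_assert in the real structure would be one way to keep the offsets from growing again, though that is only a suggestion.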

But I wouldn't expect it to matter *that* much.


r~


