[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-stable] [PATCH 1/5] tcg/arm: fix TLB access in qemu-ld/st ops
From: |
Peter Maydell |
Subject: |
Re: [Qemu-stable] [PATCH 1/5] tcg/arm: fix TLB access in qemu-ld/st ops |
Date: |
Tue, 9 Oct 2012 22:13:35 +0100 |
On 9 October 2012 21:30, Aurelien Jarno <address@hidden> wrote:
> The TCG arm backend considers likely that the offset to the TLB
> entries does not exceed 12 bits for mem_index = 0. In practice this is
> not true for at list the MIPS target.
>
> The current patch fixes that by loading the bits 23-12 with a separate
> instruction, and using loads with address writeback, independently of
> the value of mem_idx. In total this allow a 24-bit offset, which is a
> lot more than needed.
Would probably be good to assert() that the bits above 24 are zero,
though.
> Cc: Andrzej Zaborowski <address@hidden>
> Cc: Peter Maydell <address@hidden>
> Cc: address@hidden
> Signed-off-by: Aurelien Jarno <address@hidden>
> ---
> tcg/arm/tcg-target.c | 73
> +++++++++++++++++++++++++-------------------------
> 1 file changed, 37 insertions(+), 36 deletions(-)
>
> diff --git a/tcg/arm/tcg-target.c b/tcg/arm/tcg-target.c
> index 737200e..6cde512 100644
> --- a/tcg/arm/tcg-target.c
> +++ b/tcg/arm/tcg-target.c
> @@ -624,6 +624,19 @@ static inline void tcg_out_ld32_12(TCGContext *s, int
> cond,
> (rn << 16) | (rd << 12) | ((-im) & 0xfff));
> }
>
> +/* Offset pre-increment with base writeback. */
> +static inline void tcg_out_ld32_12wb(TCGContext *s, int cond,
> + int rd, int rn, tcg_target_long im)
> +{
Worth asserting that rd != rn ? (that's an UNPREDICTABLE case
for ldr with writeback)
> + if (im >= 0) {
> + tcg_out32(s, (cond << 28) | 0x05b00000 |
> + (rn << 16) | (rd << 12) | (im & 0xfff));
> + } else {
> + tcg_out32(s, (cond << 28) | 0x05300000 |
> + (rn << 16) | (rd << 12) | ((-im) & 0xfff));
> + }
you could avoid the if() by just setting the U bit using "(im >= 0) << 23"
Clearer? Dunno.
Looks OK otherwise (though I haven't tested it.)
Random aside: we emit a pretty long string of COND_EQ predicated
code in these functions, especially in the 64 bit address and
byteswapped cases. It might be interesting to see if performance
on an A9, say, was any better if we just did a conditional branch
over it instead... Probably borderline to no-effect, though.
-- PMM