qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Fix `lxvdsx` (issue #212)


From: Mark Cave-Ayland
Subject: Re: [PATCH] Fix `lxvdsx` (issue #212)
Date: Tue, 18 May 2021 07:53:26 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0

On 18/05/2021 02:34, David Gibson wrote:

I'm having a hard time convincing myself this is correct in all cases.
Have you tested it with all combinations of BE/LE host and BE/LE guest
code?

The description in the ISA is pretty inscrutable, since it's in terms
of the confusing numbering if different element types in BE vs LE
mode.

It looks to me like before bcb0b7b1a1c0 this originally resolved to
MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend
on the current guest endian mode.  That's pretty hard to trace through
the various layers of macros, but for reference, before bcb0b7b1a1c0
this used gen_qemu_ld64_i64(), which appears to be constructed by the
line GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q)) in translate.c.

Richard or Giuseppe, care to weigh in?

---
  target/ppc/translate/vsx-impl.c.inc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/translate/vsx-impl.c.inc 
b/target/ppc/translate/vsx-impl.c.inc
index b817d31260bb..46f97c029ca8 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
      gen_addr_reg_index(ctx, EA);
data = tcg_temp_new_i64();
-    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
+    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
      tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, 
data);
tcg_temp_free(EA);

Right. I think what is happening here is currently the load uses MO_TE (i.e. target endian) which defaults to big endian for PPC and that's why you're seeing the byte swap. The reason this is required for the vector instructions is because the vector registers need to be stored in host byte-order to allow them to make use of the host's vector instructions.

A quick look around the same file at gen_lxvw4x() suggests that you need a solution like this to work correctly with both big and little endian:

    if (ctx->le_mode) {
        tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
    } else {
        tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_BEQ);
    }

The original commit message for bcb0b7b1a1c0 mentions that the implementation is based upon that of the lxvwsx instruction, so I'm fairly sure that gen_lxvwsx() is also broken for little endian and will need a similar fix too.


ATB,

Mark.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]