qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 5/5] tcg/sparc: Support unaligned access for user-only


From: Richard Henderson
Subject: Re: [PATCH v4 5/5] tcg/sparc: Support unaligned access for user-only
Date: Sat, 5 Feb 2022 07:55:54 +1100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 2/5/22 06:07, Peter Maydell wrote:
+    /*
+     * Overalignment: When we're asking for really large alignment,
+     * the actual access is always done above and all we need to do
+     * here is invoke the handler for SIGBUS.
+     */

I thought the access was in an annulled delay slot and so won't
be "done above" ?

Bad wording, I suppose. If the alignment check succeeds, then access is done above. If the alignment check fails, there is no access to be performed, only to report failure.

+    switch ((unsigned)memop) {
+    case MO_BEUW | MO_UNALN:
+    case MO_BESW | MO_UNALN:
+    case MO_BEUL | MO_ALIGN_2:
+    case MO_BESL | MO_ALIGN_2:
+    case MO_BEUQ | MO_ALIGN_4:
+        /* Two loads: shift and combine. */
+        tcg_out_ldst(s, TCG_REG_T2, TCG_REG_T1, 0,
+                        qemu_ld_opc[a_bits | MO_BE | (memop & MO_SIGN)]);
+        tcg_out_ldst(s, data, TCG_REG_T1, 1 << a_bits,
+                        qemu_ld_opc[a_bits | MO_BE]);
+        tcg_out_arithi(s, TCG_REG_T2, TCG_REG_T2, 8 << a_bits, SHIFT_SLLX);

Why are we calculating the offset in memory of the second half of
the data and the amount to shift it by using the alignment-bits
rather than the size-bits ? Because of the cases we know that
here a_bits == s_bits - 1, but I think it would be clearer to
work in terms of the size.

Ok.

+        tcg_out_arithi(s, TCG_REG_T1, TCG_REG_T1, 3, ARITH_ANDN);
+        tcg_out_ldst(s, TCG_REG_T2, TCG_REG_T1, 0, LDUW);

Doesn't this give the wrong fault-address value to the guest
(ie not the address it used for the load, but a rounded-down one)
if we take a SIGSEGV? Or do we fix that up elsewhere?

Oops, no. Perhaps a single byte load to the zero register would fix that, without having to go to full load-by-parts.

+    case MO_BEUQ | MO_ALIGN_2:
+        /*
+         * An extra test to verify alignment 2 is 5 insns, which
+         * is more than we would save by using the slightly smaller
+         * unaligned sequence above.
+         */
+        tcg_out_ldst(s, data, TCG_REG_T1, 0, LDUH);
+        for (int i = 2; i < 8; i += 2) {
+            tcg_out_ldst(s, TCG_REG_T2, TCG_REG_T1, i, LDUW);

Isn't this loading 2 + 3 * 4 == 14 bytes?

Oops.  Got confused with the qemu vs sparc "word" there for a moment.


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]