|
From: | Richard Henderson |
Subject: | Re: [PATCH v4 5/5] tcg/sparc: Support unaligned access for user-only |
Date: | Sat, 5 Feb 2022 07:55:54 +1100 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0 |
On 2/5/22 06:07, Peter Maydell wrote:
+ /* + * Overalignment: When we're asking for really large alignment, + * the actual access is always done above and all we need to do + * here is invoke the handler for SIGBUS. + */I thought the access was in an annulled delay slot and so won't be "done above" ?
Bad wording, I suppose. If the alignment check succeeds, then access is done above. If the alignment check fails, there is no access to be performed, only to report failure.
+ switch ((unsigned)memop) { + case MO_BEUW | MO_UNALN: + case MO_BESW | MO_UNALN: + case MO_BEUL | MO_ALIGN_2: + case MO_BESL | MO_ALIGN_2: + case MO_BEUQ | MO_ALIGN_4: + /* Two loads: shift and combine. */ + tcg_out_ldst(s, TCG_REG_T2, TCG_REG_T1, 0, + qemu_ld_opc[a_bits | MO_BE | (memop & MO_SIGN)]); + tcg_out_ldst(s, data, TCG_REG_T1, 1 << a_bits, + qemu_ld_opc[a_bits | MO_BE]); + tcg_out_arithi(s, TCG_REG_T2, TCG_REG_T2, 8 << a_bits, SHIFT_SLLX);Why are we calculating the offset in memory of the second half of the data and the amount to shift it by using the alignment-bits rather than the size-bits ? Because of the cases we know that here a_bits == s_bits - 1, but I think it would be clearer to work in terms of the size.
Ok.
+ tcg_out_arithi(s, TCG_REG_T1, TCG_REG_T1, 3, ARITH_ANDN); + tcg_out_ldst(s, TCG_REG_T2, TCG_REG_T1, 0, LDUW);Doesn't this give the wrong fault-address value to the guest (ie not the address it used for the load, but a rounded-down one) if we take a SIGSEGV? Or do we fix that up elsewhere?
Oops, no. Perhaps a single byte load to the zero register would fix that, without having to go to full load-by-parts.
+ case MO_BEUQ | MO_ALIGN_2: + /* + * An extra test to verify alignment 2 is 5 insns, which + * is more than we would save by using the slightly smaller + * unaligned sequence above. + */ + tcg_out_ldst(s, data, TCG_REG_T1, 0, LDUH); + for (int i = 2; i < 8; i += 2) { + tcg_out_ldst(s, TCG_REG_T2, TCG_REG_T1, i, LDUW);Isn't this loading 2 + 3 * 4 == 14 bytes?
Oops. Got confused with the qemu vs sparc "word" there for a moment. r~
[Prev in Thread] | Current Thread | [Next in Thread] |