[Qemu-commits] [qemu/qemu] 3a7a2b: target/arm: Use tcg_gen_gvec_bitsel


From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] 3a7a2b: target/arm: Use tcg_gen_gvec_bitsel
Date: Thu, 13 Jun 2019 08:15:49 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 3a7a2b4e5cf0d49cd8b14e8225af0310068b7d20
      
https://github.com/qemu/qemu/commit/3a7a2b4e5cf0d49cd8b14e8225af0310068b7d20
  Author: Richard Henderson <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-a64.c
    M target/arm/translate-a64.h
    M target/arm/translate.c
    M target/arm/translate.h

  Log Message:
  -----------
  target/arm: Use tcg_gen_gvec_bitsel

This replaces 3 target-specific implementations for BIT, BIF, and BSL.
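
As a rough illustration of why one generic bit-select operation can stand in for all three instructions, the sketch below models them on plain 64-bit values. It is not the TCG API: bitsel64() and the model_* names are hypothetical, and the operand orderings follow the architectural descriptions of BSL, BIT and BIF as I understand them.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: take bits of b where sel is 1, bits of c where sel is 0. */
static uint64_t bitsel64(uint64_t sel, uint64_t b, uint64_t c)
{
    return (b & sel) | (c & ~sel);
}

/* BSL: the old destination value acts as the selector. */
static uint64_t model_bsl(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return bitsel64(rd, rn, rm);
}

/* BIT: insert bits of rn into rd where rm is 1. */
static uint64_t model_bit(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return bitsel64(rm, rn, rd);
}

/* BIF: insert bits of rn into rd where rm is 0. */
static uint64_t model_bif(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return bitsel64(rm, rd, rn);
}

/* Usage check: with sel = 0xF0 the high nibble comes from b, the low from c. */
static void bitsel_example(void)
{
    assert(bitsel64(0xF0, 0xAB, 0xCD) == 0xAD);
}
```

The three instructions differ only in which register plays the selector and which two supply the data, which is why a single primitive with permuted arguments is enough.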

Signed-off-by: Richard Henderson <address@hidden>
Reviewed-by: Peter Maydell <address@hidden>
Message-id: address@hidden
Signed-off-by: Peter Maydell <address@hidden>


  Commit: fc1120a7f5f2d4b601003205c598077d3eb11ad2
      
https://github.com/qemu/qemu/commit/fc1120a7f5f2d4b601003205c598077d3eb11ad2
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/helper.c

  Log Message:
  -----------
  target/arm: Implement NSACR gating of floating point

The NSACR register allows secure code to configure the FPU
to be inaccessible to non-secure code. If the NSACR.CP10
bit is clear then:
 * NS accesses to the FPU trap as UNDEF (ie to NS EL1 or EL2)
 * CPACR.{CP10,CP11} behave as if RAZ/WI
 * HCPTR.{TCP11,TCP10} behave as if RAO/WI

Note that we do not implement the NSACR.NSASEDIS bit which
gates only access to Advanced SIMD, in the same way that
we don't implement the equivalent CPACR.ASEDIS and HCPTR.TASE.

Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: Peter Maydell <address@hidden>
Message-id: address@hidden


  Commit: 97fb318d37be4d21125e89c96e4e92ea33beac51
      
https://github.com/qemu/qemu/commit/97fb318d37be4d21125e89c96e4e92ea33beac51
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M hw/arm/smmuv3.c

  Log Message:
  -----------
  hw/arm/smmuv3: Fix decoding of ID register range

The SMMUv3 ID registers cover an area 0x30 bytes in size
(12 registers, 4 bytes each). We were incorrectly decoding
only the first 0x20 bytes.
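
The arithmetic behind the fix can be sketched as a simple range check. The function name and the base offset passed to it are hypothetical; only the register count and size come from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ID_REG_COUNT   12u                            /* from the log message */
#define ID_REG_SIZE    4u                             /* bytes per register */
#define ID_REGION_SIZE (ID_REG_COUNT * ID_REG_SIZE)   /* 0x30 bytes */

/* True if offset falls inside the ID register block starting at base. */
static bool offset_is_id_reg(uint32_t base, uint32_t offset)
{
    return offset >= base && offset - base < ID_REGION_SIZE;
}

/* Usage check: the region size works out to 0x30, not 0x20. */
static void id_region_example(void)
{
    assert(ID_REGION_SIZE == 0x30u);
}
```

With a region size of 0x20 the last four registers (offsets 0x20 to 0x2f from the base) would fall outside the check.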

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Eric Auger <address@hidden>
Message-id: address@hidden


  Commit: be1ba4d56eba5666ee03b40e286d7315862ab188
      
https://github.com/qemu/qemu/commit/be1ba4d56eba5666ee03b40e286d7315862ab188
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M hw/core/bus.c

  Log Message:
  -----------
  hw/core/bus.c: Only the main system bus can have no parent

In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
some qbus buses not being connected to qdev devices -- if the
bus has no parent object then we register a reset function which
resets the bus on system reset (and unregister it when the
bus is unparented).

Nearly a decade later, we now have no buses in the tree which
are created with NULL parents other than the main system bus,
so we can remove the workaround and instead just assert that
if the bus has a NULL parent then it is the main system bus.

(The absence of other parentless buses was confirmed by
code inspection of all the callsites of qbus_create() and
qbus_create_inplace() and cross-checked by 'make check'.)

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Markus Armbruster <address@hidden>
Reviewed-by: Philippe Mathieu-Daudé <address@hidden>
Reviewed-by: Damien Hedde <address@hidden>
Tested-by: Philippe Mathieu-Daudé <address@hidden>
Message-id: address@hidden


  Commit: d67ebada159148bfdfde84871338738e4465e985
      
https://github.com/qemu/qemu/commit/d67ebada159148bfdfde84871338738e4465e985
  Author: Richard Henderson <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/pauth_helper.c
    M tests/tcg/aarch64/Makefile.target
    A tests/tcg/aarch64/pauth-2.c

  Log Message:
  -----------
  target/arm: Fix output of PAuth Auth

The ARM pseudocode installs the error_code into the original
pointer, not the encrypted pointer.  The difference applies
within the 7 bits of pac data; the result should be the sign
extension of bit 55.
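
A minimal sketch of "sign extension of bit 55" over the PAC field, on a plain 64-bit value. It covers only the reconstruction of the original pointer, not the insertion of the error code. The function name and the bot/top parameters are hypothetical; in practice the field position depends on the translation regime.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replace bits [bot, top) of ptr with copies of bit 55.
 * bot and top are illustrative parameters describing where the
 * PAC field sits in the pointer.
 */
static uint64_t model_original_ptr(uint64_t ptr, unsigned bot, unsigned top)
{
    assert(bot < top && top <= 64);
    unsigned width = top - bot;
    uint64_t field = (width == 64) ? UINT64_MAX
                                   : ((UINT64_C(1) << width) - 1);
    uint64_t mask = field << bot;
    uint64_t ext = ((ptr >> 55) & 1) ? UINT64_MAX : 0;

    return (ptr & ~mask) | (ext & mask);
}
```

For example, assuming a field covering bits 48 to 63, a pointer with bit 55 clear has those bits zeroed and a pointer with bit 55 set has them all set.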

Add a testcase to that effect.

Signed-off-by: Richard Henderson <address@hidden>
Reviewed-by: Peter Maydell <address@hidden>
Signed-off-by: Peter Maydell <address@hidden>


  Commit: 2c7d442743854d2c1f5475446e088bd523f4bb20
      
https://github.com/qemu/qemu/commit/2c7d442743854d2c1f5475446e088bd523f4bb20
  Author: Richard Henderson <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M scripts/decodetree.py

  Log Message:
  -----------
  decodetree: Fix comparison of Field

Typo comparing the sign of the field, twice, instead of also comparing
the mask of the field (which itself encodes both position and length).
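
The class of bug described here can be sketched in C, although decodetree.py itself is Python. The struct and function names are hypothetical and only mirror the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a decodetree field. */
struct field {
    bool sign;       /* is the field sign-extended? */
    uint32_t mask;   /* encodes both position and length */
};

/* The typo: the sign is compared twice and the mask never. */
static bool field_eq_typo(const struct field *a, const struct field *b)
{
    return a->sign == b->sign && a->sign == b->sign;
}

/* The intended comparison: sign and mask must both match. */
static bool field_eq(const struct field *a, const struct field *b)
{
    return a->sign == b->sign && a->mask == b->mask;
}

/* Usage check: two unsigned fields at different positions are not equal. */
static void field_eq_example(void)
{
    struct field lo = { false, 0x0000000Fu };
    struct field hi = { false, 0x000000F0u };

    assert(field_eq_typo(&lo, &hi));   /* wrongly reported as equal */
    assert(!field_eq(&lo, &hi));
}
```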

Reported-by: Peter Maydell <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
Message-id: address@hidden
Reviewed-by: Peter Maydell <address@hidden>
Reviewed-by: Philippe Mathieu-Daudé <address@hidden>
Signed-off-by: Peter Maydell <address@hidden>


  Commit: 78e138bc1f672c145ef6ace74617db00eebaa2ba
      
https://github.com/qemu/qemu/commit/78e138bc1f672c145ef6ace74617db00eebaa2ba
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/Makefile.objs
    A target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    A target/arm/vfp-uncond.decode
    A target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Add stubs for AArch32 VFP decodetree

Add the infrastructure for building and invoking a decodetree decoder
for the AArch32 VFP encodings.  At the moment the new decoder covers
nothing, so we always fall back to the existing hand-written decode.

We need to have one decoder for the unconditional insns and one for
the conditional insns, as otherwise the patterns for conditional
insns would incorrectly match against the unconditional ones too.
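
The split can be pictured as a dispatch on the A32 condition field (bits [31:28]), where the value 0xF marks the unconditional encoding space. This is a sketch only: the enum and function names are hypothetical, not the names of the generated decoders.

```c
#include <assert.h>
#include <stdint.h>

enum vfp_decoder {
    VFP_DECODER_UNCOND,   /* patterns for the cond == 0xF space */
    VFP_DECODER_COND      /* patterns where cond is an ordinary field */
};

/* Choose which decoder should see this A32 instruction word. */
static enum vfp_decoder pick_vfp_decoder(uint32_t insn)
{
    return (insn >> 28) == 0xFu ? VFP_DECODER_UNCOND : VFP_DECODER_COND;
}

/* Usage check: only a cond field of 0xF selects the unconditional decoder. */
static void pick_vfp_decoder_example(void)
{
    assert(pick_vfp_decoder(0xF0000000u) == VFP_DECODER_UNCOND);
    assert(pick_vfp_decoder(0xE0000000u) == VFP_DECODER_COND);
}
```

Because the conditional patterns treat the top four bits as a free field, routing 0xF words away from them is what prevents the false matches.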

Since translate.c is over 14,000 lines long and we're going to be
touching pretty much every line of the VFP code as part of the
decodetree conversion, we create a new translate-vfp.inc.c to hold
the code which deals with VFP in the new scheme.  It should be
possible to convert this into a standalone translation unit
eventually, but the conversion process will be much simpler if we
simply #include it midway through translate.c to start with.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 06db8196bba34776829020192ed623a0b22e6557
      
https://github.com/qemu/qemu/commit/06db8196bba34776829020192ed623a0b22e6557
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c

  Log Message:
  -----------
  target/arm: Factor out VFP access checking code

Factor out the VFP access checking code so that we can use it in the
leaf functions of the decodetree decoder.

We call the function full_vfp_access_check() so we can keep
the more natural vfp_access_check() for a version which doesn't
have the 'ignore_vfp_enabled' flag -- that way almost all VFP
insns will be able to use vfp_access_check(s) and only the
special-register access function will have to use
full_vfp_access_check(s, ignore_vfp_enabled).
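
The shape of the two entry points might look like the sketch below. The two function names come from the message above; the context struct and its fields are hypothetical simplifications, and the real check covers more conditions than shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-reduced translation context. */
struct vfp_ctx {
    bool fp_trapped;    /* FP access would trap to some exception level */
    bool vfp_enabled;   /* the VFP enable bit is set */
};

/* Full check; the flag lets special-register accesses skip the enable test. */
static bool full_vfp_access_check(const struct vfp_ctx *s,
                                  bool ignore_vfp_enabled)
{
    if (s->fp_trapped) {
        return false;
    }
    if (!s->vfp_enabled && !ignore_vfp_enabled) {
        return false;
    }
    return true;
}

/* Common case used by almost all VFP insns. */
static bool vfp_access_check(const struct vfp_ctx *s)
{
    return full_vfp_access_check(s, false);
}

/* Usage check: a disabled VFP blocks ordinary insns but not the full variant. */
static void vfp_access_example(void)
{
    struct vfp_ctx s = { .fp_trapped = false, .vfp_enabled = false };

    assert(!vfp_access_check(&s));
    assert(full_vfp_access_check(&s, true));
}
```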

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 3de79d335c9aa7d726865e3933d9b21781032183
      
https://github.com/qemu/qemu/commit/3de79d335c9aa7d726865e3933d9b21781032183
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/cpu.c

  Log Message:
  -----------
  target/arm: Fix Cortex-R5F MVFR values

The Cortex-R5F initfn was not correctly setting up the MVFR
ID register values. Fill these in, since some subsequent patches
will use ID register checks rather than CPU feature bit checks.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 973751fd798d41402d34f9f705c0c6d1633d0cda
      
https://github.com/qemu/qemu/commit/973751fd798d41402d34f9f705c0c6d1633d0cda
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/cpu.c

  Log Message:
  -----------
  target/arm: Explicitly enable VFP short-vectors for aarch32 -cpu max

At the moment our -cpu max for AArch32 supports VFP short-vectors
because we always implement them, even for CPUs which should
not have them. The following commits are going to switch to
using the correct ID-register-check to enable or disable short
vector support, so we need to turn it on explicitly for -cpu max,
because Cortex-A15 doesn't implement it.

We don't enable this for the AArch64 -cpu max, because the v8A
architecture never supports short-vectors.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: b3ff4b87b4ae08120a51fe12592725e1dca8a085
      
https://github.com/qemu/qemu/commit/b3ff4b87b4ae08120a51fe12592725e1dca8a085
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/cpu.h
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp-uncond.decode

  Log Message:
  -----------
  target/arm: Convert the VSEL instructions to decodetree

Convert the VSEL instructions to decodetree.
We leave trans_VSEL() in translate.c for now as this allows
the patch to show just the changes from the old handle_vsel().

In the old code the check for "do D16-D31 exist" was hidden in
the VFP_DREG macro, and assumed that VFPv3 always implied that
D16-D31 exist. In the new code we do the correct ID register test.
This gives identical behaviour for most of our CPUs, and fixes
previously incorrect handling for Cortex-R5F, Cortex-M4 and
Cortex-M33, which all implement VFPv3 or better with only 16
double-precision registers.
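
A sketch of what an ID-register-based test for D16-D31 could look like. It assumes that the SIMDReg field occupies MVFR0 bits [3:0] and that a value of 2 or more means 32 double-precision registers; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: MVFR0.SIMDReg in bits [3:0], >= 2 means D0-D31 exist. */
static bool model_has_d32(uint32_t mvfr0)
{
    return (mvfr0 & 0xFu) >= 2;
}

/* An insn naming any of D16-D31 is only valid if those registers exist. */
static bool model_dregs_ok(uint32_t mvfr0, unsigned vd, unsigned vn,
                           unsigned vm)
{
    return model_has_d32(mvfr0) || ((vd | vn | vm) & 0x10u) == 0;
}

/* Usage check: a 16-register CPU rejects an operand in D16. */
static void model_dregs_example(void)
{
    assert(!model_dregs_ok(0x1u, 0, 1, 16));
    assert(model_dregs_ok(0x2u, 0, 1, 16));
}
```

The point of the change is that the answer comes from the ID register value rather than from the VFP version implemented.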

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: f65988a1efdb42f9058db44297591491842e697c
      
https://github.com/qemu/qemu/commit/f65988a1efdb42f9058db44297591491842e697c
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate.c
    M target/arm/vfp-uncond.decode

  Log Message:
  -----------
  target/arm: Convert VMINNM, VMAXNM to decodetree

Convert the VMINNM and VMAXNM instructions to decodetree.
As with VSEL, we leave the trans_VMINMAXNM() function
in translate.c for the moment.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: e3bb599d16e4678b228d80194cee328f894b1ceb
      
https://github.com/qemu/qemu/commit/e3bb599d16e4678b228d80194cee328f894b1ceb
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate.c
    M target/arm/vfp-uncond.decode

  Log Message:
  -----------
  target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM to decodetree

Convert the VRINTA/VRINTN/VRINTP/VRINTM instructions to decodetree.
Again, trans_VRINT() is temporarily left in translate.c.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: c2a46a914cd5c38fd0ee57ff0befc1c5bde27bcf
      
https://github.com/qemu/qemu/commit/c2a46a914cd5c38fd0ee57ff0befc1c5bde27bcf
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate.c
    M target/arm/vfp-uncond.decode

  Log Message:
  -----------
  target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree

Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree.
trans_VCVT() is temporarily left in translate.c.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: f7bbb8f31f0761edbf0c64b7ab3c3f49c13612ea
      
https://github.com/qemu/qemu/commit/f7bbb8f31f0761edbf0c64b7ab3c3f49c13612ea
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c

  Log Message:
  -----------
  target/arm: Move the VFP trans_* functions to translate-vfp.inc.c

Move the trans_*() functions we've just created from translate.c
to translate-vfp.inc.c. This is pure code motion with no textual
changes (this can be checked with 'git show --color-moved').

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 160f3b64c5cc4c8a09a1859edc764882ce6ad6bf
      
https://github.com/qemu/qemu/commit/160f3b64c5cc4c8a09a1859edc764882ce6ad6bf
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c

  Log Message:
  -----------
  target/arm: Add helpers for VFP register loads and stores

The current VFP code has two different idioms for
loading and storing from the VFP register file:
 1 using the gen_mov_F0_vreg() and similar functions,
   which load and store to a fixed set of TCG globals
   cpu_F0s, cpu_F0d, etc
 2 by direct calls to tcg_gen_ld_f64() and friends

We want to phase out idiom 1 (because the use of the
fixed globals is a relic of a much older version of TCG),
but idiom 2 is quite longwinded:
 tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg))
requires us to specify the 64-bitness twice, once in
the function name and once by passing 'true' to
vfp_reg_offset(). There's no guard against accidentally
passing the wrong flag.

Instead, let's move to a convention of accessing 64-bit
registers via the existing neon_load_reg64() and
neon_store_reg64(), and provide new neon_load_reg32()
and neon_store_reg32() for the 32-bit equivalents.

Implement the new functions and use them in the code in
translate-vfp.inc.c. We will convert the rest of the VFP
code as we do the decodetree conversion in subsequent
commits.
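
The benefit of the convention can be sketched on a plain C model of the register file, where the access width is carried by the function name alone. The model_* names and the struct are hypothetical; the sketch assumes the usual overlay in which S(2n) is the low half of D(n) and S(2n+1) the high half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 32 double registers, singles overlaid on D0-D15. */
typedef struct {
    uint64_t d[32];
} VfpRegFile;

static uint64_t model_load_reg64(const VfpRegFile *rf, unsigned reg)
{
    assert(reg < 32);
    return rf->d[reg];
}

static void model_store_reg64(VfpRegFile *rf, unsigned reg, uint64_t val)
{
    assert(reg < 32);
    rf->d[reg] = val;
}

static uint32_t model_load_reg32(const VfpRegFile *rf, unsigned reg)
{
    assert(reg < 32);
    uint64_t d = rf->d[reg >> 1];
    return (reg & 1) ? (uint32_t)(d >> 32) : (uint32_t)d;
}

static void model_store_reg32(VfpRegFile *rf, unsigned reg, uint32_t val)
{
    assert(reg < 32);
    unsigned shift = (reg & 1) ? 32 : 0;
    uint64_t *d = &rf->d[reg >> 1];

    /* Clear the selected half, then merge in the new value. */
    *d = (*d & ~((uint64_t)0xFFFFFFFFu << shift)) | ((uint64_t)val << shift);
}
```

With this shape there is no separate boolean that could disagree with the width in the function name.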

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 9851ed9269d214c0c6feba960dd14ff09e6c34b4
      
https://github.com/qemu/qemu/commit/9851ed9269d214c0c6feba960dd14ff09e6c34b4
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert "double-precision" register moves to decodetree

Convert the "double-precision" register moves to decodetree:
this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP.

Note that the conversion process has tightened up a few of the
UNDEF encoding checks: we now correctly forbid:
 * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10
 * VMOV-from-gpr with opc1:opc2 == 0x10
 * VDUP with B:E == 11
 * VDUP with Q == 1 and Vn<0> == 1

Signed-off-by: Peter Maydell <address@hidden>
---
The accesses of elements < 32 bits could be improved by doing
direct ld/st of the right size rather than 32-bit read-and-shift
or read-modify-write, but we leave this for later cleanup,
since this series is generally trying to stick to fixing
the decode.
Reviewed-by: Richard Henderson <address@hidden>


  Commit: a9ab50011aeda2dd012da99069e078379315ea18
      
https://github.com/qemu/qemu/commit/a9ab50011aeda2dd012da99069e078379315ea18
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert "single-precision" register moves to decodetree

Convert the "single-precision" register moves to decodetree:
 * VMSR
 * VMRS
 * VMOV between general purpose register and single precision

Note that the VMSR/VMRS conversions make our handling of
the "should this UNDEF?" checks consistent between the two
instructions:
 * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0
   (previously was a nop)
 * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better
   (previously was a nop)
 * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better
   (previously would write to the register, which had no
   guest-visible effect because we always UNDEF reads)

We also tighten up the decode: we were previously underdecoding
some SBZ or SBO bits.

The conversion of VMOV_single includes the expansion out of the
gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr()
sequences into the simpler direct load/store of the TCG temp via
neon_{load,store}_reg32(): we know in the new function that we're
always single-precision, we don't need to use the old-and-deprecated
cpu_F0* TCG globals, and we don't happen to have the declaration of
gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the
new function is.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 81f681106eabe21c55118a5a41999fb7387fb714
      
https://github.com/qemu/qemu/commit/81f681106eabe21c55118a5a41999fb7387fb714
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP two-register transfer insns to decodetree

Convert the VFP two-register transfer instructions to decodetree
(in the v8 Arm ARM these are the "Advanced SIMD and floating-point
64-bit move" encoding group).

Again, we expand out the sequences involving gen_vfp_msr() and
gen_vfp_mrs().

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 79b02a3b5231c5b8cd31e50cd549968dd0a05c49
      
https://github.com/qemu/qemu/commit/79b02a3b5231c5b8cd31e50cd549968dd0a05c49
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP VLDR and VSTR to decodetree

Convert the VFP single load/store insns VLDR and VSTR to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: fa288de272c5c8a66d5eb683b123706a52bc7ad6
      
https://github.com/qemu/qemu/commit/fa288de272c5c8a66d5eb683b123706a52bc7ad6
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert the VFP load/store multiple insns to decodetree

Convert the VFP load/store multiple insns to decodetree.
This includes tightening up the UNDEF checking for pre-VFPv3
CPUs which only have D0-D15: they now UNDEF for any access
to D16-D31, not merely when the smallest register in the
transfer list is in D16-D31.
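
The difference between the two checks can be sketched as follows, for a transfer list of n consecutive double registers starting at vd. The function names and the has_d32 flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Looser check: only looks at the first register of the list. */
static bool model_list_ok_first_only(bool has_d32, unsigned vd)
{
    return has_d32 || vd <= 15;
}

/* Tighter check: the whole list D[vd] .. D[vd + n - 1] must fit in D0-D15. */
static bool model_list_ok_whole(bool has_d32, unsigned vd, unsigned n)
{
    return has_d32 || vd + n <= 16;
}

/* Usage check: a list D14-D17 passes the loose check but not the tight one. */
static void model_list_example(void)
{
    assert(model_list_ok_first_only(false, 14));
    assert(!model_list_ok_whole(false, 14, 4));
}
```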

This conversion does not try to share code between the single
precision and the double precision versions; this looks a bit
duplicative of code, but it leaves the door open for a future
refactoring which gets rid of the use of the "F0" registers
by inlining the various functions like gen_vfp_ld() and
gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }"
conditionalisation.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 3993d0407dff7233e42f2251db971e126a0497e9
      
https://github.com/qemu/qemu/commit/3993d0407dff7233e42f2251db971e126a0497e9
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c

  Log Message:
  -----------
  target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d

Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans
functions which perform the memory accesses by going via the TCG
globals cpu_F0s and cpu_F0d, to use local TCG temps instead.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 266bd25c485597c94209bfdb3891c1d0c573c164
      
https://github.com/qemu/qemu/commit/266bd25c485597c94209bfdb3891c1d0c573c164
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/cpu.h
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP VMLA to decodetree

Convert the VFP VMLA instruction to decodetree.

This is the first of the VFP 3-operand data processing instructions,
so we include in this patch the code which loops over the elements
for an old-style VFP vector operation. The existing code to do this
looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since
we are going to be converting instructions one at a time anyway
we can take the opportunity to make the new loop use TCG temporaries,
which means we can do that conversion one operation at a time
rather than needing to do it all in one go.

We include an UNDEF check which was missing in the old code:
short-vector operations (with stride or length non-zero) were
deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec
field does not indicate that support for short vectors is present
we UNDEF the operations that would use them. (This is a change
of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which
previously were all incorrectly allowing short-vector operations.)
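
The added check can be sketched like this. It assumes the FPShVec field occupies MVFR0 bits [27:24], with a non-zero value indicating short-vector support; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: MVFR0.FPShVec in bits [27:24], non-zero means supported. */
static bool model_has_short_vectors(uint32_t mvfr0)
{
    return ((mvfr0 >> 24) & 0xFu) != 0;
}

/* Returns false where the operation should UNDEF. */
static bool model_short_vector_allowed(uint32_t mvfr0, unsigned vec_len,
                                       unsigned vec_stride)
{
    if (vec_len == 0 && vec_stride == 0) {
        return true;    /* plain scalar operation, always fine */
    }
    return model_has_short_vectors(mvfr0);
}

/* Usage check: non-zero length without FPShVec support is rejected. */
static void model_short_vector_example(void)
{
    assert(!model_short_vector_allowed(0, 1, 0));
    assert(model_short_vector_allowed(UINT32_C(1) << 24, 1, 0));
}
```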

Note that the conversion fixes a bug in the old code for the
case of VFP short-vector "mixed scalar/vector operations". These
happen where the destination register is in a vector bank
but the second operand is in a scalar bank. For example
  vmla.f64 d10, d1, d16   with length 2 stride 2
is equivalent to the pair of scalar operations
  vmla.f64 d10, d1, d16
  vmla.f64 d8, d3, d16
where the destination and first input register cycle through
their vector but the second input is scalar (d16). In the
old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d}
as a temporary output for the multiply, which trashes the
second input operand. For the fully-scalar case (where we
never do a second iteration) and the fully-vector case
(where the loop loads the new second input operand) this
doesn't matter, but for the mixed scalar/vector case we
will end up using the wrong value for later loop iterations.
In the new code we use TCG temporaries and so avoid the bug.
This bug is present for all the multiply-accumulate insns
that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS.
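
The failure mode can be reproduced in a small host-side model. Here F0 and F1 stand in for the shared globals, the register file is an array of doubles, and next_in_bank() is a hypothetical helper that wraps inside a 4-register bank so that it matches the example above; none of this is the actual translator code.

```c
#include <assert.h>

/* Next register in a 4-register double-precision bank, wrapping inside it. */
static int next_in_bank(int reg, int stride)
{
    assert(reg >= 0 && stride >= 0);
    return (reg & ~3) | ((reg + stride) & 3);
}

/*
 * Old scheme: the multiply writes its result to F1, so the scalar second
 * operand is lost after the first iteration.
 */
static void vmla_mixed_shared_globals(double *d, int vd, int vn, int vm,
                                      int len, int stride)
{
    double F0, F1;

    F1 = d[vm];                  /* scalar operand, loaded once */
    for (int i = 0; i < len; i++) {
        F0 = d[vn];
        F1 = F0 * F1;            /* product overwrites the operand */
        F0 = d[vd];
        d[vd] = F0 + F1;
        vd = next_in_bank(vd, stride);
        vn = next_in_bank(vn, stride);
    }
}

/* New scheme: each value lives in its own temporary. */
static void vmla_mixed_temps(double *d, int vd, int vn, int vm,
                             int len, int stride)
{
    const double scalar = d[vm];

    for (int i = 0; i < len; i++) {
        double prod = d[vn] * scalar;
        d[vd] = d[vd] + prod;
        vd = next_in_bank(vd, stride);
        vn = next_in_bank(vn, stride);
    }
}
```

Running both on the d10, d1, d16 example with two iterations gives the same result for d10 but different results for d8, because the second iteration of the shared-globals version multiplies d3 by the previous product instead of by d16.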

Note 2: the expression used to calculate the next register
number in the vector bank is not in fact correct; we leave
this behaviour unchanged from the old decoder and will
fix this bug later in the series.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: e7258280d46af4ab6a0cc93ccfe8f6614defb4b7
      
https://github.com/qemu/qemu/commit/e7258280d46af4ab6a0cc93ccfe8f6614defb4b7
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP VMLS to decodetree

Convert the VFP VMLS instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: c54a416cc6d60efbc79dd37aaf0c8918c05b5815
      
https://github.com/qemu/qemu/commit/c54a416cc6d60efbc79dd37aaf0c8918c05b5815
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP VNMLS to decodetree

Convert the VFP VNMLS instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 8a483533adc1bdc2decb8f456dbe930a2d245a8b
      
https://github.com/qemu/qemu/commit/8a483533adc1bdc2decb8f456dbe930a2d245a8b
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP VNMLA to decodetree

Convert the VFP VNMLA instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 88c5188ced60e9f2b8cc3af3b9bc4a8031c8c996
      
https://github.com/qemu/qemu/commit/88c5188ced60e9f2b8cc3af3b9bc4a8031c8c996
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VMUL to decodetree

Convert the VMUL instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 43c4be1236c105090d134540da1036073d157cd4
      
https://github.com/qemu/qemu/commit/43c4be1236c105090d134540da1036073d157cd4
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VNMUL to decodetree

Convert the VNMUL instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: ce28b303716e7eca3f3765bf6776d722ebbe1122
      
https://github.com/qemu/qemu/commit/ce28b303716e7eca3f3765bf6776d722ebbe1122
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VADD to decodetree

Convert the VADD instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 8fec9a119264b7936503abce3c106fad7e3ccb76
      
https://github.com/qemu/qemu/commit/8fec9a119264b7936503abce3c106fad7e3ccb76
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VSUB to decodetree

Convert the VSUB instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 519ee7ae31e050eb0ff9ad35c213f0bd7ab1c03e
      
https://github.com/qemu/qemu/commit/519ee7ae31e050eb0ff9ad35c213f0bd7ab1c03e
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VDIV to decodetree

Convert the VDIV instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: d4893b01d23060845ee3855bc96626e16aad9ab5
      
https://github.com/qemu/qemu/commit/d4893b01d23060845ee3855bc96626e16aad9ab5
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP fused multiply-add insns to decodetree

Convert the VFP fused multiply-add instructions (VFNMA, VFNMS,
VFMA, VFMS) to decodetree.

Note that in the old decode structure we were implementing
these to honour the VFP vector stride/length. These instructions
were introduced in VFPv4, and in the v7A architecture they
are UNPREDICTABLE if the vector stride or length are non-zero.
In v8A they must UNDEF if stride or length are non-zero, like
all VFP instructions; we choose to UNDEF always.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: b518c753f0b94e14e01e97b4ec42c100dafc0cc2
      
https://github.com/qemu/qemu/commit/b518c753f0b94e14e01e97b4ec42c100dafc0cc2
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VMOV (imm) to decodetree

Convert the VFP VMOV (immediate) instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 90287e22c987e9840704345ed33d237cbe759dd9
      
https://github.com/qemu/qemu/commit/90287e22c987e9840704345ed33d237cbe759dd9
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VABS to decodetree

Convert the VFP VABS instruction to decodetree.

Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or
VFPGen2OpDPFn because none of the operations which use this format
and support short vectors will need it.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 1882651afdb0ca44f0631192fbe65a71c660d809
      
https://github.com/qemu/qemu/commit/1882651afdb0ca44f0631192fbe65a71c660d809
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VNEG to decodetree

Convert the VNEG instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: b8474540cbce4e2fa45010416375d1bcbe86dc15
      
https://github.com/qemu/qemu/commit/b8474540cbce4e2fa45010416375d1bcbe86dc15
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VSQRT to decodetree

Convert the VSQRT instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 17552b979ebb9848a534c25ebed18a1072710058
      
https://github.com/qemu/qemu/commit/17552b979ebb9848a534c25ebed18a1072710058
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VMOV (register) to decodetree

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 386bba2368842fc74388a3c1651c6c0c0c70adbd
      
https://github.com/qemu/qemu/commit/386bba2368842fc74388a3c1651c6c0c0c70adbd
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP comparison insns to decodetree

Convert the VFP comparison instructions to decodetree.

Note that comparison instructions should not honour the VFP
short-vector length and stride information: they are scalar-only
operations.  This applies to all the 2-operand instructions except
for VMOV, VABS, VNEG and VSQRT.  (In the old decoder this is
implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.)
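
As a rough illustration of the check quoted above, here is a minimal C
sketch. It assumes that for op == 15 the rn field acts as an extension
opcode whose values 0 to 3 select VMOV, VABS, VNEG and VSQRT, which is
what the quoted condition implies. The enum and the function name
effective_veclen are hypothetical and not taken from the QEMU source.

```c
#include <assert.h>

/* Assumed extension opcodes for op == 15 (illustrative names only). */
enum { EXT_VMOV = 0, EXT_VABS = 1, EXT_VNEG = 2, EXT_VSQRT = 3 };

/*
 * Return the vector length an instruction should actually use.
 * Every 2-operand extension op other than VMOV/VABS/VNEG/VSQRT is
 * scalar-only, so the short-vector length is forced to zero.
 */
static int effective_veclen(int op, int rn, int veclen)
{
    assert(veclen >= 0);
    if (op == 15 && rn > EXT_VSQRT) {
        return 0;
    }
    return veclen;
}
```

With a configured length of 3, effective_veclen(15, EXT_VNEG, 3) keeps
the length at 3, while a comparison or conversion in the same space
(rn > 3) gets 0.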

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: b623d803dda805f07aadcbf098961fde27315c19
      
https://github.com/qemu/qemu/commit/b623d803dda805f07aadcbf098961fde27315c19
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert the VCVT-from-f16 insns to decodetree

Convert the VCVTT and VCVTB instructions that deal with conversion
from half-precision floats to f32 or f64 to decodetree.

Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d, we can perform a direct 16-bit
load of the appropriate half of the input single-precision
register rather than loading the full 32 bits and then doing a
separate shift or sign-extension.
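
The difference between the two approaches can be sketched in plain C,
outside of TCG. This is only an illustration under the assumption that
the register lives in host memory as a uint32_t; the helper names
(host_is_little_endian, half_offset, load_half_shift, load_half_direct)
are hypothetical and do not come from the QEMU source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time. */
static bool host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Byte offset of the top or bottom 16 bits inside a host uint32_t. */
static size_t half_offset(bool top)
{
    return (top == host_is_little_endian()) ? 2 : 0;
}

/* Old style: load all 32 bits, then shift or mask. */
static uint16_t load_half_shift(const uint32_t *reg, bool top)
{
    uint32_t full = *reg;
    return top ? (uint16_t)(full >> 16) : (uint16_t)(full & 0xffff);
}

/* New style: load just the wanted 16 bits directly. */
static uint16_t load_half_direct(const uint32_t *reg, bool top)
{
    uint16_t half;
    memcpy(&half, (const unsigned char *)reg + half_offset(top),
           sizeof(half));
    return half;
}
```

For a register holding 0x3c004200, both functions return 0x3c00 for
the top half and 0x4200 for the bottom half.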

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: cdfd14e86ab0b1ca29a702d13a8e4af2e902a9bf
      
https://github.com/qemu/qemu/commit/cdfd14e86ab0b1ca29a702d13a8e4af2e902a9bf
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert the VCVT-to-f16 insns to decodetree

Convert the VCVTT and VCVTB instructions which convert from
f32 and f64 to f16 to decodetree.

Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d, we can perform a direct 16-bit
store to the appropriate half of the destination single-precision
register rather than doing a load/modify/store sequence on the
full 32 bits.
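
A plain-C sketch of the two store strategies, again assuming the
register is a uint32_t in host memory. The helper names are
hypothetical, and each function returns the updated register value so
the results are easy to compare.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time. */
static bool host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Byte offset of the top or bottom 16 bits inside a host uint32_t. */
static size_t half_offset(bool top)
{
    return (top == host_is_little_endian()) ? 2 : 0;
}

/* Old style: load 32 bits, merge in the new half, store 32 bits. */
static uint32_t store_half_rmw(uint32_t reg, bool top, uint16_t val)
{
    if (top) {
        return (reg & 0x0000ffffu) | ((uint32_t)val << 16);
    }
    return (reg & 0xffff0000u) | val;
}

/* New style: write only the 16 bits that change. */
static uint32_t store_half_direct(uint32_t reg, bool top, uint16_t val)
{
    memcpy((unsigned char *)&reg + half_offset(top), &val, sizeof(val));
    return reg;
}
```

Starting from 0x11112222 and storing 0xabcd, both variants give
0xabcd2222 for the top half and 0x1111abcd for the bottom half.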

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: e25155f55dc4abb427a88dfe58bbbc550fe7d643
      
https://github.com/qemu/qemu/commit/e25155f55dc4abb427a88dfe58bbbc550fe7d643
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VFP round insns to decodetree

Convert the VFP round-to-integer instructions VRINTR, VRINTZ and
VRINTX to decodetree.

These instructions were only introduced as part of the "VFP misc"
additions in v8A, so we now check for that feature. The old decoder's
implementation was incorrectly providing them even for v7A CPUs.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 6ed7e49c3693ed8411773c4880f42b2932beb12d
      
https://github.com/qemu/qemu/commit/6ed7e49c3693ed8411773c4880f42b2932beb12d
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert double-single precision conversion insns to decodetree

Convert the VCVT double/single precision conversion insns to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 8fc9d8918cde342c71923e361b9f2193e36ed18b
      
https://github.com/qemu/qemu/commit/8fc9d8918cde342c71923e361b9f2193e36ed18b
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert integer-to-float insns to decodetree

Convert the VCVT integer-to-float instructions to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 92073e947487e2109f3dfebfeaa48d6323cbd981
      
https://github.com/qemu/qemu/commit/92073e947487e2109f3dfebfeaa48d6323cbd981
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VJCVT to decodetree

Convert the VJCVT instruction to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: e3d6f4290c788e850c64815f0b3e331600a4bcc0
      
https://github.com/qemu/qemu/commit/e3d6f4290c788e850c64815f0b3e331600a4bcc0
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree

Convert the VCVT (between floating-point and fixed-point) instructions
to decodetree.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 3111bfc2da6ba0c8396dc97ca479942d711c6146
      
https://github.com/qemu/qemu/commit/3111bfc2da6ba0c8396dc97ca479942d711c6146
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/vfp.decode

  Log Message:
  -----------
  target/arm: Convert float-to-integer VCVT insns to decodetree

Convert the float-to-integer VCVT instructions to decodetree.
Since these are the last unconverted instructions, we can
delete the old decoder structure entirely now.

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 18cf951af9a27ae573a6fa17f9d0c103f7b7679b
      
https://github.com/qemu/qemu/commit/18cf951af9a27ae573a6fa17f9d0c103f7b7679b
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M target/arm/translate-vfp.inc.c

  Log Message:
  -----------
  target/arm: Fix short-vector increment behaviour

For VFP short vectors, the VFP registers are divided into a
series of banks: for single-precision these are s0-s7, s8-s15,
s16-s23 and s24-s31; for double-precision they are d0-d3,
d4-d7, ... d28-d31. Some banks are "scalar" meaning that
use of a register within them triggers a pure-scalar or
mixed vector-scalar operation rather than a full vector
operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
When using a bank as part of a vector operation, we
iterate through it, increasing the register number by
the specified stride each time, and wrapping around to
the beginning of the bank.

Unfortunately our calculation of the "increment" part of this
was incorrect:
 vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
will only do the intended thing if bank_mask has exactly
one set high bit. For instance for doubles (bank_mask = 0xc),
if we start with vd = 6 and delta_d = 2 then vd is updated
to 12 rather than the intended 4.

This only causes problems in the unlikely case that the
starting register is not the first in its bank: if the
register number doesn't have to wrap around then the
expression happens to give the right answer.

Fix this bug by abstracting out the "check whether register
is in a scalar bank" and "advance register within bank"
operations to utility functions which use the right
bit masking operations.
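
The arithmetic can be checked with a small C sketch. The first
function reproduces the expression quoted above; the others show one
way to write the utility functions, assuming banks of 8 single-precision
or 4 double-precision registers as described. All names here are
illustrative and need not match the ones used in the actual change.

```c
#include <assert.h>
#include <stdbool.h>

/* The incorrect expression quoted in the log message. */
static int advance_buggy(int vd, int delta, int bank_mask)
{
    return ((vd + delta) & (bank_mask - 1)) | (vd & bank_mask);
}

/* Advance within a bank of 8 single-precision registers. */
static int advance_sreg(int reg, int delta)
{
    assert(reg >= 0 && reg < 32);
    return ((reg + delta) & 0x7) | (reg & ~0x7);
}

/* Advance within a bank of 4 double-precision registers. */
static int advance_dreg(int reg, int delta)
{
    assert(reg >= 0 && reg < 32);
    return ((reg + delta) & 0x3) | (reg & ~0x3);
}

/* Scalar banks are s0-s7 for singles, d0-d3 and d16-d19 for doubles. */
static bool sreg_is_scalar(int reg)
{
    return (reg & 0x18) == 0;
}

static bool dreg_is_scalar(int reg)
{
    return (reg & 0xc) == 0;
}
```

With vd = 6 and delta_d = 2, advance_buggy(6, 2, 0xc) yields 12,
whereas advance_dreg(6, 2) wraps to 4. Starting from the first
register of the bank, both agree: vd = 4 advances to 6 either way.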

Signed-off-by: Peter Maydell <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 650a379d505bf558bcb41124bc6c951a76cbc113
      
https://github.com/qemu/qemu/commit/650a379d505bf558bcb41124bc6c951a76cbc113
  Author: Peter Maydell <address@hidden>
  Date:   2019-06-13 (Thu, 13 Jun 2019)

  Changed paths:
    M hw/arm/smmuv3.c
    M hw/core/bus.c
    M scripts/decodetree.py
    M target/arm/Makefile.objs
    M target/arm/cpu.c
    M target/arm/cpu.h
    M target/arm/helper.c
    M target/arm/pauth_helper.c
    M target/arm/translate-a64.c
    M target/arm/translate-a64.h
    A target/arm/translate-vfp.inc.c
    M target/arm/translate.c
    M target/arm/translate.h
    A target/arm/vfp-uncond.decode
    A target/arm/vfp.decode
    M tests/tcg/aarch64/Makefile.target
    A tests/tcg/aarch64/pauth-2.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190613-1' into staging

target-arm queue:
 * convert aarch32 VFP decoder to decodetree
   (includes tightening up decode in a few places)
 * fix minor bugs in VFP short-vector handling
 * hw/core/bus.c: Only the main system bus can have no parent
 * smmuv3: Fix decoding of ID register range
 * Implement NSACR gating of floating point
 * Use tcg_gen_gvec_bitsel

# gpg: Signature made Thu 13 Jun 2019 15:15:39 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "address@hidden"
# gpg: Good signature from "Peter Maydell <address@hidden>" [ultimate]
# gpg:                 aka "Peter Maydell <address@hidden>" [ultimate]
# gpg:                 aka "Peter Maydell <address@hidden>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20190613-1: (47 commits)
  target/arm: Fix short-vector increment behaviour
  target/arm: Convert float-to-integer VCVT insns to decodetree
  target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree
  target/arm: Convert VJCVT to decodetree
  target/arm: Convert integer-to-float insns to decodetree
  target/arm: Convert double-single precision conversion insns to decodetree
  target/arm: Convert VFP round insns to decodetree
  target/arm: Convert the VCVT-to-f16 insns to decodetree
  target/arm: Convert the VCVT-from-f16 insns to decodetree
  target/arm: Convert VFP comparison insns to decodetree
  target/arm: Convert VMOV (register) to decodetree
  target/arm: Convert VSQRT to decodetree
  target/arm: Convert VNEG to decodetree
  target/arm: Convert VABS to decodetree
  target/arm: Convert VMOV (imm) to decodetree
  target/arm: Convert VFP fused multiply-add insns to decodetree
  target/arm: Convert VDIV to decodetree
  target/arm: Convert VSUB to decodetree
  target/arm: Convert VADD to decodetree
  target/arm: Convert VNMUL to decodetree
  ...

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/785a602eae7a...650a379d505b


