F16C consists of only two instructions, but they are nevertheless a bit
peculiar.

First, they access only the low half of a YMM or XMM register for the
packed-half operand; the exact size still depends on the VEX.L flag.
This is similar to the existing avx_movx flag, but not identical, because
avx_movx is hardcoded to affect operand 2. To handle this I added a "ph"
format name. The same approach could be reused for the VPMOVSX and
VPMOVZX instructions, though that would also require adding two more
formats for the low quarter and low eighth of an operand.
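
As a rough illustration of the sizing rule above (this is not the
decoder code from the patch), the width of such a partial operand could
be modelled as follows. partial_operand_bytes() and its shift parameter
are hypothetical names used only for this sketch.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of QEMU: size in bytes of an operand that
 * covers the low 1/2, 1/4 or 1/8 of a vector register.
 *   vex_l: 0 selects XMM (16 bytes), 1 selects YMM (32 bytes).
 *   shift: 1 for the low half, 2 for the low quarter, 3 for the low eighth.
 */
static unsigned partial_operand_bytes(unsigned vex_l, unsigned shift)
{
    unsigned full = vex_l ? 32 : 16;

    assert(shift >= 1 && shift <= 3);
    return full >> shift;
}
```

Under this model the packed-half operand is 8 bytes when VEX.L=0 and
16 bytes when VEX.L=1; the quarter and eighth cases correspond to the
extra formats that VPMOVSX and VPMOVZX would need.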

Second, VCVTPS2PH is somewhat unusual because its memory operand is the
destination: it *stores* the result of the instruction into memory
rather than loading a source from it.
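
The data flow of the store form can be sketched in plain C as below.
This is only a model under simplifying assumptions, not the helper added
to ops_sse.h: float_to_half_rne() and cvtps2ph_store() are hypothetical
names, rounding is fixed to round-to-nearest-even (the real instruction
takes its rounding mode from the immediate or from MXCSR), and every NaN
is collapsed to a single quiet NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: convert one float to IEEE half, round to nearest even. */
static uint16_t float_to_half_rne(float f)
{
    uint32_t bits, mag, mant, res, rem, half_ulp;
    uint16_t sign;
    int e, shift;

    memcpy(&bits, &f, sizeof(bits));
    sign = (bits >> 16) & 0x8000;
    mag = bits & 0x7fffffff;

    if (mag > 0x7f800000) {
        return sign | 0x7e00;          /* NaN: simplified to one quiet NaN */
    }
    if (mag >= 0x477ff000) {
        return sign | 0x7c00;          /* Inf, or rounds above 65504 */
    }

    e = mag >> 23;                     /* biased float exponent */
    mant = mag & 0x007fffff;
    if (e >= 113) {                    /* result is a normal half */
        shift = 13;
        res = ((uint32_t)(e - 112) << 10) | (mant >> 13);
    } else if (e >= 102) {             /* result is a subnormal half */
        mant |= 0x00800000;            /* make the implicit bit explicit */
        shift = 126 - e;               /* 14..24 */
        res = mant >> shift;
    } else {
        return sign;                   /* too small: rounds to zero */
    }

    rem = mant & ((1u << shift) - 1);
    half_ulp = 1u << (shift - 1);
    if (rem > half_ulp || (rem == half_ulp && (res & 1))) {
        res++;                         /* a carry may bump the exponent */
    }
    return sign | res;
}

/*
 * Hypothetical model of VCVTPS2PH with a memory destination: floats are
 * read from the register image and 4 (VEX.L=0) or 8 (VEX.L=1) halves are
 * written to memory.
 */
static void cvtps2ph_store(uint16_t *mem, const float *reg, unsigned vex_l)
{
    unsigned n = vex_l ? 8 : 4;

    assert(mem && reg);
    for (unsigned i = 0; i < n; i++) {
        mem[i] = float_to_half_rne(reg[i]);
    }
}
```

Note that only the low half of the vector length is written: with
VEX.L=0 the model touches 8 bytes of memory, with VEX.L=1 it touches 16.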

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c                |  5 ++---
 target/i386/cpu.h                |  3 +++
 target/i386/ops_sse.h            | 29 +++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h     |  6 ++++++
 target/i386/tcg/decode-new.c.inc |  8 ++++++++
 target/i386/tcg/decode-new.h     |  2 ++
 target/i386/tcg/emit.c.inc       | 17 ++++++++++++++++-
 tests/tcg/i386/test-avx.c        | 17 +++++++++++++++++
 tests/tcg/i386/test-avx.py       |  8 ++++++--
 9 files changed, 89 insertions(+), 6 deletions(-)