[qemu-s390x] [PULL 05/33] s390x/tcg: Implement VECTOR GATHER ELEMENT

qemu-s390x

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[qemu-s390x] [PULL 05/33] s390x/tcg: Implement VECTOR GATHER ELEMENT

From:	Cornelia Huck
Subject:	[qemu-s390x] [PULL 05/33] s390x/tcg: Implement VECTOR GATHER ELEMENT
Date:	Mon, 11 Mar 2019 10:02:54 +0100

From: David Hildenbrand <address@hidden>

Let's start with a more involved one, but it is the first in the list
of vector support instructions (introduced with the vector facility).

Good thing is, we need a lot of basic infrastructure for this. Reading
and writing vector elements as well as checking element validity.

All vector instruction related translation functions will reside in
translate_vx.inc.c, to be included in translate.c - similar to how
other architectures handle it.

While at it, directly add some documentation (which contains parts about
things added in follow-up patches, but splitting this up does not make
too much sense). Also add ES_* defines heavily used later.

Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: David Hildenbrand <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Cornelia Huck <address@hidden>
---
 target/s390x/insn-data.def      |   6 ++
 target/s390x/translate.c        |   2 +
 target/s390x/translate_vx.inc.c | 135 ++++++++++++++++++++++++++++++++
 3 files changed, 143 insertions(+)
 create mode 100644 target/s390x/translate_vx.inc.c

diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 61b750a855e9..7d128ac9d61d 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -972,6 +972,12 @@
     D(0xb93e, KIMD,    RRE,   MSA,  0, 0, 0, 0, msa, 0, S390_FEAT_TYPE_KIMD)
     D(0xb93f, KLMD,    RRE,   MSA,  0, 0, 0, 0, msa, 0, S390_FEAT_TYPE_KLMD)
 
+/* === Vector Support Instructions === */
+
+/* VECTOR GATHER ELEMENT */
+    E(0xe713, VGEF,    VRV,   V,   la2, 0, 0, 0, vge, 0, ES_32, IF_VEC)
+    E(0xe712, VGEG,    VRV,   V,   la2, 0, 0, 0, vge, 0, ES_64, IF_VEC)
+
 #ifndef CONFIG_USER_ONLY
 /* COMPARE AND SWAP AND PURGE */
     E(0xb250, CSP,     RRE,   Z,   r1_32u, ra2, r1_P, 0, csp, 0, MO_TEUL, 
IF_PRIV)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index d52c02c572bb..a1c6698dea9c 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -5120,6 +5120,8 @@ static DisasJumpType op_mpcifc(DisasContext *s, DisasOps 
*o)
 }
 #endif
 
+#include "translate_vx.inc.c"
+
 /* ====================================================================== */
 /* The "Cc OUTput" generators.  Given the generated output (and in some cases
    the original inputs), update the various cc data structures in order to
diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
new file mode 100644
index 000000000000..9864ec5134de
--- /dev/null
+++ b/target/s390x/translate_vx.inc.c
@@ -0,0 +1,135 @@
+/*
+ * QEMU TCG support -- s390x vector instruction translation functions
+ *
+ * Copyright (C) 2019 Red Hat Inc
+ *
+ * Authors:
+ *   David Hildenbrand <address@hidden>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+/*
+ * For most instructions that use the same element size for reads and
+ * writes, we can use real gvec vector expansion, which potantially uses
+ * real host vector instructions. As they only work up to 64 bit elements,
+ * 128 bit elements (vector is a single element) have to be handled
+ * differently. Operations that are too complicated to encode via TCG ops
+ * are handled via gvec ool (out-of-line) handlers.
+ *
+ * As soon as instructions use different element sizes for reads and writes
+ * or access elements "out of their element scope" we expand them manually
+ * in fancy loops, as gvec expansion does not deal with actual element
+ * numbers and does also not support access to other elements.
+ *
+ * 128 bit elements:
+ *  As we only have i32/i64, such elements have to be loaded into two
+ *  i64 values and can then be processed e.g. by tcg_gen_add2_i64.
+ *
+ * Sizes:
+ *  On s390x, the operand size (oprsz) and the maximum size (maxsz) are
+ *  always 16 (128 bit). What gvec code calls "vece", s390x calls "es",
+ *  a.k.a. "element size". These values nicely map to MO_8 ... MO_64. Only
+ *  128 bit element size has to be treated in a special way (MO_64 + 1).
+ *  We will use ES_* instead of MO_* for this reason in this file.
+ *
+ * CC handling:
+ *  As gvec ool-helpers can currently not return values (besides via
+ *  pointers like vectors or cpu_env), whenever we have to set the CC and
+ *  can't conclude the value from the result vector, we will directly
+ *  set it in "env->cc_op" and mark it as static via set_cc_static()".
+ *  Whenever this is done, the helper writes globals (cc_op).
+ */
+
+#define NUM_VEC_ELEMENT_BYTES(es) (1 << (es))
+#define NUM_VEC_ELEMENTS(es) (16 / NUM_VEC_ELEMENT_BYTES(es))
+
+#define ES_8    MO_8
+#define ES_16   MO_16
+#define ES_32   MO_32
+#define ES_64   MO_64
+#define ES_128  4
+
+static inline bool valid_vec_element(uint8_t enr, TCGMemOp es)
+{
+    return !(enr & ~(NUM_VEC_ELEMENTS(es) - 1));
+}
+
+static void read_vec_element_i64(TCGv_i64 dst, uint8_t reg, uint8_t enr,
+                                 TCGMemOp memop)
+{
+    const int offs = vec_reg_offset(reg, enr, memop & MO_SIZE);
+
+    switch (memop) {
+    case ES_8:
+        tcg_gen_ld8u_i64(dst, cpu_env, offs);
+        break;
+    case ES_16:
+        tcg_gen_ld16u_i64(dst, cpu_env, offs);
+        break;
+    case ES_32:
+        tcg_gen_ld32u_i64(dst, cpu_env, offs);
+        break;
+    case ES_8 | MO_SIGN:
+        tcg_gen_ld8s_i64(dst, cpu_env, offs);
+        break;
+    case ES_16 | MO_SIGN:
+        tcg_gen_ld16s_i64(dst, cpu_env, offs);
+        break;
+    case ES_32 | MO_SIGN:
+        tcg_gen_ld32s_i64(dst, cpu_env, offs);
+        break;
+    case ES_64:
+    case ES_64 | MO_SIGN:
+        tcg_gen_ld_i64(dst, cpu_env, offs);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void write_vec_element_i64(TCGv_i64 src, int reg, uint8_t enr,
+                                  TCGMemOp memop)
+{
+    const int offs = vec_reg_offset(reg, enr, memop & MO_SIZE);
+
+    switch (memop) {
+    case ES_8:
+        tcg_gen_st8_i64(src, cpu_env, offs);
+        break;
+    case ES_16:
+        tcg_gen_st16_i64(src, cpu_env, offs);
+        break;
+    case ES_32:
+        tcg_gen_st32_i64(src, cpu_env, offs);
+        break;
+    case ES_64:
+        tcg_gen_st_i64(src, cpu_env, offs);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static DisasJumpType op_vge(DisasContext *s, DisasOps *o)
+{
+    const uint8_t es = s->insn->data;
+    const uint8_t enr = get_field(s->fields, m3);
+    TCGv_i64 tmp;
+
+    if (!valid_vec_element(enr, es)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    tmp = tcg_temp_new_i64();
+    read_vec_element_i64(tmp, get_field(s->fields, v2), enr, es);
+    tcg_gen_add_i64(o->addr1, o->addr1, tmp);
+    gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 0);
+
+    tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_TE | es);
+    write_vec_element_i64(tmp, get_field(s->fields, v1), enr, es);
+    tcg_temp_free_i64(tmp);
+    return DISAS_NEXT;
+}
-- 
2.17.2

[Prev in Thread]

Current Thread

[Next in Thread]

[qemu-s390x] [PULL 00/33] final s390x patches for 4.0 soft freeze, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 01/33] target/s390x: Remove non-architected entries from struct LowCore, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 02/33] s390x/tcg: Define vector instruction formats, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 03/33] s390x/tcg: Check vector register instructions at central point, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 06/33] s390x/tcg: Implement VECTOR GENERATE BYTE MASK, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 04/33] s390x/tcg: Utilities for vector instruction helpers, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 05/33] s390x/tcg: Implement VECTOR GATHER ELEMENT, Cornelia Huck <=
- [qemu-s390x] [PULL 08/33] s390x/tcg: Implement VECTOR LOAD, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 07/33] s390x/tcg: Implement VECTOR GENERATE MASK, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 10/33] s390x/tcg: Implement VECTOR LOAD ELEMENT, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 09/33] s390x/tcg: Implement VECTOR LOAD AND REPLICATE, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 12/33] s390x/tcg: Implement VECTOR LOAD GR FROM VR ELEMENT, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 11/33] s390x/tcg: Implement VECTOR LOAD ELEMENT IMMEDIATE, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 13/33] s390x/tcg: Implement VECTOR LOAD LOGICAL ELEMENT AND ZERO, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 14/33] s390x/tcg: Implement VECTOR LOAD MULTIPLE, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 17/33] s390x/tcg: Implement VECTOR LOAD VR FROM GRS DISJOINT, Cornelia Huck, 2019/03/11
- [qemu-s390x] [PULL 15/33] s390x/tcg: Implement VECTOR LOAD TO BLOCK BOUNDARY, Cornelia Huck, 2019/03/11

Prev by Date: [qemu-s390x] [PULL 04/33] s390x/tcg: Utilities for vector instruction helpers
Next by Date: [qemu-s390x] [PULL 08/33] s390x/tcg: Implement VECTOR LOAD
Previous by thread: [qemu-s390x] [PULL 04/33] s390x/tcg: Utilities for vector instruction helpers
Next by thread: [qemu-s390x] [PULL 08/33] s390x/tcg: Implement VECTOR LOAD
Index(es):
- Date
- Thread