qemu-riscv
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 03/14] tcg/riscv: Add basic support for vector


From: Richard Henderson
Subject: Re: [PATCH v2 03/14] tcg/riscv: Add basic support for vector
Date: Mon, 2 Sep 2024 10:28:04 +1000
User-agent: Mozilla Thunderbird

On 8/30/24 16:15, LIU Zhiwei wrote:
From: Swung0x48 <swung0x48@outlook.com>

The RISC-V vector instruction set utilizes the LMUL field to group
multiple registers, enabling variable-length vector registers. This
implementation uses only the first register number of each group while
reserving the other register numbers within the group.

In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
host runtime needs to adjust LMUL based on the type to use different
register groups.

This presents challenges for TCG's register allocation. Currently, we
avoid modifying the register allocation part of TCG and only expose the
minimum number of vector registers.

For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
LMUL equal to 4, we use 4 vector registers as one register group. We can
use a maximum of 8 register groups, but the V0 register number is reserved
as a mask register, so we can effectively use at most 7 register groups.
Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
forced to be used. This is because TCG cannot yet dynamically constrain
registers with type; likewise, when the host vlen is 128 bits and
TCG_TYPE_V256, we can use at most 15 registers.

There is not much pressure on vector register allocation in TCG now, so
using 7 registers is feasible and will not have a major impact on code
generation.

This patch:
1. Reserves vector register 0 for use as a mask register.
2. When using register groups, reserves the additional registers within
    each group.

Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
---
  tcg/riscv/tcg-target-con-str.h |   1 +
  tcg/riscv/tcg-target.c.inc     | 131 +++++++++++++++++++++++++--------
  tcg/riscv/tcg-target.h         |  78 +++++++++++---------
  tcg/riscv/tcg-target.opc.h     |  12 +++
  4 files changed, 157 insertions(+), 65 deletions(-)
  create mode 100644 tcg/riscv/tcg-target.opc.h

diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
index d5c419dff1..21c4a0a0e0 100644
--- a/tcg/riscv/tcg-target-con-str.h
+++ b/tcg/riscv/tcg-target-con-str.h
@@ -9,6 +9,7 @@
   * REGS(letter, register_mask)
   */
  REGS('r', ALL_GENERAL_REGS)
+REGS('v', GET_VREG_SET(riscv_vlen))

Perhaps too complicated. Make this MAKE_64BIT_MASK(32, 32); everything else will be handled by tcg_target_available_regs[] and reserved_regs.

@@ -127,6 +113,12 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
kind, int slot)
  #define TCG_CT_CONST_J12  0x1000
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
+#define ALL_VECTOR_REGS    MAKE_64BIT_MASK(33, 31)
+#define ALL_DVECTOR_REG_GROUPS 0x5555555400000000
+#define ALL_QVECTOR_REG_GROUPS 0x1111111000000000

V0 is still a vector register, even if it is reserved.
I think you should include it in these masks.

+#define GET_VREG_SET(vlen) (vlen == 64 ? ALL_QVECTOR_REG_GROUPS : \
+                             (vlen == 128 ? ALL_DVECTOR_REG_GROUPS : \
+                              ALL_VECTOR_REGS))

I think you will not need this macro.

+/*
+ * RISC-V vector instruction emitters
+ */
+
+/* Vector registers uses the same 5 lower bits as GPR registers. */
+static void tcg_out_opc_reg_vec(TCGContext *s, RISCVInsn opc,
+                                TCGReg d, TCGReg s1, TCGReg s2, bool vm)
+{
+    tcg_out32(s, encode_r(opc, d, s1, s2) | (vm << 25));
+}
+
+static void tcg_out_opc_reg_vec_i(TCGContext *s, RISCVInsn opc,
+                                  TCGReg rd, TCGArg imm, TCGReg vs2, bool vm)
+{
+    tcg_out32(s, encode_r(opc, rd, (imm & 0x1f), vs2) | (vm << 25));

I think you want to create new encode_* functions, not abuse the integer ones.

+}
+
+/* vm=0 (vm = false) means vector masking ENABLED. */
+#define tcg_out_opc_vv(s, opc, vd, vs2, vs1, vm) \
+    tcg_out_opc_reg_vec(s, opc, vd, vs1, vs2, vm);
+
+/*
+ * In RISC-V, vs2 is the first operand, while rs1/imm is the
+ * second operand.
+ */
+#define tcg_out_opc_vx(s, opc, vd, vs2, rs1, vm) \
+    tcg_out_opc_reg_vec(s, opc, vd, rs1, vs2, vm);
+
+#define tcg_out_opc_vi(s, opc, vd, vs2, imm, vm) \
+    tcg_out_opc_reg_vec_i(s, opc, vd, imm, vs2, vm);
+
+/*
+ * Only unit-stride addressing implemented; may extend in future.
+ */
+#define tcg_out_opc_ldst_vec(s, opc, vs3_vd, rs1, vm) \
+    tcg_out_opc_reg_vec(s, opc, vs3_vd, rs1, 0, vm);

I don't understand the need for any of these #defines.
Why are we not simply creating functions of the correct name?

@@ -2101,6 +2160,13 @@ static void tcg_target_init(TCGContext *s)
      tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff;
      tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
+ if (cpuinfo & CPUINFO_ZVE64X) {
+        TCGRegSet vector_regs = GET_VREG_SET(riscv_vlen);

This ought to be the only usage of GET_VREG_SET, so I think you should inline that code with a switch on riscv_vlen/vlenb.


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]