From: LIU Zhiwei
Subject: Re: [PATCH v3 06/14] tcg/riscv: Implement vector mov/dup{m/i}
Date: Tue, 10 Sep 2024 09:13:52 +0800
User-agent: Mozilla Thunderbird
On 2024/9/5 14:56, Richard Henderson wrote:
On 9/4/24 07:27, LIU Zhiwei wrote:

@@ -698,6 +704,21 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
     case TCG_TYPE_I64:
         tcg_out_opc_imm(s, OPC_ADDI, ret, arg, 0);
         break;
+    case TCG_TYPE_V64:
+    case TCG_TYPE_V128:
+    case TCG_TYPE_V256:
+        {
+            int nf = get_vec_type_bytes(type) / riscv_vlenb;
+
+            if (nf != 0) {
+                tcg_debug_assert(is_power_of_2(nf) && nf <= 8);
+                tcg_out_opc_vi(s, OPC_VMVNR_V, ret, arg, nf - 1, true);
+            } else {
+                riscv_set_vec_config_vl(s, type);
+                tcg_out_opc_vv(s, OPC_VMV_V_V, ret, TCG_REG_V0, arg, true);
+            }
+        }
+        break;

Perhaps

    int lmul = type - riscv_lg2_vlenb;
    int nf = 1 << MAX(lmul, 0);
    tcg_out_opc_vi(s, OPC_VMVNR_V, ret, arg, nf - 1);

Is there a reason to prefer vmv.v.v over vmvnr.v?
I think it's a trade-off. On some CPUs, the whole-register move is split internally, so the smaller the (fractional) LMUL, the fewer micro-ops are executed. That is the benefit of using vmv.v.v. But it also needs a vsetivli, which on some CPUs can be fused with the next instruction.
Seems like we can always move one vector reg...
OK. I will take this way.
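Richard's suggested computation above relies on the vector TCGType values encoding the log2 of the type size in bytes, so subtracting lg2(VLENB) yields LMUL directly, and clamping a negative (fractional) LMUL to zero gives a single whole-register move. A standalone sketch of that arithmetic; the enum values and the `move_nf` helper are illustrative stand-ins, not the actual tcg/riscv code:

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Stand-ins for the vector TCGTypes, encoded as log2 of their size
 * in bytes: V64 = 8 bytes -> 3, V128 -> 4, V256 -> 5. */
enum { TYPE_V64 = 3, TYPE_V128 = 4, TYPE_V256 = 5 };

/* Number of whole vector registers to move with vmv<nf>r.v, given
 * lg2(VLENB).  A fractional LMUL (type smaller than one vector
 * register) is clamped to a single whole-register move; the
 * instruction itself encodes nf - 1. */
static int move_nf(int type, int lg2_vlenb)
{
    int lmul = type - lg2_vlenb;   /* log2 of the register-group size */
    return 1 << MAX(lmul, 0);
}
```

For example, with VLEN = 128 bits (lg2_vlenb = 4), a V256 move needs two registers, while V128 and V64 both need one.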
+static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
+                             TCGReg dst, int64_t arg)
+{
+    if (arg < 16 && arg >= -16) {
+        riscv_set_vec_config_vl_vece(s, type, vece);
+        tcg_out_opc_vi(s, OPC_VMV_V_I, dst, TCG_REG_V0, arg, true);
+        return;
+    }
+    tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP0, arg);
+    tcg_out_dup_vec(s, type, vece, dst, TCG_REG_TMP0);
+}

I'll note that 0 and -1 do not require SEW change. I don't know how often that will come up, since in my testing with aarch64, we usually needed to swap to TCG_TYPE_V256 anyway.

r~

In our tests on OpenCV, 0 and -1 account for 99.7% of the cases, so we will add this optimization in the next version.

Thanks,
Zhiwei
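The fast path in tcg_out_dupi_vec applies only when the constant fits the 5-bit signed immediate of vmv.v.i, i.e. the range [-16, 15]. A minimal helper mirroring that range check (the function name is illustrative, not from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* vmv.v.i takes a 5-bit signed immediate; anything outside [-16, 15]
 * must instead be materialized in a scalar register and broadcast
 * with tcg_out_dup_vec. */
static bool fits_simm5(int64_t arg)
{
    return arg >= -16 && arg < 16;
}
```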