qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/5] plugins: add new inline op STORE_U64


From: Richard Henderson
Subject: Re: [PATCH v2 2/5] plugins: add new inline op STORE_U64
Date: Tue, 12 Mar 2024 11:15:08 -1000
User-agent: Mozilla Thunderbird

On 3/11/24 21:54, Pierrick Bouvier wrote:
+static void gen_empty_inline_cb_store_u64(void)
+{
+    TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+    TCGv_ptr cpu_index_as_ptr = tcg_temp_ebb_new_ptr();
+    TCGv_i64 val = tcg_temp_ebb_new_i64();
+    TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
+
+    tcg_gen_ld_i32(cpu_index, tcg_env,
+                   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
+    /* second operand will be replaced by immediate value */
+    tcg_gen_mul_i32(cpu_index, cpu_index, cpu_index);
+    tcg_gen_ext_i32_ptr(cpu_index_as_ptr, cpu_index);
+    tcg_gen_movi_ptr(ptr, 0);
+    tcg_gen_add_ptr(ptr, ptr, cpu_index_as_ptr);
+
+    tcg_gen_movi_i64(val, 0);
+    tcg_gen_st_i64(val, ptr, 0);
+
+    tcg_temp_free_ptr(ptr);
+    tcg_temp_free_i64(val);
+    tcg_temp_free_ptr(cpu_index_as_ptr);
+    tcg_temp_free_i32(cpu_index);
+}

I was never fond of this full pattern generate...

@@ -352,6 +385,20 @@ static TCGOp *copy_st_i64(TCGOp **begin_op, TCGOp *op)
      return op;
  }
+static TCGOp *copy_mov_i64(TCGOp **begin_op, TCGOp *op, uint64_t v)
+{
+    if (TCG_TARGET_REG_BITS == 32) {
+        op = copy_op(begin_op, op, INDEX_op_mov_i32);
+        op->args[1] = tcgv_i32_arg(TCGV_LOW(tcg_constant_i64(v)));
+        op = copy_op(begin_op, op, INDEX_op_mov_i32);
+        op->args[1] = tcgv_i32_arg(TCGV_HIGH(tcg_constant_i64(v)));
+    } else {
+        op = copy_op(begin_op, op, INDEX_op_mov_i64);
+        op->args[1] = tcgv_i64_arg(tcg_constant_i64(v));
+    }
+    return op;
+}

... followed by pattern match and modify. I think adding more of this is fragile, and a mistake.

(1) This encodes knowledge of the order of the parts of a mov_i64 for 32-bit 
host.
(2) You shouldn't use TCG_LOW/HIGH of tcg_constant_i64, but two separate calls to tcg_constant_i32 with the parts of @v.

But all of this would be easier if we simply generate new code now, instead of 
copy.

+static TCGOp *append_inline_cb_store_u64(const struct qemu_plugin_dyn_cb *cb,
+                                       TCGOp *begin_op, TCGOp *op,
+                                       int *unused)
+{
+    char *ptr = cb->inline_insn.entry.score->data->data;
+    size_t elem_size = g_array_get_element_size(
+        cb->inline_insn.entry.score->data);
+    size_t offset = cb->inline_insn.entry.offset;
+    op = copy_ld_i32(&begin_op, op);
+    op = copy_mul_i32(&begin_op, op, elem_size);
+    op = copy_ext_i32_ptr(&begin_op, op);
+    op = copy_const_ptr(&begin_op, op, ptr + offset);
+    op = copy_add_ptr(&begin_op, op);
+    op = copy_mov_i64(&begin_op, op, cb->inline_insn.imm);
+    op = copy_st_i64(&begin_op, op);

You'd also be able to fold offset into the store. This would allow the scoreboard address to be entered once into the constant pool and have multiple uses.


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]