Re: [RFC PATCH v2] contrib/plugins: control flow plugin (WIP!)
From: Alex Bennée
Subject: Re: [RFC PATCH v2] contrib/plugins: control flow plugin (WIP!)
Date: Thu, 14 Mar 2024 12:02:00 +0000
User-agent: mu4e 1.12.1; emacs 29.2
Alex Bennée <alex.bennee@linaro.org> writes:
> This is a simple control flow tracking plugin that uses the latest
> inline and conditional operations to detect and track control flow
> changes. It is currently an exercise in seeing how useful the changes
> are.
>
> Based-on: <20240312075428.244210-1-pierrick.bouvier@linaro.org>
> Cc: Gustavo Romero <gustavo.romero@linaro.org>
> Cc: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Message-Id: <20240311153432.1395190-1-alex.bennee@linaro.org>
>
> ---
> v2
> - only need a single call back
> - drop need for INSN_WIDTH
> - still don't understand the early exits
>
> I'm still seeing weirdness in the generated code. For example, the
> plugin reports "early exits", which doesn't make sense for a ret
> that terminates a block:
>
> addr: 0x403c88 hexchar: ret (1/1)
> early exits 1280
> branches 1280
> to 0x403d00 (639)
> to 0x403d24 (639)
>
<snip>
> +
> +/*
> + * At the start of each block we need to resolve two things:
> + *
> + * - is last_pc == block_end, if not we had an early exit
> + * - is start of block last_pc + insn width, if not we jumped
> + *
> + * Once those are dealt with we can instrument the rest of the
> + * instructions for their execution.
> + *
> + */
> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> +{
> +    uint64_t pc = qemu_plugin_tb_vaddr(tb);
> +    size_t insns = qemu_plugin_tb_n_insns(tb);
> +    struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);
> +
> +    /*
> +     * check if we are executing linearly after the last block. We can
> +     * handle both early block exits and normal branches in the
> +     * callback if we hit it.
> +     */
> +    gpointer udata = GUINT_TO_POINTER(pc);
> +    qemu_plugin_register_vcpu_tb_exec_cond_cb(
> +        tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
> +        QEMU_PLUGIN_COND_NE, pc_after_block, pc, udata);
> +
> +    /*
> +     * Now we can set start/end for this block so the next block can
> +     * check where we are at.
> +     */
> +    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +                                                      QEMU_PLUGIN_INLINE_STORE_U64,
> +                                                      end_block,
> +                                                      qemu_plugin_insn_vaddr(last_insn));
> +    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +                                                      QEMU_PLUGIN_INLINE_STORE_U64,
> +                                                      pc_after_block,
> +                                                      qemu_plugin_insn_vaddr(last_insn) +
> +                                                      qemu_plugin_insn_size(last_insn));
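
For reference, the conditional callback registered above looks roughly
like this. This is a sketch reconstructed from the fragments in this
thread: the pc/ebpc/npc/lpc variables and node->early_exit/node->lock
come from the hunks below, while the CFNode type, the scoreboard
handles and find_node() are my assumptions about the rest of the
plugin:

#include <glib.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>
#include <qemu-plugin.h>

/* hypothetical per-node stats, guessed from the report output */
typedef struct {
    GMutex lock;
    uint64_t early_exit;
    uint64_t branches;
} CFNode;

/* per-vCPU scoreboard entries set up at install time (assumed) */
static qemu_plugin_u64 last_pc, end_block, pc_after_block;

static CFNode *find_node(uint64_t pc); /* hypothetical hash-table lookup */

/* Fires only when this block's start pc != the previous block's
 * expected fall-through pc (QEMU_PLUGIN_COND_NE above). */
static void vcpu_tb_branched_exec(unsigned int cpu_index, void *udata)
{
    uint64_t lpc = qemu_plugin_u64_get(last_pc, cpu_index);        /* last insn executed */
    uint64_t ebpc = qemu_plugin_u64_get(end_block, cpu_index);     /* prev block's final insn */
    uint64_t npc = qemu_plugin_u64_get(pc_after_block, cpu_index); /* expected next pc */
    uint64_t pc = GPOINTER_TO_UINT(udata);                         /* where we actually are */
    /* stopping before the block's final insn means an early exit */
    bool early_exit = (lpc != ebpc);
    CFNode *node = find_node(lpc);

    g_mutex_lock(&node->lock);
    if (early_exit) {
        fprintf(stderr, "%s: pc=%"PRIx64", ebpc=%"PRIx64
                " npc=%"PRIx64", lpc=%"PRIx64"\n",
                __func__, pc, ebpc, npc, lpc);
        node->early_exit++;
    }
    node->branches++; /* the real plugin also records pc as a destination */
    g_mutex_unlock(&node->lock);
}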
With the following change:
modified contrib/plugins/cflow.c
@@ -220,7 +220,7 @@ static void vcpu_tb_branched_exec(unsigned int cpu_index, void *udata)
     g_mutex_lock(&node->lock);
     if (early_exit) {
-        fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64"
+        fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64
                 " npc=%"PRIx64", lpc=%"PRIx64", \n",
                 __func__, pc, ebpc, npc, lpc);
         node->early_exit++;
@@ -264,6 +264,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 {
     uint64_t pc = qemu_plugin_tb_vaddr(tb);
     size_t insns = qemu_plugin_tb_n_insns(tb);
+    struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
     struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);
 
     /*
@@ -278,12 +279,13 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 
     /*
      * Now we can set start/end for this block so the next block can
-     * check where we are at.
+     * check where we are at. Do this on the first instruction and not
+     * the TB so we don't get mixed up with above.
      */
-    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
                                                       QEMU_PLUGIN_INLINE_STORE_U64,
                                                       end_block,
                                                       qemu_plugin_insn_vaddr(last_insn));
-    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
                                                       QEMU_PLUGIN_INLINE_STORE_U64,
                                                       pc_after_block,
                                                       qemu_plugin_insn_vaddr(last_insn) +
The report looks more sane:
collected 9013 control flow nodes in the hash table
addr: 0x403c88 hexchar: ret (0/1)
branches 1280
to 0x403d00 (639)
to 0x403d24 (639)
So I think we need to think about preserving the ordering of
instrumentation (at least from the same plugin) so we are not laying
any API bear traps for users. Here the TB-level inline stores appear
to update end_block/pc_after_block before the conditional callback
compares against the previous block's values, which is what produced
the bogus early exits.
I assume that because loads and stores are involved we won't see the
optimiser trying to swap things around.
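
To make the ordering concrete, here is what vcpu_tb_trans ends up
registering with the fix applied — a sketch pieced together from the
quoted v2 code plus the diff above, not the actual file contents:

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    uint64_t pc = qemu_plugin_tb_vaddr(tb);
    size_t insns = qemu_plugin_tb_n_insns(tb);
    struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
    struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);

    /* TB-level: compare entry pc against the previous block's
     * expected fall-through address before anything overwrites it */
    qemu_plugin_register_vcpu_tb_exec_cond_cb(
        tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
        QEMU_PLUGIN_COND_NE, pc_after_block, pc, GUINT_TO_POINTER(pc));

    /* Insn-level on the first instruction: these stores now run after
     * the TB-level conditional check has consumed the old values,
     * rather than racing ahead of it */
    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
        first_insn, QEMU_PLUGIN_INLINE_STORE_U64, end_block,
        qemu_plugin_insn_vaddr(last_insn));
    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
        first_insn, QEMU_PLUGIN_INLINE_STORE_U64, pc_after_block,
        qemu_plugin_insn_vaddr(last_insn) + qemu_plugin_insn_size(last_insn));
}

If that emission order (TB-level callbacks before first-insn ops) is
guaranteed rather than incidental, it would be worth documenting in
the plugin API.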
--
Alex Bennée
Virtualisation Tech Lead @ Linaro