From: Alex Bennée
Subject: Re: [RFC PATCH v2] contrib/plugins: control flow plugin (WIP!)
Date: Thu, 14 Mar 2024 12:02:00 +0000
User-agent: mu4e 1.12.1; emacs 29.2

Alex Bennée <alex.bennee@linaro.org> writes:

> This is a simple control flow tracking plugin that uses the latest
> inline and conditional operations to detect and track control flow
> changes. It is currently an exercise in seeing how useful the changes
> are.
>
> Based-on: <20240312075428.244210-1-pierrick.bouvier@linaro.org>
> Cc: Gustavo Romero <gustavo.romero@linaro.org>
> Cc: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Message-Id: <20240311153432.1395190-1-alex.bennee@linaro.org>
>
> ---
> v2
>   - only need a single call back
>   - drop need for INSN_WIDTH
>   - still don't understand the early exits
>
> I'm still seeing weirdness in the generated code; for example, the
> plugin reports "early exits", which doesn't make sense for a ret that
> terminates its block:
>
>   addr: 0x403c88 hexchar: ret  (1/1)
>     early exits 1280
>     branches 1280
>       to 0x403d00 (639)
>       to 0x403d24 (639)
>
<snip>
> +
> +/*
> + * At the start of each block we need to resolve two things:
> + *
> + *  - is last_pc == block_end, if not we had an early exit
> + *  - is start of block last_pc + insn width, if not we jumped
> + *
> + * Once those are dealt with we can instrument the rest of the
> + * instructions for their execution.
> + *
> + */
> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> +{
> +    uint64_t pc = qemu_plugin_tb_vaddr(tb);
> +    size_t insns = qemu_plugin_tb_n_insns(tb);
> +    struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);
> +
> +    /*
> +     * check if we are executing linearly after the last block. We can
> +     * handle both early block exits and normal branches in the
> +     * callback if we hit it.
> +     */
> +    gpointer udata = GUINT_TO_POINTER(pc);
> +    qemu_plugin_register_vcpu_tb_exec_cond_cb(
> +        tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
> +        QEMU_PLUGIN_COND_NE, pc_after_block, pc, udata);
> +
> +    /*
> +     * Now we can set start/end for this block so the next block can
> +     * check where we are at.
> +     */
> +    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +                                                      QEMU_PLUGIN_INLINE_STORE_U64,
> +                                                      end_block, qemu_plugin_insn_vaddr(last_insn));
> +    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +                                                      QEMU_PLUGIN_INLINE_STORE_U64,
> +                                                      pc_after_block,
> +                                                      qemu_plugin_insn_vaddr(last_insn) +
> +                                                      qemu_plugin_insn_size(last_insn));
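
For anyone following along: last_pc, end_block and pc_after_block
above are qemu_plugin_u64 entries into a per-vCPU scoreboard (from
the scoreboard series this is based on). A minimal sketch of the
setup the snippet assumes (struct layout and install boilerplate
here are illustrative, not necessarily what the patch uses):

#include <glib.h>
#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

/* per-vCPU state shared between the inline ops and the callback */
typedef struct {
    uint64_t last_pc;        /* vaddr of the last executed insn */
    uint64_t end_block;      /* vaddr of the last insn of the block */
    uint64_t pc_after_block; /* next pc if we fall through the block */
} VCPUScoreBoard;

static struct qemu_plugin_scoreboard *state;
static qemu_plugin_u64 last_pc, end_block, pc_after_block;

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb);

QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
                                           const qemu_info_t *info,
                                           int argc, char **argv)
{
    state = qemu_plugin_scoreboard_new(sizeof(VCPUScoreBoard));
    last_pc = qemu_plugin_scoreboard_u64_in_struct(state,
                                                   VCPUScoreBoard, last_pc);
    end_block = qemu_plugin_scoreboard_u64_in_struct(state,
                                                     VCPUScoreBoard, end_block);
    pc_after_block = qemu_plugin_scoreboard_u64_in_struct(state,
                                                          VCPUScoreBoard,
                                                          pc_after_block);
    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
    return 0;
}

The conditional callback then only has to distinguish the two cases
from the comment, roughly like this (again a sketch; the real
function also does the per-node accounting under a lock):

static void vcpu_tb_branched_exec(unsigned int cpu_index, void *udata)
{
    uint64_t lpc = qemu_plugin_u64_get(last_pc, cpu_index);
    uint64_t ebpc = qemu_plugin_u64_get(end_block, cpu_index);
    uint64_t pc = GPOINTER_TO_UINT(udata);

    if (lpc != ebpc) {
        /* we never reached the last insn of the previous block */
        /* -> count an early exit */
    } else {
        /* the previous block completed but didn't fall through here */
        /* -> count a branch from lpc to pc */
    }
}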

With the following:

modified   contrib/plugins/cflow.c
@@ -220,7 +220,7 @@ static void vcpu_tb_branched_exec(unsigned int cpu_index, void *udata)
     g_mutex_lock(&node->lock);
 
     if (early_exit) {
-        fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64"
+        fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64
                 " npc=%"PRIx64", lpc=%"PRIx64", \n",
                 __func__, pc, ebpc, npc, lpc);
         node->early_exit++;
@@ -264,6 +264,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 {
     uint64_t pc = qemu_plugin_tb_vaddr(tb);
     size_t insns = qemu_plugin_tb_n_insns(tb);
+    struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
     struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);
 
     /*
@@ -278,12 +279,13 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 
     /*
      * Now we can set start/end for this block so the next block can
-     * check where we are at.
+     * check where we are at. Do this on the first instruction and not
+     * the TB so we don't get mixed up with above.
      */
-    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
                                                       QEMU_PLUGIN_INLINE_STORE_U64,
                                                       end_block, qemu_plugin_insn_vaddr(last_insn));
-    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
                                                       QEMU_PLUGIN_INLINE_STORE_U64,
                                                       pc_after_block,
                                                       qemu_plugin_insn_vaddr(last_insn) +

The report looks more sane:

  collected 9013 control flow nodes in the hash table
    addr: 0x403c88 hexchar: ret  (0/1)
      branches 1280      
        to 0x403d00 (639)            
        to 0x403d24 (639)

So I think we need to preserve the ordering of instrumentation (at
least within the same plugin) so we are not laying any API bear traps
for users.

I assume that because loads and stores are involved the optimiser
won't try to swap things around.
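
For reference, with that fixup applied the translation hook ends up
registering things in this order (a sketch of the net result using
the names from the patch; the per-instruction instrumentation the
comment mentions is elided):

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    uint64_t pc = qemu_plugin_tb_vaddr(tb);
    size_t insns = qemu_plugin_tb_n_insns(tb);
    struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
    struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 1);

    /*
     * 1: the conditional callback must observe the *previous*
     * block's pc_after_block before anything updates it.
     */
    qemu_plugin_register_vcpu_tb_exec_cond_cb(
        tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
        QEMU_PLUGIN_COND_NE, pc_after_block, pc, GUINT_TO_POINTER(pc));

    /*
     * 2: the stores hang off the first instruction rather than the
     * TB so they cannot be scheduled ahead of the check above.
     */
    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
        first_insn, QEMU_PLUGIN_INLINE_STORE_U64, end_block,
        qemu_plugin_insn_vaddr(last_insn));
    qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
        first_insn, QEMU_PLUGIN_INLINE_STORE_U64, pc_after_block,
        qemu_plugin_insn_vaddr(last_insn) + qemu_plugin_insn_size(last_insn));
}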

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


