|
From: | Huang Tao |
Subject: | Re: [PATCH] target/riscv: Implement dynamic establishment of custom decoder |
Date: | Fri, 8 Mar 2024 17:41:25 +0800 |
User-agent: | Mozilla Thunderbird |
Hello, Richard and Daniel,Thanks to your review, the suggestions about decoder_table_size and decoder's place will be adopted in the next version of the patch.
But I would not agree that this patch is a wash or worse. On average, though, the two approaches may be comparable. However, as more and more vendors join in, this approach will have scalability issues.
For example, if you add 10 vendors, it is not fair to treat the 10th vendor with the lowest performance. Our approach works for most scenarios, which are basic RV extensions + vendor-specific extensions.
Thanks, Huang Tao 在 2024/3/8 04:35, Richard Henderson 写道:
- for (size_t i = 0; i < ARRAY_SIZE(decoders); ++i) { - if (decoders[i].guard_func(ctx->cfg_ptr) && - decoders[i].decode_func(ctx, opcode32)) { + for (size_t i = 0; i < decoder_table_size; ++i) { + if (ctx->decoder[i](ctx, opcode32)) { return;By the way, you're adding layers of pointer chasing, so I suspect you'll find all of this is a wash or worse, performance-wise.Indeed, since some of the predicates are trivial, going the other way might help: allow everything to be inlined:if (decode_insn32(...)) { return; } if (has_xthead_p(...) && decode_xthead(...)) { return; } ...Even if there are 10 entries here, so what? All of the code has to be compiled into QEMU. You're not going to somehow add truly dynamic code that you've loaded from a module.You could perhaps streamline predicates such as has_xthead_p to not test 11 variables by adding an artificial "ext_xthead_any" configuration entry that is the sum of all of those.r~
[Prev in Thread] | Current Thread | [Next in Thread] |