On 1/11/25 00:07, Paolo Bonzini wrote:
> Il ven 10 gen 2025, 10:52 Michael Clark <michael@anarch128.org> ha scritto:
>
>> a note to announce a port of the x86-mini disassembler to QEMU.
>>
>> - https://github.com/michaeljclark/qemu/tree/x86-mini
>
> I assume the huge .h files are autogenerated? If so, QEMU cannot use them
> without including the human-readable sources in the tree.
yes indeed. there is an x86_tablegen.py python script in the other repo
but it is not in the current patch. it would be somewhat easy to read
the tables from CSV files directly into arrays at the expense of several
more milliseconds during startup. the revised operand formats map
relatively strictly to enum definitions with string tables in the source
so a reader in C would not be impossible.
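
to illustrate the idea, here is a minimal sketch of such a reader. it assumes a hypothetical three-column CSV row ("mnemonic,operand0,operand1") and made-up operand kinds; the enum, the string table and the function names are placeholders, not the actual x86-mini definitions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical operand kinds; the real x86-mini enums differ. */
enum opr_kind { OPR_NONE, OPR_REG, OPR_MEM, OPR_IMM, OPR_KIND_COUNT };

/* String table indexed by enum value, mirroring the enum one-to-one. */
static const char *const opr_kind_names[OPR_KIND_COUNT] = {
    [OPR_NONE] = "none",
    [OPR_REG]  = "reg",
    [OPR_MEM]  = "mem",
    [OPR_IMM]  = "imm",
};

/* One decoded table row (hypothetical layout). */
struct opc_row {
    char mnemonic[16];
    enum opr_kind opr[2];
};

/* Map a CSV field to its enum value; returns -1 if the name is unknown. */
static int opr_kind_from_name(const char *name)
{
    for (int i = 0; i < OPR_KIND_COUNT; i++) {
        if (strcmp(name, opr_kind_names[i]) == 0) {
            return i;
        }
    }
    return -1;
}

/* Parse one "mnemonic,operand0,operand1" line; returns 0 on success. */
static int parse_row(const char *line, struct opc_row *row)
{
    char o0[16], o1[16];

    /* Each field is at most 15 characters; a trailing newline is ignored. */
    if (sscanf(line, "%15[^,],%15[^,],%15[^,\r\n]",
               row->mnemonic, o0, o1) != 3) {
        return -1;
    }

    int k0 = opr_kind_from_name(o0);
    int k1 = opr_kind_from_name(o1);
    if (k0 < 0 || k1 < 0) {
        return -1;
    }

    row->opr[0] = (enum opr_kind)k0;
    row->opr[1] = (enum opr_kind)k1;
    return 0;
}
```

a loader would call parse_row() once per line read with fgets() and append each row to an array at startup. the linear string lookup keeps the sketch short; a sorted table or a hash would reduce the startup cost if it ever mattered.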
Building the tables at compile time is fine, only leaving out the script is not.
> I can see how that might be interesting for x86 virtualization where you
> have only one target and therefore you can get rid of the capstone
> dependency. At the same time, other virtualization targets like arm64 and
> RISC-V are going to become more and more important—not less—and not having
> to maintain a disassembler ourselves as part of QEMU is also a big plus...
yes indeed. but in an ideal world the encoders and decoders are matched
pairs. I would like to work on a translator or interpreter that uses the
same codec as the disassembler
Ok, that makes sense. QEMU already has a decoder that is very table-based, though the tables are hand written. I am not wed to it though: as long as the code generators remain more or less unmodified, I would love to only keep "this is how the operands are prepared for use in the IR emitters" and make the details of x86 decoding Someone Else's Problem. So if you can kill most (certainly not all) of the tables in target/i386/tcg/decode-new.c.inc that would be interesting.
(I am sure you'd find some underspecified and/or wrong parts of the x86 spec, too :) For example many VEX classes are bollocks, plus some more examples hinted at at the top of that file).
Paolo
anyway, in fact it is just yet another disassembler at this point, but
the codec emitter works. it doesn't yet have an arch-neutral TCG-like
API and IR to drive it.