qemu-devel
Re: [RFC]: port of embedded x86-mini disassembler to QEMU


From: Michael Clark
Subject: Re: [RFC]: port of embedded x86-mini disassembler to QEMU
Date: Sat, 11 Jan 2025 02:03:16 +1300
User-agent: Mozilla Thunderbird

On 1/11/25 00:07, Paolo Bonzini wrote:
On Fri, 10 Jan 2025 at 10:52, Michael Clark <michael@anarch128.org> wrote:

a note to announce a port of the x86-mini disassembler to QEMU.

- https://github.com/michaeljclark/qemu/tree/x86-mini

I assume the huge .h files are autogenerated? If so, QEMU cannot use them
without including the human-readable sources in the tree.

yes indeed. there is an x86_tablegen.py python script in the other repo, but it is not in the current patch. it would be fairly easy to read the tables from CSV files directly into arrays, at the cost of several more milliseconds during startup. the revised operand formats map relatively strictly to enum definitions with string tables in the source, so a reader in C would not be impossible. it needs a Lisp interpreter so that it could unexec the compiled tables into a translation cache :-).
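As a rough illustration of what such a reader in C might look like, here is a minimal sketch that parses one encoding string of the form shown in the example comment further down (`evex.128.66.0f.w0 6e /r`) through a string table. The struct, enum, and function names are hypothetical and are not taken from x86-mini, and the token table covers only a handful of tokens.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not the x86-mini definitions. */
enum enc_type { ENC_LEGACY, ENC_VEX, ENC_EVEX };
enum enc_field { F_TYPE, F_LEN, F_PFX, F_MAP, F_W };

struct enc_spec {
    enum enc_type type;
    int length;            /* vector length in bits, 0 if absent */
    int prefix;            /* 0x66, 0xf2, 0xf3, or 0 if absent */
    int map;               /* 0x0f, 0x0f38, 0x0f3a, or 0 if absent */
    int w;                 /* 0 or 1, -1 if unspecified */
    unsigned char opcode;
    int has_modrm;         /* 1 when "/r" follows the opcode */
};

/* String table mapping each dotted token to a field and a value. */
static const struct { const char *name; enum enc_field field; int value; }
enc_tokens[] = {
    { "vex", F_TYPE, ENC_VEX }, { "evex", F_TYPE, ENC_EVEX },
    { "128", F_LEN, 128 }, { "256", F_LEN, 256 }, { "512", F_LEN, 512 },
    { "66", F_PFX, 0x66 }, { "f2", F_PFX, 0xf2 }, { "f3", F_PFX, 0xf3 },
    { "0f", F_MAP, 0x0f }, { "0f38", F_MAP, 0x0f38 }, { "0f3a", F_MAP, 0x0f3a },
    { "w0", F_W, 0 }, { "w1", F_W, 1 },
};

/* Look one token up in the string table and store its value. */
static int apply_token(const char *tok, struct enc_spec *out)
{
    for (size_t i = 0; i < sizeof(enc_tokens) / sizeof(enc_tokens[0]); i++) {
        if (strcmp(tok, enc_tokens[i].name) != 0) continue;
        switch (enc_tokens[i].field) {
        case F_TYPE: out->type = (enum enc_type)enc_tokens[i].value; break;
        case F_LEN:  out->length = enc_tokens[i].value; break;
        case F_PFX:  out->prefix = enc_tokens[i].value; break;
        case F_MAP:  out->map = enc_tokens[i].value; break;
        case F_W:    out->w = enc_tokens[i].value; break;
        }
        return 0;
    }
    return -1; /* unknown token */
}

/* Parse "<dotted tokens> <hex opcode> [/r]"; returns 0 on success. */
int parse_enc_spec(const char *text, struct enc_spec *out)
{
    char buf[64];
    assert(text != NULL && out != NULL);
    if (strlen(text) >= sizeof(buf)) return -1;
    strcpy(buf, text);

    memset(out, 0, sizeof(*out));
    out->w = -1;

    /* Split into up to three space-separated words first. */
    char *dotted = strtok(buf, " ");
    char *opc = strtok(NULL, " ");
    char *rest = strtok(NULL, " ");
    if (dotted == NULL || opc == NULL) return -1;

    /* Walk the dot-separated tokens of the first word by hand. */
    for (char *tok = dotted; tok != NULL; ) {
        char *dot = strchr(tok, '.');
        if (dot != NULL) *dot = '\0';
        if (apply_token(tok, out) != 0) return -1;
        tok = dot != NULL ? dot + 1 : NULL;
    }
    out->opcode = (unsigned char)strtoul(opc, NULL, 16);
    out->has_modrm = rest != NULL && strcmp(rest, "/r") == 0;
    return 0;
}
```

Calling `parse_enc_spec("evex.128.66.0f.w0 6e /r", &spec)` fills in the EVEX type, a 128-bit length, the 66 prefix, the 0f map, W0, opcode 0x6e, and the ModRM flag. A real reader would also need the operand and order columns, which this sketch leaves out.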

I can see how that might be interesting for x86 virtualization where you
have only one target and therefore you can get rid of the capstone
dependency. At the same time, other virtualization targets like arm64 and
RISC-V are going to become more and more important—not less—and not having
to maintain a disassembler ourselves as part of QEMU is also a big plus...

yes indeed. but in an ideal world the encoders and decoders are matched pairs. I would like to work on a translator or interpreter that shares one codec with its disassembler. I guess there is a point to having different codecs and disassemblers for differential fuzzing, but there is also a point to having a matched encoder and decoder. it just seems to me that if you write a machine lowering rule you should also write a machine lifting rule.
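To make the matched-pair idea concrete, here is a minimal sketch using the ModRM byte: one function lowers the three fields to a byte, the other lifts them back, and a round-trip check ties the two together. The names are hypothetical and do not come from x86-mini; only the ModRM bit layout (mod in bits 7-6, reg in bits 5-3, rm in bits 2-0) is standard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field container for illustration. */
struct modrm_fields { uint8_t mod, reg, rm; };

/* Lowering rule: fields to byte. */
uint8_t modrm_encode(struct modrm_fields f)
{
    assert(f.mod < 4 && f.reg < 8 && f.rm < 8);
    return (uint8_t)((f.mod << 6) | (f.reg << 3) | f.rm);
}

/* Lifting rule: byte to fields, the exact inverse of modrm_encode. */
struct modrm_fields modrm_decode(uint8_t byte)
{
    struct modrm_fields f;
    f.mod = (uint8_t)(byte >> 6);
    f.reg = (uint8_t)((byte >> 3) & 7);
    f.rm  = (uint8_t)(byte & 7);
    return f;
}

/* Returns 1 when encode(decode(b)) == b for every byte value. */
int modrm_round_trip_ok(void)
{
    for (unsigned b = 0; b < 256; b++) {
        struct modrm_fields f = modrm_decode((uint8_t)b);
        if (modrm_encode(f) != b) return 0;
    }
    return 1;
}
```

Because the field space is tiny, the round trip can be checked exhaustively here; for whole instructions the same property would have to be sampled or fuzzed.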

anyway, it is in fact just yet another disassembler at this point, but the codec emitter works. it doesn't yet have an arch-neutral TCG-like API and IR to drive it. but if I wrote an interpreter it would use the exact same data structures. now I need to write an assembler. e.g.

  // VMOVD xmm,xmm/m32 [rm: evex.128.66.0f.w0 6e /r]
  // asm: vmovd xmm31,DWORD PTR [r14+r13*8-8]

  x86_buffer buf;
  x86_codec codec;

  x86_buffer_init_ex(&buf, mem_addr, 0, mem_length);
  memset(&codec, 0, sizeof(codec));

  codec.opc[0] = 0x6e;
  codec.opclen = 1;
  codec.flags |= x86_cf_amd64;
  codec.flags |= x86_cf_modrm;
  codec.flags |= x86_ce_evex;
  codec.evex = x86_enc_evex(
    x86_map_0f, x86_pfx_66, x86_vex_l128, x86_vex_w0,
    /*r*/ x86_xmm31, /*x*/ x86_r13, /*b*/ x86_r14, /*v*/ 0,
    /*k*/ 0, /*brd*/ 0, /*z*/ 0
  );
  codec.modrm = x86_enc_modrm(x86_mod_disp8, x86_xmm31, x86_rm_sp_sib);
  codec.sib = x86_enc_sib(x86_scale_8, x86_r13, x86_r14);
  codec.disp32 = -2; // presumably EVEX compressed disp8 (disp8*N, N=4): -8 / 4

  x86_codec_write(&buf, codec, &nbytes);
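For readers who want to see which bytes such a codec record stands for, here is a hand-packed sketch of the same instruction that follows the EVEX prefix layout in the Intel SDM. It does not use the x86-mini API; the function name is hypothetical, and the byte values come from working through the layout by hand rather than from an assembler run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-packs: vmovd xmm31, DWORD PTR [r14+r13*8-8]
 * Writes the encoding to out[] and returns the byte count.
 * Hypothetical helper for illustration; not part of x86-mini. */
size_t encode_vmovd_example(uint8_t out[8])
{
    const unsigned reg = 31, index = 13, base = 14;
    const unsigned mm = 1;   /* opcode map 0F */
    const unsigned pp = 1;   /* mandatory prefix 66 */
    const unsigned W = 0, vvvv = 0, Vp = 0, LL = 0, z = 0, b = 0, aaa = 0;
    size_t n = 0;

    out[n++] = 0x62;                                   /* EVEX escape */
    /* P0: ~R ~X ~B ~R' 0 0 m m (register extension bits are inverted) */
    out[n++] = (uint8_t)((((~reg >> 3) & 1) << 7) |
                         (((~index >> 3) & 1) << 6) |
                         (((~base >> 3) & 1) << 5) |
                         (((~reg >> 4) & 1) << 4) | mm);
    /* P1: W ~vvvv 1 pp */
    out[n++] = (uint8_t)((W << 7) | ((~vvvv & 0xf) << 3) | (1u << 2) | pp);
    /* P2: z L'L b ~V' aaa */
    out[n++] = (uint8_t)((z << 7) | (LL << 5) | (b << 4) |
                         ((~Vp & 1) << 3) | aaa);
    out[n++] = 0x6e;                                   /* opcode */
    /* ModRM: mod=01 (disp8), reg=low 3 bits of xmm31, rm=100 (SIB) */
    out[n++] = (uint8_t)((1u << 6) | ((reg & 7) << 3) | 4);
    /* SIB: scale=8, index=low 3 bits of r13, base=low 3 bits of r14 */
    out[n++] = (uint8_t)((3u << 6) | ((index & 7) << 3) | (base & 7));
    /* disp8 is compressed by N=4 for a 32-bit scalar operand */
    out[n++] = (uint8_t)(-8 / 4);
    return n;
}
```

Under those assumptions the eight bytes come out as `62 01 7d 08 6e 7c ee fe`, which matches the `-2` displacement in the codec record above once the disp8 scaling is taken into account.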

the trunk branch in the repo has examples for a disassembler using the LLVM C API and C++ API. it would take only a tiny piece of glue to link in an LLVM-based disassembler too. but this code is faster than LLVM. :-D

Michael

Paolo


- https://github.com/michaeljclark/x86/tree/x86-mini

# x86-mini

the x86-mini library is a lightweight x86 encoder, decoder, and
disassembler that uses extensions to the Intel instruction set
metadata format to encode modern VEX/EVEX instructions and legacy
instructions using a parameterized LEX (legacy extension) format.

- metadata-driven disassembler with Intel format output.
- written in C11 for compatibility with projects written in C.
- low-level instruction encoder and decoder uses <= 32-bytes.
- python tablegen program to generate C tables from CSV metadata.
- metadata table tool to inspect operand encode and decode tables.
- carefully checked machine-readable instruction set metadata.
- support for REX/VEX/EVEX and preliminary support for REX2.

the x86-mini x86 encoder and decoder library has been written from
scratch to be modern and as simple as possible while also covering
recent additions to the Intel and AMD 64-bit instruction sets such
as the EVEX encodings for recent AVX-512 extensions and soon REX2/
EVEX encodings for Intel APX, as it is written with that in mind.

## interest to the QEMU community

- x86-mini is fast. raw decode performance is ~100-200MiB/sec.
- x86-mini is small. 5 files, ~5 KLOC or ~13 KLOC including tables.
- x86-mini is complete and includes the latest AVX-512 extensions.
- x86-mini is easy to extend and uses extended Intel format metadata.
- x86-mini is documented with detailed info on the metadata format.
- x86-mini has CLI tools for searching x86 instruction set metadata.

## technical notes

- the decoder is table-based and uses a metadata interpreter.
- the decode table is ~66KiB with a ~150KiB acceleration trie.
- there are currently 3658 opcode entries active on x86-64
   which expands to 4775 table entries due to parameterization.
- it could be made faster by vectorizing the prefix decoder and
   generating decode templates from the metadata to consteval
   metadata interpretation to eliminate some L1 D$ traffic.
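
As a generic illustration of how an acceleration trie can narrow an opcode byte sequence down to a table entry, here is a minimal nibble-keyed trie with longest-prefix lookup. This is a sketch of the general technique only; the layout, sizes, and names are assumptions and do not reflect the actual x86-mini trie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRIE_MAX_NODES 64

/* Hypothetical nibble-keyed trie: two steps per opcode byte. */
struct trie {
    int16_t child[TRIE_MAX_NODES][16]; /* 0 = no child (node 0 is the root) */
    int16_t entry[TRIE_MAX_NODES];     /* -1 = no table entry at this node */
    int count;                         /* nodes in use */
};

void trie_init(struct trie *t)
{
    memset(t, 0, sizeof(*t));
    for (int i = 0; i < TRIE_MAX_NODES; i++) t->entry[i] = -1;
    t->count = 1; /* root */
}

/* Follow one nibble from node, creating the child if needed. */
static int trie_grow(struct trie *t, int node, unsigned nibble)
{
    int next = t->child[node][nibble];
    if (next == 0) {
        if (t->count == TRIE_MAX_NODES) return -1; /* out of nodes */
        next = t->count++;
        t->child[node][nibble] = (int16_t)next;
    }
    return next;
}

/* Map an opcode byte sequence to a table entry index; 0 on success. */
int trie_insert(struct trie *t, const uint8_t *key, size_t len, int entry)
{
    int node = 0;
    for (size_t i = 0; i < len; i++) {
        node = trie_grow(t, node, key[i] >> 4);
        if (node < 0) return -1;
        node = trie_grow(t, node, key[i] & 15);
        if (node < 0) return -1;
    }
    t->entry[node] = (int16_t)entry;
    return 0;
}

/* Longest-prefix match: returns the entry index or -1, and stores
 * the number of matched bytes in *used. */
int trie_lookup(const struct trie *t, const uint8_t *code, size_t len,
                size_t *used)
{
    int node = 0, best = -1;
    *used = 0;
    for (size_t i = 0; i < len; i++) {
        int hi = t->child[node][code[i] >> 4];
        if (hi == 0) break;
        int lo = t->child[hi][code[i] & 15];
        if (lo == 0) break;
        node = lo;
        if (t->entry[node] >= 0) {
            best = t->entry[node];
            *used = i + 1;
        }
    }
    return best;
}
```

After inserting `0f 6e` as entry 1, a lookup on the bytes `0f 6e 7c` returns 1 with two bytes consumed, leaving the ModRM byte for the next decode stage.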

after cherry-picking the commit, one can test host and target
disassembly support. e.g. for an x86-64 target on an x86-64 host:

$ echo aaa | qemu-x86_64 -d in_asm,out_asm /usr/bin/openssl sha256

## caveats and limitations

- supports 32-bit and 64-bit disassembly, and theoretically 16-bit.
- designed to support 16-bit but base index formats are not done yet.
- x86-64 is exhaustively fuzz-tested against the LLVM disassembler.
- but x86-mini is new and hasn't been battle-tested in production.

if you already link with capstone then it doesn't provide many
immediate benefits. however, I think it is potentially useful as a
small embeddable disassembler to evaluate for potential inclusion.

## rationale

I worked on the QEMU disassembler while working on the QEMU RISC-V
target back in 2017/2018 and I was curious about vector support.
it seemed at the time that TCG vector support was piecemeal, plus
the old x86 disassembler seemed messy and incomplete. I also needed
an MIT-licensed disassembler to enable use in a commercial product.
basically, I was looking for a lightweight symmetric x86 instruction
encoder and decoder library in pure C with simple build requirements.
that is what prompted this initiative.

it would be nice to have an x86 disassembler building out-of-the-box
as I find QEMU's built-in tracing extremely useful and given x86 is
a popular target, a small embedded disassembler might be practical.

## summary and conclusion

at minimum, the metadata may be useful for x86 EVEX support. note
I see `tests/tcg/i386/x86.csv` in the source tree. the metadata is
also based on x86-csv but has had numerous inaccuracies fixed, and
legacy instructions have been converted to the new LEX format.
in effect the metadata has been fuzz-tested against LLVM for x86-64
and ISA coverage is on the order of 99.7%. the main branch of the
linked repo has a procedural fuzzer for metadata-based instruction
synthesis that could be useful for generating test cases for QEMU.

I am kind of throwing this over the fence, although the code is quite
self-contained and my stress and mental health is now under control.
also I have not yet run checkpatch.pl on this code. it is a preview.

x86-mini submaintainer.
Michael Clark.