Re: [Qemu-devel] AVX support for TCG


From: Richard Henderson
Subject: Re: [Qemu-devel] AVX support for TCG
Date: Sun, 30 Dec 2018 07:24:23 +1100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1

On 12/29/18 12:43 AM, Nick Renieris wrote:
>>> Do you think this could work as a GSoC project? I'm potentially
>>> interested in working on it this summer.
> 
>> Could be.  My first guess is something like 4 months work for this.
> 
> Four months full-time? If so I would say it's not viable for a GSoC
> project (it's 3 months), I've done the 12-hours-a-day-crunch thing for
> a week or so in GSoC 2017 and it was _not_ fun.
> Also, I hope you meant four months for me, not for you - I'm
> completely new to the QEMU codebase. I expect it will take me weeks
> just to understand x86's 'translate.c' (who thought it'd be a good
> idea to put all this stuff in _one_ file?).

I did have a beginner in mind when guessing 4 months.  Don't take that as a
fully specced-out answer, but it may well be that full avx2 support cannot be
done within the 3 months of gsoc.  I would certainly expect avx512 to take even
longer.

> Another question, are there existing discussions about this
> refactoring effort or specifically AVX? I asked a similar question on
> IRC a few days ago and got no answers.

Not that I recall.  I have some code at

https://github.com/rth7680/qemu/commits/i386-avx

that attempts to remove the sse_op_table(s).  However, it also splits up the
sse operations into units of uint64_t, which seemed sort of reasonable at the
time, considering that a lot of sse is 2*mmx.

But in the intervening 2.5 years since I worked on that branch, we have learned
that calls to helpers dominate.  It's better to have a single call that does 4x
the work than 4 separate calls.
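
To illustrate the point about call overhead, here is a minimal standalone
sketch in plain C.  It is not QEMU code: helper_paddl_u64, helper_paddl_vec
and the helper_calls counter are hypothetical names made up for this example.
It contrasts a helper that handles one uint64_t unit per call with a helper
that walks the whole vector in a single call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustration only: counts how many out-of-line helper calls were made. */
static unsigned helper_calls;

/*
 * Hypothetical per-unit helper: adds the two packed 32-bit lanes held in
 * one uint64_t.  A 256-bit operation needs four calls to this.
 */
static uint64_t helper_paddl_u64(uint64_t a, uint64_t b)
{
    helper_calls++;
    uint64_t lo = (uint32_t)((uint32_t)a + (uint32_t)b);
    uint64_t hi = (uint32_t)((uint32_t)(a >> 32) + (uint32_t)(b >> 32));
    return lo | (hi << 32);
}

/*
 * Hypothetical whole-vector helper: same packed 32-bit add, but one call
 * covers all oprsz bytes of the operands.
 */
static void helper_paddl_vec(void *d, const void *a, const void *b,
                             size_t oprsz)
{
    helper_calls++;
    assert(oprsz % sizeof(uint32_t) == 0);
    for (size_t i = 0; i < oprsz; i += sizeof(uint32_t)) {
        uint32_t x, y, r;
        memcpy(&x, (const char *)a + i, sizeof(x));
        memcpy(&y, (const char *)b + i, sizeof(y));
        r = x + y;  /* unsigned arithmetic wraps, as a packed add does */
        memcpy((char *)d + i, &r, sizeof(r));
    }
}
```

For a 32-byte operand, the first form costs four calls and the second costs
one, while both produce the same bytes.  The real saving in TCG depends on
what each call has to spill and reload, which this sketch does not model.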

The tcg-op-gvec.h infrastructure allows for the different modes that avx+mmx
allow:

(1) 64-bit operations,
(2) 128-bit operations, modifying only the low 128 bits,
(3) 128-bit operations, zeroing bits beyond the first 128,
(4) N*128-bit operations, zeroing bits beyond the first N*128.

so we should not need a great proliferation of helper functions, merely a
re-organization of what we have now.
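
As a rough sketch of how the four modes above could collapse onto a single
routine, the standalone C model below takes an operation size and a maximum
size, in the spirit of the oprsz/maxsz pair used by the gvec expanders:
bytes below oprsz are computed, bytes from oprsz up to maxsz are zeroed, and
anything beyond maxsz is left alone.  The function names, the 32-byte
register size and the mapping of each mode onto a size pair are assumptions
made for illustration, not code from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { REG_BYTES = 32 };  /* assumed: a 256-bit (ymm-sized) register */

/*
 * Hypothetical model of one vector operation (packed 32-bit add).
 * Computes the first oprsz bytes, zeroes [oprsz, maxsz), and does not
 * touch bytes at or beyond maxsz.
 */
static void model_add32(uint8_t *d, const uint8_t *a, const uint8_t *b,
                        size_t oprsz, size_t maxsz)
{
    assert(oprsz % sizeof(uint32_t) == 0);
    assert(oprsz <= maxsz && maxsz <= REG_BYTES);
    for (size_t i = 0; i < oprsz; i += sizeof(uint32_t)) {
        uint32_t x, y, r;
        memcpy(&x, a + i, sizeof(x));
        memcpy(&y, b + i, sizeof(y));
        r = x + y;
        memcpy(d + i, &r, sizeof(r));
    }
    memset(d + oprsz, 0, maxsz - oprsz);
}

/* (1) 64-bit operation (mmx-style). */
static void mmx_add32(uint8_t *d, const uint8_t *a, const uint8_t *b)
{
    model_add32(d, a, b, 8, 8);
}

/* (2) 128-bit operation, modifying only the low 128 bits. */
static void sse_add32(uint8_t *d, const uint8_t *a, const uint8_t *b)
{
    model_add32(d, a, b, 16, 16);
}

/* (3) 128-bit operation, zeroing bits beyond the first 128. */
static void avx128_add32(uint8_t *d, const uint8_t *a, const uint8_t *b)
{
    model_add32(d, a, b, 16, REG_BYTES);
}

/* (4) N*128-bit operation, zeroing bits beyond the first N*128. */
static void avx_add32(uint8_t *d, const uint8_t *a, const uint8_t *b,
                      unsigned n)
{
    model_add32(d, a, b, (size_t)n * 16, REG_BYTES);
}
```

All four wrappers share one loop body and differ only in the two sizes they
pass, which is the sense in which no new family of helpers is needed.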


r~


