qemu-ppc


Fwd: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC


From: G 3
Subject: Fwd: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
Date: Wed, 4 Mar 2020 13:43:41 -0500



---------- Forwarded message ---------
From: G 3 <address@hidden>
Date: Wed, Mar 4, 2020 at 1:35 PM
Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
To: BALATON Zoltan <address@hidden>




On Mon, Mar 2, 2020 at 6:16 PM BALATON Zoltan <address@hidden> wrote:
On Mon, 2 Mar 2020, Richard Henderson wrote:
> On 3/2/20 3:42 AM, BALATON Zoltan wrote:
>>> The "hardfloat" option works (with other targets) only with IEEE 754
>>> accumulative exceptions, when the most common of those exceptions, inexact, has
>>> already been raised.  And thus need not be raised a second time.
>>
>> Why exactly is it done that way? What are the differences between IEEE FP
>> implementations that prevent using hardfloat most of the time instead of only
>> using it in some (although supposedly common) special cases?
>
> While it is possible to read the host's ieee exception word after the hardfloat
> operation, there are two reasons that is undesirable:
>
> (1) It is *slow*.  So slow that it's faster to run the softfloat code instead.
> I thought it would be easier to find the benchmark numbers that Emilio
> generated when this was tested, but I can't find them.
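
To make the condition described above concrete, here is a rough sketch in plain C of gating a host-FPU fast path on the guest's sticky inexact flag. This is not QEMU code: guest_fp_status, slow_add and guest_add are hypothetical names, and the slow path is only a stand-in that detects rounding with Knuth's TwoSum (assuming round-to-nearest, finite inputs and no overflow) rather than a softfloat implementation. The real fast path has further conditions that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guest FP state: only the sticky (accumulative) inexact flag. */
typedef struct {
    bool inexact;
} guest_fp_status;

/*
 * Stand-in for the slow path (NOT softfloat): add two doubles and report
 * whether the result had to be rounded, using the TwoSum error term.
 * Assumes round-to-nearest, finite inputs and no overflow.
 * volatile keeps intermediates in double precision on hosts with
 * excess-precision registers.
 */
static double slow_add(double a, double b, bool *inexact)
{
    volatile double s = a + b;
    volatile double bv = s - a;                 /* part of s contributed by b */
    volatile double av = s - bv;                /* part of s contributed by a */
    volatile double err = (a - av) + (b - bv);  /* exact rounding error */

    *inexact = (err != 0.0);
    return s;
}

/*
 * Use the host FPU directly only when the guest's inexact flag is already
 * set, so a rounded result cannot change the guest-visible flags.
 * Otherwise take the slow path, which tracks the flag.
 */
static double guest_add(double a, double b, guest_fp_status *st,
                        bool *used_host_fpu)
{
    assert(st != NULL && used_host_fpu != NULL);

    if (st->inexact) {
        *used_host_fpu = true;
        return a + b;               /* fast path: no flag bookkeeping */
    }

    bool inexact = false;
    double r = slow_add(a, b, &inexact);

    st->inexact = st->inexact || inexact;   /* flags only accumulate */
    *used_host_fpu = false;
    return r;
}
```

In this sketch every addition goes through the slow path until some result is actually rounded; after that the flag stays set and later additions use the host FPU. A guest that clears its status flags before each operation would keep landing on the slow path.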

I remember those benchmarks too, and the paper Alex referred to confirmed
the same thing. I've also found that enabling hardfloat for PPC without
doing anything else is slightly slower (on a recent CPU; on older CPUs it
could be even slower). Interestingly, however, it does give a speedup for
vector instructions (maybe because they don't clear flags between each
sub-operation). Does that mean these vector instruction helpers are also
buggy regarding exceptions?

I am really intrigued by these vector instructions. Apple was big on using them back in the day, so programs like QuickTime and iTunes definitely use them. I'm not sure whether PowerPC's AltiVec vector instructions already map to host vector instructions, but if they don't, mapping them would give us a huge speedup in certain places. Does anyone know if this has already been done in QEMU?
