qemu-ppc

Re: R: R: About hardfloat in ppc


From: BALATON Zoltan
Subject: Re: R: R: About hardfloat in ppc
Date: Thu, 30 Apr 2020 17:35:55 +0200 (CEST)
User-agent: Alpine 2.22 (BSF 395 2020-01-19)

On Thu, 30 Apr 2020, 罗勇刚(Yonggang Luo) wrote:
> I propose a new way of computing the float flags.
> We keep a cache of floating-point operations:
>
> typedef struct FpRecord {
>     uint8_t op;
>     float32 A;
>     float32 B;
> } FpRecord;
> FpRecord fp_cache[1024];
> int fp_cache_length;
> uint32_t fp_exceptions;
>
> 1. Each new fp operation is pushed to fp_cache.
> 2. When fp_exceptions is read, we re-compute fp_exceptions by
>    re-running the FpRecord sequence and reset fp_cache_length.
> 3. When fp_exceptions is cleared, we set fp_cache_length to 0 and
>    clear fp_exceptions.
> 4. When fp_cache is full, we re-compute fp_exceptions by
>    re-running the FpRecord sequence.
>
> Would this be a general method to use hard-float?
> The time consumed should be 2 * hard_float.
> Considering that reads of fp_exceptions are rare, the amortized time
> complexity would be 1 * hard_float.

It's hard to guess what the hit rate of such a cache would be, and if it's low then managing the cache is probably more expensive than running with softfloat. So to evaluate any proposed patch we also need some benchmarks to experiment with; otherwise we're just guessing whether the results are good or not. Are there existing tests and benchmarks that we can use? Alex mentioned fp-bench, I think, and for evaluating the correctness of the FP implementation I've seen this other conversation:

https://lists.nongnu.org/archive/html/qemu-devel/2020-04/msg05107.html
https://lists.nongnu.org/archive/html/qemu-devel/2020-04/msg05126.html

Is that something we can use for PPC as well to check the correctness?

So I think that before implementing any of the potential solutions that came up in this brainstorming, the first step would be to get and compile (or write, if not available) some tests and benchmarks:

1. tests of host behaviour for the inexact flag, to compare across different archs
2. some FP tests that can be used to compare results between QEMU and a real CPU to check correctness of emulation (if these check for inexact differences then they could be used instead of 1.)
3. some benchmarks to evaluate QEMU performance (these could be the same as the FP tests or some real world FP heavy applications).

Then we can see if the proposed solution is faster and still correct.
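
For the performance side, even a trivial timing loop run in the guest would give a first number to compare before and after a patch. The sketch below is an assumption of what such a micro-benchmark might look like, not an existing test: fp_kernel() and time_kernel() are hypothetical names and the kernel is an arbitrary chain of dependent multiply-adds.

```c
#include <time.h>

/* Hypothetical FP-heavy kernel: n dependent multiply-add steps. */
static double fp_kernel(long n)
{
    volatile double acc = 0.0;  /* volatile so the loop is not optimised away */

    for (long i = 0; i < n; i++) {
        acc = acc * 0.999999 + 1.0;
    }
    return acc;
}

/* Run the kernel and return the CPU time it took in seconds. */
static double time_kernel(long n, double *result)
{
    clock_t start = clock();

    *result = fp_kernel(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Returning the result along with the time lets the same run be used to check that the emulated output still matches a real CPU.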

Regards,
BALATON Zoltan
