qemu-ppc
Re: About hardfloat in ppc


From: Yonggang Luo
Subject: Re: About hardfloat in ppc
Date: Fri, 1 May 2020 10:21:10 +0800

That's what I suggested.
We keep a cache of floating-point operations:
typedef struct FpRecord {
  uint8_t op;
  float32 A;
  float32 B;
}  FpRecord;
FpRecord fp_cache[1024];
int fp_cache_length;
uint32_t fp_exceptions;

1. Each new FP operation is pushed onto fp_cache.
2. When fp_exceptions is read, we recompute it by re-running the cached
   FpRecord sequence, then reset fp_cache_length to 0.
3. When fp_exceptions is cleared, we set fp_cache_length to 0 and clear
   fp_exceptions.
4. When fp_cache is full, we recompute fp_exceptions by re-running the
   cached FpRecord sequence.
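
The four steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not QEMU code: the fp_cache_* helpers, the op codes and FP_FLAG_INEXACT are hypothetical names, host float stands in for float32, and fp_slow_flags() is a toy stand-in for the softfloat replay that only detects inexact results for finite inputs whose result does not overflow. Resetting the length when the cache is full is also an assumption, since step 4 does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical op codes and flag bit; these are not QEMU definitions. */
enum { FP_OP_ADD, FP_OP_MUL };
#define FP_FLAG_INEXACT 0x1u
#define FP_CACHE_SIZE   1024

typedef struct FpRecord {
    uint8_t op;
    float a;                /* host float stands in for float32 */
    float b;
} FpRecord;

typedef struct FpCache {
    FpRecord rec[FP_CACHE_SIZE];
    int length;
    uint32_t exceptions;
} FpCache;

/*
 * Toy slow path standing in for a softfloat replay. It reports only
 * "inexact", and only for finite inputs whose result does not overflow.
 */
static uint32_t fp_slow_flags(const FpRecord *r)
{
    double x = r->a, y = r->b, res;
    int exact_in_double = 1;

    if (r->op == FP_OP_MUL) {
        res = x * y;        /* a 24x24-bit product is exact in double */
    } else {
        res = x + y;
        /* TwoSum: err is the rounding error of the double addition. */
        double bv = res - x;
        double err = (x - (res - bv)) + (y - bv);
        exact_in_double = (err == 0.0);
    }
    /* Inexact if the exact result does not fit in a float. */
    if (!exact_in_double || (double)(float)res != res) {
        return FP_FLAG_INEXACT;
    }
    return 0;
}

/* Replay every cached record, accumulate the flags, empty the cache. */
static void fp_cache_flush(FpCache *c)
{
    for (int i = 0; i < c->length; i++) {
        c->exceptions |= fp_slow_flags(&c->rec[i]);
    }
    c->length = 0;
}

/* Steps 1 and 4: record the operation, then compute it on the host FPU. */
static float fp_cache_exec(FpCache *c, uint8_t op, float a, float b)
{
    if (c->length == FP_CACHE_SIZE) {
        fp_cache_flush(c);  /* step 4: cache full, recompute the flags */
    }
    assert(c->length < FP_CACHE_SIZE);
    c->rec[c->length++] = (FpRecord){ op, a, b };
    return op == FP_OP_MUL ? a * b : a + b;
}

/* Step 2: a read forces the replay. */
static uint32_t fp_cache_read_exceptions(FpCache *c)
{
    fp_cache_flush(c);
    return c->exceptions;
}

/* Step 3: a clear drops pending records without replaying them. */
static void fp_cache_clear_exceptions(FpCache *c)
{
    c->length = 0;
    c->exceptions = 0;
}
```

In this sketch the fast path pays only for one store into the array per operation, and the replay cost is paid only when the flags are read or the cache fills up.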

Now the key point is how to track reads and writes of the FPSCR register.
The current code is:
    cpu_fpscr = tcg_global_mem_new(cpu_env,
                                   offsetof(CPUPPCState, fpscr), "fpscr");

On Fri, May 1, 2020 at 9:59 AM Programmingkid <address@hidden> wrote:

> On Apr 30, 2020, at 12:34 PM, Dino Papararo <address@hidden> wrote:
>
> Maybe the fastest way to implement hardfloat for ppc would be to use it by default until some FPU instruction reads the FPSCR register.
> At that point we probably want to check for an exception, so QEMU could go back to the last FPU instruction executed and re-execute it in softfloat, this time taking care of the FPSCR flags, then continue in hardfloat until another instruction reads the FPSCR register, and so on.
>
> Dino

That sounds like a good idea.

> -----Original Message-----
> From: BALATON Zoltan <address@hidden>
> Sent: Thursday, 30 April 2020 17:36
> To: 罗勇刚(Yonggang Luo) <address@hidden>
> Cc: Richard Henderson <address@hidden>; Dino Papararo <address@hidden>; address@hidden; Programmingkid <address@hidden>; address@hidden; Howard Spoelstra <address@hidden>; Alex Bennée <address@hidden>
> Subject: Re: R: R: About hardfloat in ppc
>
> On Thu, 30 Apr 2020, 罗勇刚(Yonggang Luo) wrote:
>> I propose a new way of computing the float flags. We keep a cache of
>> floating-point operations:
>> typedef struct FpRecord {
>>   uint8_t op;
>>   float32 A;
>>   float32 B;
>> }  FpRecord;
>> FpRecord fp_cache[1024];
>> int fp_cache_length;
>> uint32_t fp_exceptions;
>>
>> 1. Each new FP operation is pushed onto fp_cache.
>> 2. When fp_exceptions is read, we recompute it by re-running the
>>    cached FpRecord sequence, then reset fp_cache_length to 0.
>> 3. When fp_exceptions is cleared, we set fp_cache_length to 0 and
>>    clear fp_exceptions.
>> 4. When fp_cache is full, we recompute fp_exceptions by re-running
>>    the cached FpRecord sequence.
>>
>> Would this be a general method for using hardfloat?
>> The time consumed would be 2 * hard_float.
>> Considering that reads of fp_exceptions are rare, the amortized time
>> cost would be 1 * hard_float.
>
> It's hard to guess what the hit rate of such cache would be and if it's low then managing the cache is probably more expensive than running with softfloat. So to evaluate any proposed patch we also need some benchmarks which we can experiment with to tell if the results are good or not otherwise we're just guessing. Are there some existing tests and benchmarks that we can use? Alex mentioned fp-bench I think and to evaluate the correctness of the FP implementation I've seen this other
> conversation:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2020-04/msg05107.html
> https://lists.nongnu.org/archive/html/qemu-devel/2020-04/msg05126.html
>
> Is that something we can use for PPC as well to check the correctness?
>
> So I think before implementing any potential solution that came up in this brainstorming the first step would be to get and compile (or write if not
> available) some tests and benchmarks:
>
> 1. testing host behaviour for inexact and compare that for different archs
> 2. some FP tests that can be used to compare results with QEMU and real CPU to check correctness of emulation (if these check for inexact differences then could be used instead of 1.)
> 3. some benchmarks to evaluate QEMU performance (these could be same as FP tests or some real world FP heavy applications).
>
> Then we can see if the proposed solution is faster and still correct.
>
> Regards,
> BALATON Zoltan



--
Yours sincerely,
Yonggang Luo (罗勇刚)
