Re: Optimizing jit_divi_u()
From: Paul Cercueil
Subject: Re: Optimizing jit_divi_u()
Date: Fri, 25 Aug 2023 11:02:21 +0200
Hi Paulo,
I saw you added jit_hmulr(), thanks!
On Monday, 21 August 2023 at 11:03 -0300, Paulo César Pereira de Andrade
wrote:
> On Fri, 18 Aug 2023 at 12:07, Paulo César Pereira de Andrade
> <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
> >
> > On Fri, 18 Aug 2023 at 09:41, Paul Cercueil
> > <paul@crapouillou.net> wrote:
> > >
> > > Hi Paulo,
> >
> > Hi Paul,
> >
> > > I'm implementing an algorithm in my JIT to handle divisions by known
> > > immediate values:
> > > https://ridiculousfish.com/blog/posts/labor-of-division-episode-i.html
> > >
> > > This could totally be a default implementation for jit_divi_u(), by
> > > the way.
>
> We should create a single pass for this kind of optimization.
> Currently there are a few architecture-specific ones that convert
> division into shifts, plus some other minor optimizations.
I believe we could implement it as a fallback, and also handle special
values in there. All architectures that can't do better would then use
this fallback.

Note that signed division should be possible as well; I know that
"libdivide" supports it.
Cheers,
-Paul
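
For readers following the thread, here is a minimal sketch of the kind of
fallback discussed above: unsigned 32-bit division by a known constant
using only a high multiply, a subtract, an add and two shifts. It follows
the classic round-up formulation (Granlund and Montgomery), which is one of
the approaches covered in the linked article; it is not taken from
Lightning, and the names udiv32_t, udiv32_prepare(), udiv32_apply() and
mulhi_u32() are hypothetical, chosen for illustration only.

```c
#include <stdint.h>

/* Hypothetical container for the precomputed constants of one divisor. */
typedef struct {
    uint32_t magic; /* multiplier m' */
    unsigned sh1;   /* first shift: min(l, 1) */
    unsigned sh2;   /* second shift: max(l - 1, 0) */
} udiv32_t;

/* High 32 bits of the 64-bit product a * b ("64 <- 32 * 32"). */
static uint32_t mulhi_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Done once, when the immediate divisor is known. Requires d != 0. */
static udiv32_t udiv32_prepare(uint32_t d)
{
    udiv32_t c;
    unsigned l = 0;

    /* l = ceil(log2(d)): smallest l with 2^l >= d (0 <= l <= 32). */
    while (((uint64_t)1 << l) < d)
        l++;

    /* m' = floor(2^32 * (2^l - d) / d) + 1; always fits in 32 bits. */
    uint64_t pow2l = (uint64_t)1 << l;
    c.magic = (uint32_t)((((pow2l - d) << 32) / d) + 1);
    c.sh1 = l < 1 ? l : 1;
    c.sh2 = l > 0 ? l - 1 : 0;
    return c;
}

/* The sequence a JIT would emit in place of the division: n / d. */
static uint32_t udiv32_apply(uint32_t n, udiv32_t c)
{
    uint32_t t = mulhi_u32(c.magic, n);        /* t <= n, so n - t is safe */
    return (t + ((n - t) >> c.sh1)) >> c.sh2;
}
```

For d = 7 the prepare step gives magic = 0x24924925, sh1 = 1 and sh2 = 2.
Powers of two and d = 1 also come out right with this formulation, although
a real fallback would presumably special-case them into a plain shift or
move.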
>
> In gcc the computation of the constant is done in
> gcc/expmed.c*:choose_multiplier()
>
> > > It does require a "64 <- 32 * 32" multiplication and needs the high
> > > 32 bits. Since we already have jit_mulr() to retrieve the low bits,
> > > I wonder if it would be a good idea to have a jit_mulhr() to only
> > > get the high bits.
>
> For the x86 port it is trivial, and most other CPUs have at least an
> unsigned variant. The ports that do not have any "mulh" variant are
> not widely used, and a fallback can be added for completeness of
> the Lightning instruction set.
>
> Thanks!
> Paulo
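
As an illustration of the "mulh" fallback mentioned above for ports without
a high-multiply instruction, the high 32 bits of an unsigned 32 x 32
product can be assembled from four 16 x 16 partial products, each of which
only needs an ordinary low-word multiply. This is a generic sketch of the
well-known schoolbook technique, not Lightning's actual fallback; the name
mulhi_u32_portable() is hypothetical.

```c
#include <stdint.h>

/* High 32 bits of a * b, using only 32-bit arithmetic. */
static uint32_t mulhi_u32_portable(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four partial products; each fits in 32 bits. */
    uint32_t lo_lo = a_lo * b_lo;
    uint32_t hi_lo = a_hi * b_lo;
    uint32_t lo_hi = a_lo * b_hi;
    uint32_t hi_hi = a_hi * b_hi;

    /* Sum of the middle 16-bit columns. The maximum value is
     * 0xFFFE + 0xFFFF + 0xFFFE0001 = 0xFFFFFFFE, so it cannot overflow. */
    uint32_t cross = (lo_lo >> 16) + (hi_lo & 0xFFFFu) + lo_hi;

    /* Carry the upper halves into the high word. */
    return hi_hi + (hi_lo >> 16) + (cross >> 16);
}
```

A signed variant would need an extra correction step, which this sketch
leaves out.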