grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.


From: Daniel Kiper
Subject: Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.
Date: Mon, 29 Jan 2018 17:42:26 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Jan 19, 2018 at 04:10:01PM -0500, Peter Jones wrote:
> On Thu, Jan 18, 2018 at 11:52:53PM +0100, Daniel Kiper wrote:
> > On Tue, Jan 16, 2018 at 01:16:17PM -0500, Peter Jones wrote:
> > > On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> > > using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> > > to do so (as measured with the stopwatch on my phone), with a tsc delta
> > > of 0x1cd1c85300, or around 125 billion cycles.
> > >
> > > If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> > > to wait for 5-200us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
> > > million cycles, or more or less instantly.
> > >
> > > Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> > > and that's obviously an invalid return.  I've added a check for that and
> > > 0 so we don't bother waiting for the test if what we're seeing is dead
> > > pins with no response at all.
> > >
> > > Signed-off-by: Peter Jones <address@hidden>
> > > ---
> > >  grub-core/kern/i386/tsc_pmtimer.c | 43 
> > > ++++++++++++++++++++++++++++++---------
> > >  1 file changed, 33 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/grub-core/kern/i386/tsc_pmtimer.c 
> > > b/grub-core/kern/i386/tsc_pmtimer.c
> > > index c9c36169978..609402b8376 100644
> > > --- a/grub-core/kern/i386/tsc_pmtimer.c
> > > +++ b/grub-core/kern/i386/tsc_pmtimer.c
> > > @@ -38,30 +38,53 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
> > >    grub_uint64_t start_tsc;
> > >    grub_uint64_t end_tsc;
> > >    int num_iter = 0;
> > > +  int bad_reads = 0;
> > >
> > > -  start = grub_inl (pmtimer) & 0xffffff;
> > > +  start = grub_inl (pmtimer) & 0x3fff;
> >
> > I am not sure why you are changing this to 0x3fff...
>
> Upon re-reading, I am not either.  (As you see I wrote this patch 2+
> months ago while working on something else, so the memory has already
> faded.)

Yeah, I know how it works...

> Obviously I need to make a version 2 of this patch to fix some issues.

Great!

> > >    last = start;
> > >    end = start + num_pm_ticks;
> > >    start_tsc = grub_get_tsc ();
> > >    while (1)
> > >      {
> > > -      cur = grub_inl (pmtimer) & 0xffffff;
> >
> > What about 24-bit timers? I would leave this here...
>
> I was blisfully unaware of 24-bit timers.  I'll put that back and add a
> comment for future readers of the code, and check for 0xffffff below.

Thanks!

> > > +      cur = grub_inl (pmtimer);
> > > +
> > > +      /* If we get 10 reads in a row that are obviously dead pins, 
> > > there's no
> > > +  reason to do this thousands of times.
> > > +       */
> > > +      if (cur == 0xffffffff || cur == 0)
> >
> > ...and here I would check for 0xffffff and 0.
>
> Sure.
>
> >
> > > + {
> > > +   bad_reads++;
> > > +   grub_dprintf ("pmtimer", "cur: 0x%08x bad_reads: %d\n", cur, 
> > > bad_reads);
> > > +
> > > +   if (bad_reads == 10)
> > > +     return 0;
> > > + }
> > > +      else if (bad_reads)
> > > + bad_reads = 0;
> >
> > Do we really need to reset this?
>
> I mean, it's technically correct, but I don't think you're going to see
> 0 and 0xffffff as valid values often enough to actually care.
>
> > > +      cur &= 0x3fff;
> > > +
> > >        if (cur < last)
> > > - cur |= 0x1000000;
> > > + cur |= 0x4000;
> > >        num_iter++;
>
> You didn't spot this, but it's wrong for exactly the same reason the
> initialization of start is.  I'm pretty sure when I did this part, I was
> letting the comment below tell me what was going on, and completely
> missed that the comment doesn't match the conditional at all.
>
> > >        if (cur >= end)
> > >   {
> > >     end_tsc = grub_get_tsc ();
> > > +   grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n",
> > > +                 end_tsc - start_tsc);
> > >     return end_tsc - start_tsc;
> > >   }
> > > -      /* Check for broken PM timer.
> > > -  50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz)
> > > -  if after this time we still don't have 1 ms on pmtimer, then
> > > -  pmtimer is broken.
> > > +      /* Check for broken PM timer.  5000 TSCs is between 5us (10GHz) and
> >                                                              ^^^ 500ns?
> >
> > > +  200us (250 MHz).  If after this time we still don't have 1us on
> >          ^^^^^ 20us?
>
> You know, I stared at this for a surprising amount of time when writing
> it trying to figure out why these numbers seemed weird.  While I have
> the diff in front of me rather than the code as it is now, I see that
> it's because I was starting with the comment that was previously there,
> which had 50M instead of 5M for the TSC count.  So I wound up dividing
> these by 10k instead of 1k.  Woops.  Will fix in v2.
>
> >
> > > +  pmtimer, then pmtimer is broken.
> > >         */
> > > -      if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 
> > > 5000000) {
> > > - return 0;
> > > -      }
> > > +      end_tsc = grub_get_tsc();
> > > +      if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000)
> >
> > Why 0x3fff here which means, AIUI, 4000 iterations?
>
> Er... 16384, surely?  But basically I just wanted to scale down the

Yep, or I simply forgot 0x before 4000...

> limit by approximately the same order as I scaled down the TSC delta,
> and I wanted to leave the code basically working the same way (not for
> any good reason), so I still wanted a number with only low-order bits
> set.  TSC is scaled down by 1000, so I scaled the PM tick iteration down
> by 1024.  I could make it twice as wide of a window instead, if you're
> worried about that particular race, but I don't think it's going to
> matter.
>
> If our loop takes 1 TSC tick per iteration, which is quite impluasibly
> fast, and we're on a 10GHz machine, then we should normally see pmtimer
> change by 3580 in ~2800 iterations.  If it takes 16k iterations (again,

I think that number of iterations is wrong here. Let's see...

f_pm = n_pm / t_pm, n_pm = 3580, f_pm = 3.58MHz => t_pm = 3580 / 3.58MHz
t_pm = 0,001s = 1ms => num_iter = 1ms * 10GHz = 10.000.000

Then below calculations are wrong too...

> with perfectly executing code that takes no time), the pmtimer is
> running at 625KHz instead of 3.58MHz.  If we're on a 250MHz machine,
> that number is ~70 iterations.  If we get to 16000 iterations without
> seeing pmtimer change by 3580 then pmtimer is either broken or we're on
> a 60GHz+ machine.
>
> That said the logic here did not match the comment and it still doesn't;
> it should be comparing cur and start, not the number of iterations.  On
> the next version of the patch I'm going to do something like:
>
>         if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) ||

Taking into account above I am not sure why you multiply by 8 here...

>             (cur == start && end_tsc - start_tsc > 5000))
>
> So it'll have a fairly widely ranged total limit but also actually check if
> pmtimer results have changed at all.
>
> Tentative patch below, but I haven't had a chance to test it yet:

In general I am not against the idea itself but I think that some
calculations should be fixed here and there.

> From 97e59bbe000961db5f823c4df074835d1165bd9d Mon Sep 17 00:00:00 2001
> From: Peter Jones <address@hidden>
> Date: Tue, 7 Nov 2017 17:12:17 -0500
> Subject: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.
>
> On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> to do so (as measured with the stopwatch on my phone), with a tsc delta
> of 0x1cd1c85300, or around 125 billion cycles.
>
> If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> to wait for 500ns-20us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
> million cycles, or more or less instantly.
>
> Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> and that's obviously an invalid return.  I've added a check for that and
> 0 so we don't bother waiting for the test if what we're seeing is dead
> pins with no response at all.
> ---
>  grub-core/kern/i386/tsc_pmtimer.c | 56 
> ++++++++++++++++++++++++++++++++-------
>  1 file changed, 46 insertions(+), 10 deletions(-)
>
> diff --git a/grub-core/kern/i386/tsc_pmtimer.c 
> b/grub-core/kern/i386/tsc_pmtimer.c
> index c9c36169978..b09c00316df 100644
> --- a/grub-core/kern/i386/tsc_pmtimer.c
> +++ b/grub-core/kern/i386/tsc_pmtimer.c
> @@ -37,8 +37,14 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
>    grub_uint32_t cur, end;
>    grub_uint64_t start_tsc;
>    grub_uint64_t end_tsc;
> -  int num_iter = 0;
> +  unsigned int num_iter = 0;
> +  int bad_reads = 0;
>
> +  /* Some timers are 24-bit and some are 32-bit, but it doesn't make much

Please leave empty line after "/*".

> +   * difference to us.  Caring which one we have isn't really worth it since
> +   * the low-order digits will give us enough data to calibrate TSC.  So just
> +   * mask the top-order byte off.
> +   */
>    start = grub_inl (pmtimer) & 0xffffff;
>    last = start;
>    end = start + num_pm_ticks;
> @@ -46,22 +52,52 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
>    while (1)
>      {
>        cur = grub_inl (pmtimer) & 0xffffff;
> +
> +      /* If we get 10 reads in a row that are obviously dead pins, there's no
> +      reason to do this thousands of times.

Please be consistent with the comments format. I prefer one given above.

> +       */
> +      if (cur == 0xffffff || cur == 0)
> +     {
> +       bad_reads++;
> +       grub_dprintf ("pmtimer", "pmtimer value: 0x%08x bad_reads: %d\n",
> +                     cur, bad_reads);
> +
> +       if (bad_reads == 10)
> +         return 0;
> +     }
> +      else if (bad_reads)
> +     bad_reads = 0;
> +
> +      end_tsc = grub_get_tsc();
> +
>        if (cur < last)
>       cur |= 0x1000000;
> -      num_iter++;
>        if (cur >= end)
>       {
> -       end_tsc = grub_get_tsc ();
> +       grub_dprintf ("pmtimer", "tsc delta is 0x%016llx\n",
> +                     end_tsc - start_tsc);
>         return end_tsc - start_tsc;
>       }
> -      /* Check for broken PM timer.
> -      50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz)
> -      if after this time we still don't have 1 ms on pmtimer, then
> -      pmtimer is broken.
> +
> +      /* Check for broken PM timer.  5000 TSCs is between 500ns (10GHz) and
> +      20us (250 MHz).  If after this time we still don't have 1us on
> +      pmtimer, then pmtimer is broken.
> +
> +      Likewise, if our code is perfectly efficient and introduces no delays
> +      whatsoever, on a 10GHz system we should see a TSC delta of 3580 in
> +      ~2800 iterations.  On a 250MHz machine that should be ~70 iterations.

Incorrect calculations?

> +      So there's a hard limit here at 8x our desired pm timer delta, because
> +      if we get that far, we're either on a 60GHz+ machine or we're never
> +      hitting our threshold.

And please use same comment format as in first one.

Thanks,

Daniel



reply via email to

[Prev in Thread] Current Thread [Next in Thread]