bug-gperf
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [bug-gperf] shrinking the asso_values array


From: Bruno Haible
Subject: Re: [bug-gperf] shrinking the asso_values array
Date: Thu, 05 Sep 2019 02:13:18 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-159-generic; KDE/5.18.0; x86_64; ; )

Hi Mark,

> FOURTH SUGGESTION ( MEDIUM DIFFICULTY )
> 
>   I was wondering if it would be worthwhile to use an "if" statement
> and a shorter static array asso_values in the "hash" function in some
> cases. The reason would be that the array is 256 bytes and loading
> different parts of the array could use up to 4 cache lines. Loading
> caches lines is very expensive from the point of view of the CPU
> (sometimes 100 cycles) as opposed to an if statement (which is a lot less,
> but is blocking.
> 
>   I looked at permut2.gperf in the examples directory of the gperf
> distribution. I will talk about this and next suggestion via examples
> related to that file.
> 
>   In our case, the largest hash value is 5 and all but four entries in
> the array are 6.
> Recall that in the perumt2.gperf example, only the letters x, y and z are 
> used.
> 
>   Instead of
> 
>   return len + asso_values[(unsigned char) str[1] + 1] +
> assoc_values[(unsigned char)[str[0]];
> 
> in the hash function, there could be an optimized(?) version of the following
> 
>   unsigned int returned_value = 6;
>   if ( len <= 2 && 'x' <= str[1] && str[1] <= 'z' &&  'x' <= str[0] &&
> str[0] <= 'z' )
>  {
>     unsigned char first = (unsigned char) str[1] - 'x' + 1;
>     unsigned char second = (unsigned char) str[0] - 'x';
>     returned_value len + modified_asso_values[first] +
> modified_asso_values[second];
>   }
>   return  returned_value;

On today's CPUs, with their long pipelines of prepared instructions, the
condition you are writing there would cost 5 conditional jumps. Or, when
we leave out the redundant 'len <= 2' test (already done by the caller)
and assume an optimizing compiler that optimizes
  'x' <= str[1] && str[1] <= 'z'
to
  (unsigned int)(str[1] - 'x') <= 'z' - 'x'
we still have 2 conditional jumps.

> If either letter is out of range, the array is not accessed at all.

Optimizing cache behaviour at the expense of conditional jumps is not
good. You need a reasonable balance between both.

Bruno




reply via email to

[Prev in Thread] Current Thread [Next in Thread]