bug-gnu-utils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: No declaration for --no-strlen


From: Behdad Esfahbod
Subject: Re: No declaration for --no-strlen
Date: Mon, 19 Jan 2009 13:53:54 -0500
User-agent: Thunderbird 2.0.0.18 (X11/20090107)

Thanks Bruno, as always!

Bruno Haible wrote:
> Hello Behdad,
> 
>> In my case, I have a table of all two-byte strings.  ...
>> So, in my case storing the length is useless 
>> because of the nature of the data.  Here is the file:
>>
>>   http://svn.gnome.org/viewvc/vte/trunk/src/vteseq-2.gperf?view=markup
> 
> Ah, I see. You are entirely right that the length being == 2 for all
> keywords, there is no point in including it in the hash code computation.
> I'm changing gperf so that it recognizes this situation automatically:
> 
> 
> 2009-01-19  Bruno Haible  <address@hidden>
> 
>       Don't include the length in the hash function if all keywords have the
>       same length.
>       * src/search.h (Search): Add _hash_includes_len field.
>       * src/search.cc (Search::prepare): Initialize it.
>       (Search::count_duplicates_tuple, Search::count_duplicates_multiset,
>       Search::prepare_asso_values, Search::find_asso_values,
>       Search::compute_hash): Use it instead of !option[NOLENGTH].
>       * src/output.h (Output): New field _hash_includes_len. Add it as
>       constructor argument.
>       * src/output.cc (Output::Output): Add hash_includes_len argument.
>       (Output::output_hash_function): Use _hash_includes_len instead of
>       !option[NOLENGTH].
>       * src/main.cc (main): Pass _hash_includes_len from Search to Output.
>       * tests/permut2.exp: Updated expected test result.
>       * tests/permut3.exp: Likewise.
>       * tests/permutc2.exp: Likewise.
>       Reported by Behdad Esfahbod <address@hidden>.
> 
> Index: src/main.cc
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/main.cc,v
> retrieving revision 1.24
> diff -u -r1.24 main.cc
> --- src/main.cc       23 Aug 2008 18:52:48 -0000      1.24
> +++ src/main.cc       19 Jan 2009 09:48:41 -0000
> @@ -1,5 +1,5 @@
>  /* Driver program for the hash function generator
> -   Copyright (C) 1989-1998, 2000, 2002-2003 Free Software Foundation, Inc.
> +   Copyright (C) 1989-1998, 2000, 2002-2003, 2009 Free Software Foundation, 
> Inc.
>     Written by Douglas C. Schmidt <address@hidden>
>     and Bruno Haible <address@hidden>.
>  
> @@ -104,6 +104,7 @@
>                            searcher._total_keys,
>                            searcher._max_key_len,
>                            searcher._min_key_len,
> +                          searcher._hash_includes_len,
>                            searcher._key_positions,
>                            searcher._alpha_inc,
>                            searcher._total_duplicates,
> Index: src/output.cc
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/output.cc,v
> retrieving revision 1.39
> diff -u -r1.39 output.cc
> --- src/output.cc     23 Aug 2008 18:52:48 -0000      1.39
> +++ src/output.cc     19 Jan 2009 09:48:42 -0000
> @@ -1,5 +1,5 @@
>  /* Output routines.
> -   Copyright (C) 1989-1998, 2000, 2002-2004, 2006-2007 Free Software 
> Foundation, Inc.
> +   Copyright (C) 1989-1998, 2000, 2002-2004, 2006-2007, 2009 Free Software 
> Foundation, Inc.
>     Written by Douglas C. Schmidt <address@hidden>
>     and Bruno Haible <address@hidden>.
>  
> @@ -82,9 +82,9 @@
>                  const char *verbatim_code, const char *verbatim_code_end,
>                  unsigned int verbatim_code_lineno, bool charset_dependent,
>                  int total_keys, int max_key_len, int min_key_len,
> -                const Positions& positions, const unsigned int *alpha_inc,
> -                int total_duplicates, unsigned int alpha_size,
> -                const int *asso_values)
> +                bool hash_includes_len, const Positions& positions,
> +                const unsigned int *alpha_inc, int total_duplicates,
> +                unsigned int alpha_size, const int *asso_values)
>    : _head (head), _struct_decl (struct_decl),
>      _struct_decl_lineno (struct_decl_lineno), _return_type (return_type),
>      _struct_tag (struct_tag),
> @@ -97,6 +97,7 @@
>      _charset_dependent (charset_dependent),
>      _total_keys (total_keys),
>      _max_key_len (max_key_len), _min_key_len (min_key_len),
> +    _hash_includes_len (hash_includes_len),
>      _key_positions (positions), _alpha_inc (alpha_inc),
>      _total_duplicates (total_duplicates), _alpha_size (alpha_size),
>      _asso_values (asso_values)
> @@ -755,7 +756,7 @@
>    if (/* The function does not use the 'str' argument?  */
>        _key_positions.get_size() == 0
>        || /* The function uses 'str', but not the 'len' argument?  */
> -         (option[NOLENGTH]
> +         (!_hash_includes_len
>            && _key_positions[0] < _min_key_len
>            && _key_positions[_key_positions.get_size() - 1] != 
> Positions::LASTCHAR))
>      /* Pacify lint.  */
> @@ -817,7 +818,7 @@
>      {
>        /* Trivial case: No key positions at all.  */
>        printf ("  return %s;\n",
> -              option[NOLENGTH] ? "0" : "len");
> +              _hash_includes_len ? "len" : "0");
>      }
>    else
>      {
> @@ -838,7 +839,7 @@
>               contain 'unsigned char's or 'unsigned short's.  */
>  
>            printf ("  return %s",
> -                  option[NOLENGTH] ? "" : "len + ");
> +                  _hash_includes_len ? "len + " : "");
>  
>            if (_key_positions.get_size() == 2
>                && _key_positions[0] == 0
> @@ -873,8 +874,8 @@
>                    "  switch (%s)\n"
>                    "    {\n"
>                    "      default:\n",
> -                  option[NOLENGTH] ? "0" : "len",
> -                  option[NOLENGTH] ? "len" : "hval");
> +                  _hash_includes_len ? "len" : "0",
> +                  _hash_includes_len ? "hval" : "len");
>  
>            while (key_pos != Positions::LASTCHAR && key_pos >= _max_key_len)
>              if ((key_pos = iter.next ()) == PositionIterator::EOS)
> Index: src/output.h
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/output.h,v
> retrieving revision 1.23
> diff -u -r1.23 output.h
> --- src/output.h      23 Aug 2008 18:52:48 -0000      1.23
> +++ src/output.h      19 Jan 2009 09:48:42 -0000
> @@ -2,7 +2,7 @@
>  
>  /* Output routines.
>  
> -   Copyright (C) 1989-1998, 2000, 2002-2003 Free Software Foundation, Inc.
> +   Copyright (C) 1989-1998, 2000, 2002-2003, 2009 Free Software Foundation, 
> Inc.
>     Written by Douglas C. Schmidt <address@hidden>
>     and Bruno Haible <address@hidden>.
>  
> @@ -49,6 +49,7 @@
>                                  bool charset_dependent,
>                                  int total_keys,
>                                  int max_key_len, int min_key_len,
> +                                bool hash_includes_len,
>                                  const Positions& positions,
>                                  const unsigned int *alpha_inc,
>                                  int total_duplicates,
> @@ -133,6 +134,8 @@
>    int const             _max_key_len;
>    /* Minimum length of the shortest keyword. */
>    int const             _min_key_len;
> +  /* Whether the hash function includes the length.  */
> +  bool                  _hash_includes_len;
>    /* Key positions.  */
>    Positions const       _key_positions;
>    /* Adjustments to add to bytes add specific key positions.  */
> Index: src/search.cc
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/search.cc,v
> retrieving revision 1.41
> diff -u -r1.41 search.cc
> --- src/search.cc     23 Aug 2008 18:52:48 -0000      1.41
> +++ src/search.cc     19 Jan 2009 09:48:43 -0000
> @@ -1,5 +1,5 @@
>  /* Search algorithm.
> -   Copyright (C) 1989-1998, 2000, 2002 Free Software Foundation, Inc.
> +   Copyright (C) 1989-1998, 2000, 2002, 2009 Free Software Foundation, Inc.
>     Written by Douglas C. Schmidt <address@hidden>
>     and Bruno Haible <address@hidden>.
>  
> @@ -64,7 +64,7 @@
>     where Pos is a set of byte positions,
>     each alpha_inc[i] is a nonnegative integer,
>     each asso_values[c] is a nonnegative integer,
> -   len (keyword) is the keyword's length if !option[NOLENGTH], or 0 
> otherwise.
> +   len (keyword) is the keyword's length if _hash_includes_len, or 0 
> otherwise.
>  
>     Theorem 1: If all keywords are different, there is a set Pos such that
>     all tuples (keyword[i] : i in Pos) are different.
> @@ -103,7 +103,7 @@
>       Map (A --> B) := Hom_Set (A, B) is the set of maps from A to B, and
>       S(Pos) is the symmetric group over Pos.
>  
> -   This was the theory for option[NOLENGTH]; if !option[NOLENGTH], slight
> +   This was the theory for !_hash_includes_len; if _hash_includes_len, slight
>     modifications apply:
>       proj1 : String --> Map (Pos --> N) x N
>       proj2 : Map (Pos --> N) x N --> Map (Pos --> N) / S(Pos) x N
> @@ -175,6 +175,9 @@
>                exit (1);
>              }
>        }
> +
> +  /* Determine whether the hash function shall include the length.  */
> +  _hash_includes_len = !(option[NOLENGTH] || (_min_key_len == _max_key_len));
>  }
>  
>  /* ====================== Finding good byte positions ====================== 
> */
> @@ -238,7 +241,7 @@
>  
>    unsigned int count = 0;
>    {
> -    Hash_Table representatives (_total_keys, option[NOLENGTH]);
> +    Hash_Table representatives (_total_keys, !_hash_includes_len);
>      for (KeywordExt_List *temp = _head; temp; temp = temp->rest())
>        {
>          KeywordExt *keyword = temp->first();
> @@ -590,7 +593,7 @@
>  
>    unsigned int count = 0;
>    {
> -    Hash_Table representatives (_total_keys, option[NOLENGTH]);
> +    Hash_Table representatives (_total_keys, !_hash_includes_len);
>      for (KeywordExt_List *temp = _head; temp; temp = temp->rest())
>        {
>          KeywordExt *keyword = temp->first();
> @@ -736,7 +739,7 @@
>    _max_selchars_length = _key_positions.iterator(_max_key_len).remaining();
>  
>    /* Check for duplicates, i.e. keywords with the same _selchars array
> -     (and - if !option[NOLENGTH] - also the same length).
> +     (and - if _hash_includes_len - also the same length).
>       We deal with these by building an equivalence class, so that only
>       1 keyword is representative of the entire collection.  Only this
>       representative remains in the keyword list; the others are accessible
> @@ -747,7 +750,7 @@
>      _list_len = _total_keys;
>      _total_duplicates = 0;
>      /* Make hash table for efficiency.  */
> -    Hash_Table representatives (_list_len, option[NOLENGTH]);
> +    Hash_Table representatives (_list_len, !_hash_includes_len);
>  
>      KeywordExt_List *prev = NULL; /* list node before temp */
>      for (temp = _head; temp; )
> @@ -847,7 +850,7 @@
>  
>    /* Given the bound for _asso_values[c], we have a bound for the possible
>       hash values, as computed in compute_hash().  */
> -  _max_hash_value = (option[NOLENGTH] ? 0 : _max_key_len)
> +  _max_hash_value = (_hash_includes_len ? _max_key_len : 0)
>                      + (_asso_value_max - 1) * _max_selchars_length;
>    /* Allocate a sparse bit vector for detection of collisions of hash
>       values.  */
> @@ -1307,7 +1310,7 @@
>                       the yet undetermined asso_values[].  */
>                    int hashcode;
>                    {
> -                    int sum = option[NOLENGTH] ? 0 : 
> keyword->_allchars_length;
> +                    int sum = _hash_includes_len ? keyword->_allchars_length 
> : 0;
>                      const unsigned int *p = keyword->_selchars;
>                      int i = keyword->_selchars_length;
>                      for (; i > 0; p++, i--)
> @@ -1407,7 +1410,7 @@
>                            _asso_value_max = step->_asso_value_max;
>                            /* Reinitialize _max_hash_value.  */
>                            _max_hash_value =
> -                            (option[NOLENGTH] ? 0 : _max_key_len)
> +                            (_hash_includes_len ? _max_key_len : 0)
>                              + (_asso_value_max - 1) * _max_selchars_length;
>                            /* Reinitialize _collision_detector.  */
>                            delete _collision_detector;
> @@ -1470,7 +1473,7 @@
>  inline int
>  Search::compute_hash (KeywordExt *keyword) const
>  {
> -  int sum = option[NOLENGTH] ? 0 : keyword->_allchars_length;
> +  int sum = _hash_includes_len ? keyword->_allchars_length : 0;
>  
>    const unsigned int *p = keyword->_selchars;
>    int i = keyword->_selchars_length;
> Index: src/search.h
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/search.h,v
> retrieving revision 1.27
> diff -u -r1.27 search.h
> --- src/search.h      23 Aug 2008 18:52:48 -0000      1.27
> +++ src/search.h      19 Jan 2009 09:48:43 -0000
> @@ -2,7 +2,7 @@
>  
>  /* Search algorithm.
>  
> -   Copyright (C) 1989-1998, 2000, 2002 Free Software Foundation, Inc.
> +   Copyright (C) 1989-1998, 2000, 2002, 2009 Free Software Foundation, Inc.
>     Written by Douglas C. Schmidt <address@hidden>
>     and Bruno Haible <address@hidden>.
>  
> @@ -113,6 +113,9 @@
>    /* Minimum length of the shortest keyword.  */
>    int                   _min_key_len;
>  
> +  /* Whether the hash function includes the length.  */
> +  bool                  _hash_includes_len;
> +
>    /* User-specified or computed key positions.  */
>    Positions             _key_positions;
>  
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]