bug-gnu-utils

Re: gperf wrapper


From: Bruce Korb
Subject: Re: gperf wrapper
Date: Tue, 13 Nov 2007 20:10:49 -0800
User-agent: Thunderbird 2.0.0.6 (X11/20070801)

Bruno Haible wrote:
> This is certainly a good use of gperf, and since the same coding pattern
> occurs several times in your environment, it sure is good to have a tool
> for it (may be a C program, shell script, or substitution into some template
> files, or similar).

As Eric Raymond likes to say, there are many ways of doing the same thing.

> Your question is whether this would be useful to be packaged as part of
> "gperf". So let's consider:
>   1) How many people have the same problem as you have?

Where I work, that problem is solved hundreds of times with
hand-crafted code that compares name strings one at a time,
for lists that can be hundreds of names long.

>   2) How many of those would be happy with what your tool generates?

The cascading ifs could all be replaced with ``case FOO_KWD_NAME:''.
But not now, of course.  Only "before", when the code was first being
written.
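
A minimal sketch of that before/after, for illustration only.  The
FOO_KWD_* names, the keyword strings and find_foo_kwd() are all made
up here, and find_foo_kwd() is a plain linear scan standing in for
whatever lookup gperf (or a wrapper around it) would generate:

```c
#include <string.h>

/* Hypothetical keyword enumeration; the FOO_KWD_* names are invented. */
typedef enum {
    FOO_KWD_ALPHA,
    FOO_KWD_BETA,
    FOO_KWD_GAMMA,
    FOO_COUNT_KWD   /* doubles as the "not found" value */
} foo_kwd_t;

/* Stand-in for a generated lookup.  A linear scan keeps the sketch
 * self-contained; a gperf-built version would hash instead. */
static foo_kwd_t find_foo_kwd(const char *name)
{
    static const char *const names[FOO_COUNT_KWD] = {
        "alpha", "beta", "gamma"
    };
    for (int i = 0; i < FOO_COUNT_KWD; i++)
        if (strcmp(name, names[i]) == 0)
            return (foo_kwd_t)i;
    return FOO_COUNT_KWD;
}

/* The style being replaced: one strcmp() per candidate name. */
static int weight_cascading(const char *name)
{
    if (strcmp(name, "alpha") == 0)
        return 1;
    else if (strcmp(name, "beta") == 0)
        return 2;
    else if (strcmp(name, "gamma") == 0)
        return 3;
    return -1;
}

/* The same dispatch once the name has been mapped to an enum value. */
static int weight_switch(const char *name)
{
    switch (find_foo_kwd(name)) {
    case FOO_KWD_ALPHA: return 1;
    case FOO_KWD_BETA:  return 2;
    case FOO_KWD_GAMMA: return 3;
    default:            return -1;
    }
}
```

Both functions give the same answers; the switch version does one
lookup and then a jump, where the cascade does one string compare
per name.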

>   3) How hard would it be for them to not use your tool?
>
> Ad 1) How many people have the same problem as you have?
> 
> Just for comparison, the 4 uses that I've made of gperf recently:
> 
>   struct alias { int name; unsigned int encoding_index; };
>   struct mapping { int standard_name; const char vendor_name[10 + 1]; };
>   struct named_property { const char *name; uc_property_t property; };
>   struct named_script { const char *name; unsigned int index; };
> 
> So, 2 out of 4 uses map the string to a nonnegative index. In these cases,
> the index was then an index into a private table. I didn't need an 'enum'
> type in these cases.

No, but do be careful to keep them synchronized.  Especially fun with
large projects with lots of hands.  And new hands, too.

> But when someone needs an enum type (i.e. when a public API without
> extensibility is required), your approach is OK.

I use enums in non-public APIs merely to clarify the purpose of
the numeric value.  New people within the group need to figure out
just what 3 means anyway.  Yes, #defines do work also, but
then you get into debuggability at the GDB level.  Anyway, taking
your first example:
>   struct alias { int name; unsigned int encoding_index; };
here you are providing a string name to be associated with each
index.  Whether you enumerate those indexes with 1, 2, 3, ...
or FOO_ALPHA, FOO_BETA, FOO_GAMMA, ... really does not matter.
Except, perhaps, to the person who inherits your code and sees
a '3' pop up in gdb instead of FOO_GAMMA.  IOW, it is religious.
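
To make the gdb point concrete, here is a small sketch.  The names
follow the FOO_ALPHA/FOO_BETA/FOO_GAMMA example above; FOO_GAMMA_MACRO
and the two variables are invented for the comparison:

```c
/* Preprocessor constant: replaced by the literal 3 before compilation. */
#define FOO_GAMMA_MACRO 3

/* Enumeration: the names are part of the type and of the debug info. */
typedef enum {
    FOO_ALPHA = 1,
    FOO_BETA,
    FOO_GAMMA
} foo_id_t;

/* Stored as a plain int, so a debugger has only the number to show. */
static int      id_as_int  = FOO_GAMMA_MACRO;

/* Stored with the enum type, so a debugger can map 3 back to its name. */
static foo_id_t id_as_enum = FOO_GAMMA;
```

With ordinary debug information, printing id_as_enum in gdb shows the
enumerator name, while id_as_int shows only the number.  Macro names
are available to gdb only if the build records macro information
(for example gcc's -g3), which many builds do not.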

> Ad 2) How many of those would be happy with what your tool generates?
> 
> Lots of code would not need to do a lookup of a function pointer and call
> the function immediately, but rather do this in separate steps. Your
> generated code would be more generally useful if it returned a function
> pointer:
>    return dispatch[id];
> instead of
>    return dispatch[id](a1, a2);

Well, actually, gperf already lets you put a proc pointer in the
lookup data structure.  Of course, it's the address of that entry
that is returned.  Actually, thinking about it, I'd junk my script
and use straight gperf with two tiny little enhancements :)

1.  allow for some "base-name" argument that would automatically
    get applied to all the external names.  I don't care for
    having to specify each individually.  e.g.:
        %define slot-name               ${prefix}_name
        %define hash-function-name      ${base_name}_hash
        %define lookup-function-name    find_${base_name}_name
        %define word-array-name         ${base_name}_table
        %define initializer-suffix      ,${PREFIX}_COUNT_KWD

2.  allow the lookup function to return the second field of the
    lookup structure.  You might specify it thus:

    %return-type  type_t attributes retval

If that is specified, then gperf would generate its own lookup data
structure with two fields:

  struct gperf_internal_name {
    char const * ${slot_name};
    type_t attributes retval;
  };
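
A sketch of what the generated code might then look like.  None of
this exists in gperf: %return-type is only the proposal above, cm_kwd_t
stands in for "type_t", the keywords are invented, and cm_opt_lookup()
is a linear scan standing in for the hashed lookup gperf would emit:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, standing in for "type_t" in the proposal. */
typedef enum { CM_KWD_FIRST, CM_KWD_SECOND, CM_COUNT_KWD } cm_kwd_t;

/* The two-field structure gperf would spin for itself. */
struct gperf_internal_name {
    char const *cm_name;   /* ${slot_name} */
    cm_kwd_t    retval;    /* the field named by %return-type */
};

static const struct gperf_internal_name cm_opt_table[] = {
    { "first",  CM_KWD_FIRST  },
    { "second", CM_KWD_SECOND },
};

/* Today's behaviour: return the address of the matching entry or NULL.
 * Linear scan used only to keep the sketch self-contained. */
static const struct gperf_internal_name *
cm_opt_lookup(const char *str, size_t len)
{
    size_t n = sizeof cm_opt_table / sizeof cm_opt_table[0];
    for (size_t i = 0; i < n; i++) {
        const char *s = cm_opt_table[i].cm_name;
        if (strlen(s) == len && memcmp(s, str, len) == 0)
            return &cm_opt_table[i];
    }
    return NULL;
}

/* Proposed behaviour: hand back the second field directly, with the
 * count value doubling as "not found". */
static cm_kwd_t find_cm_opt_name(const char *str, size_t len)
{
    const struct gperf_internal_name *e = cm_opt_lookup(str, len);
    return e ? e->retval : CM_COUNT_KWD;
}
```

The caller then never sees the table entry, only the enum value, so
the result can go straight into a switch.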

> Some code goes into shared libraries and would require to use %pic.
> 
> Some header files need to be usable in a mixed C/C++ environment and
> therefore need conditional 'extern "C"' markers.
> 
> Some projects would want to copy the declarations into an existing header
> file, rather than creating a new one.

Adding guarded ``extern "C"'' markers and tweaking code to be happy
with a %pic is mostly just polish stuff.  I've not needed it
so I've not added that polish.  Besides, nobody would polish it
unless it were seen as useful.  Heck, with the above two enhancements
I'm not sure I'd continue using it.

> Ad 3) How hard would it be for them to not use your tool?
> 
> Is using gperf directly that hard? No, the documentation and --help
> message explain the use of gperf quite well, IMO.

It's mostly the nuisance thing.  I'm sure you're completely familiar
with gperf.  I start with the need to map a series of names into
a number.  An enumeration of the names, if you will.  Now I want to
call a function that tells me which string I have.  Very simple need.
How long from this point to working code and on to the next issue?  More
time than it would take to use cascading ifs.  (Either being less
time than it takes to write emails like this.)

Most folks I know use cascading ``else if (strcmp() == 0)'' constructs
because anything else (read: gperf) is too much bother.  Many, many
thousands of lines of that kind of stuff.  I'll do anything else because
those cascading ifs grate on my nerves.  :)  So, I hacked out the script
and people around me say, "Yeah.  Ok.  That's easy enough." but won't
use gperf.  Please don't argue with me about it, though.  Certainly I don't
think gperf is rocket science.  I think it is just the perceived level of
hassle.

> Is generating a function template or header file template hard? No. It
> does not even need automation. Many people can certainly also do it
> in a text editor, with copy and query-replace.

Manual maintenance of multiple parallel things is a bad thing.
It leads to mistakes.  It is not that it is a hard task; it is
that it is hard to be certain that it is always properly maintained.
If there is only one list of names to maintain (and likely not in the
direct input to gperf), then that one list cannot get out of order
with respect to itself.  It is also different if you are completely
certain you'll never need to alter the name list.  I'm certain I
don't want to worry about whether the list is immutable or not. :)
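
One way to get the "single list" property in plain C, shown only as an
illustration and not as what the wrapper script does, is the X-macro
idiom: the name list appears once and both the enum and the string
table are expanded from it.  All names below are hypothetical, and the
linear lookup again stands in for a gperf-generated one:

```c
#include <string.h>

/* The one list to maintain.  Entries are invented for the example. */
#define FOO_NAMES(X)      \
    X(ALPHA, "alpha")     \
    X(BETA,  "beta")      \
    X(GAMMA, "gamma")

/* Expansion 1: the enumeration, in list order. */
typedef enum {
#define X(id, str) FOO_KWD_##id,
    FOO_NAMES(X)
#undef X
    FOO_COUNT_KWD
} foo_kwd_t;

/* Expansion 2: the name table, necessarily in the same order. */
static const char *const foo_kwd_names[FOO_COUNT_KWD] = {
#define X(id, str) str,
    FOO_NAMES(X)
#undef X
};

/* Map a name to its enum value; FOO_COUNT_KWD means "not found". */
static foo_kwd_t foo_kwd_from_name(const char *name)
{
    for (int i = 0; i < FOO_COUNT_KWD; i++)
        if (strcmp(name, foo_kwd_names[i]) == 0)
            return (foo_kwd_t)i;
    return FOO_COUNT_KWD;
}
```

Adding a name means touching one line; the enum, the table and the
count cannot drift apart.  Feeding the same list to gperf still needs
a generation step, which is the job a wrapper would take on.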

Anyway, I think it comes down to familiarity with your own set of
usual tools.  I think gperf is perceived as an esoteric tool that
is a nuisance.  Making it dead simple to use in the contexts where
I see it being highly useful would lead to greater usage.  Whether
that is accomplished with a wrapper script or yet more options is
the main debate.

So, thank you for maintaining the thing.  I like it.  :)

Cheers - Bruce

P.S.
Oh, by the way, it would also be useful to copy the ``%xxx'' parameters
into the output file, as is done for the command line options.  For now
I put them into the .gperf file twice: once as follows, and again later
in the more normal fashion:

#if 0 /* gperf build options: */
// %struct-type
// %language=ANSI-C
// %includes
// %global-table
// %omit-struct-type
// %readonly-tables
// %enum
// %compare-strncmp
//
// %define slot-name               cm_name
// %define hash-function-name      cm_opt_hash
// %define lookup-function-name    find_cm_opt_name
// %define word-array-name         cm_opt_table
// %define initializer-suffix      ,CM_COUNT_KWD
#endif /* gperf build options: */



