bug-gnu-utils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gperf-bugs] Patch to gperf 2.7.2 for case-independent keyword matc


From: Bruce Lilly
Subject: Re: [Gperf-bugs] Patch to gperf 2.7.2 for case-independent keyword matching
Date: Fri, 15 Feb 2002 13:09:32 -0500

Paul Jarc wrote:
> 
> Bruce Lilly <address@hidden> wrote:
> > Case conversion of keywords precludes storing them in a canonical
> > form, e.g. "MIME-Version"
> 
> Why is that a problem?

In some applications, it is a requirement to return a pointer
to a canonical form rather than all-lower-case or ALL-UPPER-CASE.

> > and additional function calls,
> 
> Counting function calls without regard to which functions are being
> called and how expensive they are doesn't seem very informative.

Contrary to your implication, I have explicitly taken into
account the expense of the calls; a function which copies
with conversion is more expensive than a comparison.
Suggest you carefully reread.

> > Moreover, both the gperf-generated case-dependent lookup
> > function and the application interface are externally visible
> > functions, which may result in namespace collisions or the use of
> > the wrong function, and is poor programming practice (it would
> > help is there were a gperf option to make the gperf-generated
> > function static).
> 
> This is orthogonal to the case-comparison issue, right?

Related by the previous discussion, where somebody's "recommended
way" of addressing case-insensitive lookups inevitably result in
the issues mentioned, vs. the submitted patch which avoids them.

> > 4. successful lookup, no hash collisions.
> >    The one successful strcasecmp call is still cheaper than your
> >    "recommended way" of copying and converting the string since
> >    strcasecmp does no copying and requires no additional function
> >    call overhead.
> 
> Lookups are typically much more common than the building of hash
> tables.  The extra work you avoid happens less frequently; the extra
> work you introduce happens more frequently.  It's not sufficient to
> simply compare the amount of work as seen in the code; each cost
> should be weighted according to the frequency of its incursion.

?? "less frequently", "more frequently" -- than what?
Read again; the analysis in terms of cost and frequency is complete
and extends through the entire 5 cases analyzed, not merely in the
one quoted.  Moreover each lookup in the so-called "recommended way"
involves copying and converting each character in the string.
Suggest you build and time a test case using both methods and compare.

Nobody has yet produced any hard analysis (as opposed to idle
speculation) that indicates that the patch is anything other
than an improvement.

> paul

Best regards,
  Bruce Lilly



reply via email to

[Prev in Thread] Current Thread [Next in Thread]