[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Gperf-bugs] Patch to gperf 2.7.2 for case-independent keyword matc
From: |
Bruce Lilly |
Subject: |
Re: [Gperf-bugs] Patch to gperf 2.7.2 for case-independent keyword matching |
Date: |
Fri, 15 Feb 2002 13:09:32 -0500 |
Paul Jarc wrote:
>
> Bruce Lilly <address@hidden> wrote:
> > Case conversion of keywords precludes storing them in a canonical
> > form, e.g. "MIME-Version"
>
> Why is that a problem?
In some applications, it is a requirement to return a pointer
to a canonical form rather than all-lower-case or ALL-UPPER-CASE.
> > and additional function calls,
>
> Counting function calls without regard to which functions are being
> called and how expensive they are doesn't seem very informative.
Contrary to your implication, I have explicitly taken into
account the expense of the calls; a function which copies
with conversion is more expensive than a comparison.
Suggest you carefully reread.
> > Moreover, both the gperf-generated case-dependent lookup
> > function and the application interface are externally visible
> > functions, which may result in namespace collisions or the use of
> > the wrong function, and is poor programming practice (it would
> > help is there were a gperf option to make the gperf-generated
> > function static).
>
> This is orthogonal to the case-comparison issue, right?
Related by the previous discussion, where somebody's "recommended
way" of addressing case-insensitive lookups inevitably result in
the issues mentioned, vs. the submitted patch which avoids them.
> > 4. successful lookup, no hash collisions.
> > The one successful strcasecmp call is still cheaper than your
> > "recommended way" of copying and converting the string since
> > strcasecmp does no copying and requires no additional function
> > call overhead.
>
> Lookups are typically much more common than the building of hash
> tables. The extra work you avoid happens less frequently; the extra
> work you introduce happens more frequently. It's not sufficient to
> simply compare the amount of work as seen in the code; each cost
> should be weighted according to the frequency of its incursion.
?? "less frequently", "more frequently" -- than what?
Read again; the analysis in terms of cost and frequency is complete
and extends through the entire 5 cases analyzed, not merely in the
one quoted. Moreover each lookup in the so-called "recommended way"
involves copying and converting each character in the string.
Suggest you build and time a test case using both methods and compare.
Nobody has yet produced any hard analysis (as opposed to idle
speculation) that indicates that the patch is anything other
than an improvement.
> paul
Best regards,
Bruce Lilly