bug-gnu-utils
From: Paul Eggert
Subject: Re: win32 diff (GNU diffutils) 2.8.1 "--ignore-file-name-case" switch doesn't work
Date: 11 Jan 2004 22:01:37 -0800
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3

"Eli Zaretskii" <address@hidden> writes:

> > From: Paul Eggert <address@hidden>
> > Date: 10 Jan 2004 23:18:44 -0800
> > 
> > What is actually needed is a "strcasecoll" routine, which compares
> > file names in a case-insensitive way in arbitrary locales.
> 
> Right, but as long as such a beast isn't available, isn't it more
> correct to return zero if strcasecmp finds the names equal?

(Hmm, "more correct"?  :-)

I don't know what strcasecmp does in non-"C" locales, so it's hard for
me to say.  Perhaps it handles simple unibyte locales correctly, but
perhaps not.  (In Solaris 9, it doesn't: it assumes ASCII.)  Either
way, I suspect strcasecmp invariably mishandles multibyte locales, so
in those cases it's not correct.

> What situation do you envision where two strings that are equal for
> strcasecmp are not equal in a non-Posix locale?

That could happen in Shift-JIS locales, since the second byte of a
Shift-JIS character might look like a valid ASCII letter.
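
A rough sketch of that Shift-JIS case follows. It is an illustration
only, not code from diffutils: ascii_casecmp is a hypothetical helper
that models a strcasecmp which assumes ASCII, and the two byte strings
are, as far as I can tell, the Shift-JIS encodings of two different
katakana (U+30A2 and U+30C2) whose second bytes are ASCII 'A' and 'a'.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of an ASCII-only strcasecmp: folds 'A'..'Z' to
   'a'..'z' byte by byte and leaves every other byte untouched. */
int ascii_casecmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    for (;; p++, q++) {
        int c1 = (*p >= 'A' && *p <= 'Z') ? *p + ('a' - 'A') : *p;
        int c2 = (*q >= 'A' && *q <= 'Z') ? *q + ('a' - 'A') : *q;
        if (c1 != c2 || c1 == '\0')
            return c1 - c2;
    }
}

/* Two distinct two-byte Shift-JIS characters (assumed encodings).
   Only the trail byte differs, and only by ASCII case. */
const char sjis_first[]  = "\x83\x41";  /* trail byte 0x41 == 'A' */
const char sjis_second[] = "\x83\x61";  /* trail byte 0x61 == 'a' */
```

With these definitions, strcmp(sjis_first, sjis_second) is nonzero,
but ascii_casecmp(sjis_first, sjis_second) is zero, so a bytewise
case-folding comparison reports two different file names as equal.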

> > For a similar case I've been thinking of using the patch described in
> > <http://mail.gnu.org/archive/html/bug-gnu-utils/2002-12/msg00079.html>.
> > However, given your discussion it seems that this patch isn't correct
> > either....
> 
> Hmm.. why not?

Because that patch falls back on strcasecmp/file_name_cmp if
strcasecoll/strcoll returns zero.  That's not correct: if strcasecoll
returns zero, compare_names should return zero without falling back on
strcasecmp/file_name_cmp.
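
To make the difference concrete, here is a minimal sketch of the two
behaviours. None of it is taken from the patch or from diffutils: the
function names and the collate_fn type are hypothetical, and
toy_casecoll is only a stand-in for the strcasecoll routine that does
not exist (it ignores ASCII case and nothing else).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical comparator type; stands in for strcoll or strcasecoll. */
typedef int (*collate_fn)(const char *, const char *);

/* Toy stand-in for strcasecoll: ignores ASCII case only. */
int toy_casecoll(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    for (;; p++, q++) {
        int c1 = (*p >= 'A' && *p <= 'Z') ? *p + ('a' - 'A') : *p;
        int c2 = (*q >= 'A' && *q <= 'Z') ? *q + ('a' - 'A') : *q;
        if (c1 != c2 || c1 == '\0')
            return c1 - c2;
    }
}

/* The behaviour objected to: a zero from the collating comparison is
   overridden by a second, stricter comparison. */
int compare_names_with_fallback(const char *a, const char *b,
                                collate_fn coll, collate_fn fallback)
{
    int r = coll(a, b);
    return r != 0 ? r : fallback(a, b);
}

/* The suggested behaviour: the collating comparison has the last word,
   so names it considers equal stay equal. */
int compare_names_no_fallback(const char *a, const char *b,
                              collate_fn coll)
{
    return coll(a, b);
}
```

For "Makefile" and "makefile", with toy_casecoll as the collator and
strcmp as the fallback, the first function returns nonzero and the
second returns zero.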

> As the last resort, how about supporting case-insensitive file-name
> comparisons only in the C locale?

Yes, that's pretty much the best we can do, and it was the intent of
that patch.

> For that matter, why not use strcasecmp alone when we run under
> the C locale?

That goes too far, I think.  For example, traditional strcasecmp does
something useful in the en_US.utf8 locale: it ignores ASCII case even
if it doesn't ignore case on non-ASCII letters.  (POSIX says the
result of strcasecmp is unspecified outside the POSIX locale, but
that's OK.)