[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
sed 4.2 warnings due to poor ctype usage
From: |
Eric Blake |
Subject: |
sed 4.2 warnings due to poor ctype usage |
Date: |
Mon, 11 May 2009 18:14:36 +0000 (UTC) |
User-agent: |
Loom/3.14 (http://gmane.org/) |
Sed 4.2 triggers some compilation warnings on cygwin, due to its use of passing
plain char arguments to ctype macros - these warnings exist as a QoI issue to
detect potential bugs on platforms where char is signed, but negative indices
are not handled, as well as the issue where isspace((char)0xff) and isspace
((unsigned char)0xff) give different results in some locales. For an example
of the warnings while compiling sed:
compile.c: In function `setup_replacement':
compile.c:807: warning: subscript has type `char'
compile.c: In function `normalize_text':
compile.c:1519: warning: subscript has type `char'
It turns out that the two usages above actually go through ISDIGIT, which in
turn is currently defined as (ISASCII(c) && isdigit(c)); so it depends on
whether STDC_HEADERS was defined (which determines whether ISASCII is 1 or a
call to isascii()) as to whether this ends up using isdigit on a negative value
(undefined behavior) or whether the compilation warning can be safely ignored
(since 7-bit char values never promote to a negative argument to isdigit).
Then there is the matter that POSIX states that the use of isascii is obsolete,
in part because it ignores locale issues, and in part because it is broken for
EBCDIC encodings; and getting rid of compilation warnings seems like it would
be worthwhile.
You really should consider using the gnulib module c-ctype (which does not
suffer from problems of passing a plain char through the macros) if you don't
care about locale, and/or getting rid of the use of isascii. For that matter,
coreutils still has ISDIGIT for speed and to cater to non-POSIX platforms, but
defers to locale macros for all other categories or uses c-ctype when locale is
undesired (for example, uses of ISSPACE should probably be isspace or c_isspace
instead). The coreutils definition is:
/* ISDIGIT differs from isdigit, as follows:
- Its arg may be any int or unsigned int; it need not be an unsigned char
or EOF.
- It's typically faster.
POSIX says that only '0' through '9' are digits. Prefer ISDIGIT to
isdigit unless it's important to use the locale's definition
of `digit' even when the host does not conform to POSIX. */
#define ISDIGIT(c) ((unsigned int) (c) - '0' <= 9)
--
Eric Blake
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- sed 4.2 warnings due to poor ctype usage,
Eric Blake <=