From: Bruno Haible
Subject: new module iswpunct
Date: Wed, 30 Aug 2023 02:28:48 +0200

On Android, I see this test failure:

FAIL: test-iswctype
===================

../../gnulib-tests/test-iswctype.c:56: assertion 'iswctype (L'$', desc)' failed
Aborted
FAIL test-iswctype (exit status: 134)

While the set of characters of the class "punct" in the C locale is not
specified by POSIX [1][2][3], it is a reasonable expectation that ispunct()
and iswpunct() are consistent. Which is not the case on Android: This program

===============================================================================
#include <stdio.h>
#include <ctype.h>
#include <wctype.h>
int main ()
{
  printf ("%c : %d %d %d\n", 33, !!ispunct (33), !!iswpunct (33), !!iswctype (33, wctype ("punct")));
  printf ("%c : %d %d %d\n", 34, !!ispunct (34), !!iswpunct (34), !!iswctype (34, wctype ("punct")));
  printf ("%c : %d %d %d\n", 35, !!ispunct (35), !!iswpunct (35), !!iswctype (35, wctype ("punct")));
  printf ("%c : %d %d %d\n", 36, !!ispunct (36), !!iswpunct (36), !!iswctype (36, wctype ("punct")));
  printf ("%c : %d %d %d\n", 37, !!ispunct (37), !!iswpunct (37), !!iswctype (37, wctype ("punct")));
  printf ("%c : %d %d %d\n", 38, !!ispunct (38), !!iswpunct (38), !!iswctype (38, wctype ("punct")));
  printf ("%c : %d %d %d\n", 39, !!ispunct (39), !!iswpunct (39), !!iswctype (39, wctype ("punct")));
  printf ("%c : %d %d %d\n", 40, !!ispunct (40), !!iswpunct (40), !!iswctype (40, wctype ("punct")));
  printf ("%c : %d %d %d\n", 41, !!ispunct (41), !!iswpunct (41), !!iswctype (41, wctype ("punct")));
  printf ("%c : %d %d %d\n", 42, !!ispunct (42), !!iswpunct (42), !!iswctype (42, wctype ("punct")));
  printf ("%c : %d %d %d\n", 43, !!ispunct (43), !!iswpunct (43), !!iswctype (43, wctype ("punct")));
  printf ("%c : %d %d %d\n", 44, !!ispunct (44), !!iswpunct (44), !!iswctype (44, wctype ("punct")));
  printf ("%c : %d %d %d\n", 45, !!ispunct (45), !!iswpunct (45), !!iswctype (45, wctype ("punct")));
  printf ("%c : %d %d %d\n", 46, !!ispunct (46), !!iswpunct (46), !!iswctype (46, wctype ("punct")));
  printf ("%c : %d %d %d\n", 47, !!ispunct (47), !!iswpunct (47), !!iswctype (47, wctype ("punct")));
  printf ("%c : %d %d %d\n", 58, !!ispunct (58), !!iswpunct (58), !!iswctype (58, wctype ("punct")));
  printf ("%c : %d %d %d\n", 59, !!ispunct (59), !!iswpunct (59), !!iswctype (59, wctype ("punct")));
  printf ("%c : %d %d %d\n", 60, !!ispunct (60), !!iswpunct (60), !!iswctype (60, wctype ("punct")));
  printf ("%c : %d %d %d\n", 61, !!ispunct (61), !!iswpunct (61), !!iswctype (61, wctype ("punct")));
  printf ("%c : %d %d %d\n", 62, !!ispunct (62), !!iswpunct (62), !!iswctype (62, wctype ("punct")));
  printf ("%c : %d %d %d\n", 63, !!ispunct (63), !!iswpunct (63), !!iswctype (63, wctype ("punct")));
  printf ("%c : %d %d %d\n", 64, !!ispunct (64), !!iswpunct (64), !!iswctype (64, wctype ("punct")));
  printf ("%c : %d %d %d\n", 91, !!ispunct (91), !!iswpunct (91), !!iswctype (91, wctype ("punct")));
  printf ("%c : %d %d %d\n", 92, !!ispunct (92), !!iswpunct (92), !!iswctype (92, wctype ("punct")));
  printf ("%c : %d %d %d\n", 93, !!ispunct (93), !!iswpunct (93), !!iswctype (93, wctype ("punct")));
  printf ("%c : %d %d %d\n", 94, !!ispunct (94), !!iswpunct (94), !!iswctype (94, wctype ("punct")));
  printf ("%c : %d %d %d\n", 95, !!ispunct (95), !!iswpunct (95), !!iswctype (95, wctype ("punct")));
  printf ("%c : %d %d %d\n", 96, !!ispunct (96), !!iswpunct (96), !!iswctype (96, wctype ("punct")));
  printf ("%c : %d %d %d\n", 123, !!ispunct (123), !!iswpunct (123), !!iswctype (123, wctype ("punct")));
  printf ("%c : %d %d %d\n", 124, !!ispunct (124), !!iswpunct (124), !!iswctype (124, wctype ("punct")));
  printf ("%c : %d %d %d\n", 125, !!ispunct (125), !!iswpunct (125), !!iswctype (125, wctype ("punct")));
  printf ("%c : %d %d %d\n", 126, !!ispunct (126), !!iswpunct (126), !!iswctype (126, wctype ("punct")));
  return 0;
}
===============================================================================

prints:

! : 1 1 1
" : 1 1 1
# : 1 1 1
$ : 1 0 0
% : 1 1 1
& : 1 1 1
' : 1 1 1
( : 1 1 1
) : 1 1 1
* : 1 1 1
+ : 1 0 0
, : 1 1 1
- : 1 1 1
. : 1 1 1
/ : 1 1 1
: : 1 1 1
; : 1 1 1
< : 1 0 0
= : 1 0 0
> : 1 0 0
? : 1 1 1
@ : 1 1 1
[ : 1 1 1
\ : 1 1 1
] : 1 1 1
^ : 1 0 0
_ : 1 1 1
` : 1 0 0
{ : 1 1 1
| : 1 0 0
} : 1 1 1
~ : 1 0 0

That is, the characters '$', '+', '<', '=', '>', '^', '`', '|', '~' are not
considered to be in class "punct" by the wide-character APIs iswpunct(),
iswctype().

Here's a set of patches that provides a workaround. So that, in particular,
the [[:punct:]] syntax in fnmatch and regex will work consistently.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/ispunct.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/iswpunct.html
[3] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html


2023-08-29  Bruno Haible  <bruno@clisp.org>

        wctype: Rely on module iswpunct.
        * m4/wctype.m4 (gl_FUNC_WCTYPE): Also test whether the "punct" class
        works.
        * modules/wctype (Depends-on): Add iswpunct.
        * tests/test-iswctype.c (main): Add more tests of the "punct" class.
        * doc/posix-functions/wctype.texi: Mention the Android problem.

        c32ispunct: Rely on module iswpunct.
        * modules/c32ispunct (Depends-on): Add iswpunct.
        * tests/test-c32ispunct.c (main): Add a few more tests in the "C"
        locale.

        iswpunct: Add tests.
        * tests/test-iswpunct.c: New file, based on tests/test-iswdigit.c and
        tests/test-c32ispunct.c.
        * tests/test-iswpunct.sh: New file, based on tests/test-iswdigit.sh.
        * modules/iswpunct-tests: New file.

        iswpunct: New module.
        * lib/wctype.in.h (iswpunct): New declaration.
        * lib/iswpunct.c: New file.
        * m4/iswpunct.m4: New file.
        * m4/wctype_h.m4 (gl_WCTYPE_H_REQUIRE_DEFAULTS): Initialize
        GNULIB_ISWPUNCT.
        (gl_WCTYPE_H_DEFAULTS): Initialize REPLACE_ISWPUNCT.
        * modules/wctype-h (Makefile.am): Substitute GNULIB_ISWPUNCT,
        REPLACE_ISWPUNCT.
        * modules/iswpunct: New file.
        * doc/posix-functions/iswpunct.texi: Mention the new module.

        wctype-h tests: Add more tests.
        * tests/test-wctype-h.c (main): Add a sanity check of iswpunct.

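
For illustration only, here is a minimal sketch of how an ASCII-range
workaround for iswpunct() could look. It is not the code from lib/iswpunct.c
in the attached patches. The names sketch_iswpunct and
check_ascii_consistency are hypothetical, and the sketch assumes that the
wide-character encoding and the execution character set both agree with
ASCII for values below 128. Inside the ASCII range it treats every graphic,
non-alphanumeric character as punctuation, which is what ispunct() does in
the "C" locale; outside that range it defers to the system's iswpunct().

```c
#include <assert.h>
#include <ctype.h>
#include <wctype.h>

/* Hypothetical replacement, for illustration only.
   Assumption: values below 128 are ASCII in both wchar_t and char.  */
static int
sketch_iswpunct (wint_t wc)
{
  if (wc >= 0x21 && wc <= 0x7e)
    /* ASCII graphic character: punctuation iff it is not alphanumeric.  */
    return !((wc >= '0' && wc <= '9')
             || (wc >= 'A' && wc <= 'Z')
             || (wc >= 'a' && wc <= 'z'));
  if (wc < 0x80)
    /* Control characters, space, DEL (and negative values such as WEOF
       on platforms where wint_t is signed).  */
    return 0;
  /* Everything else: trust the system function.  */
  return iswpunct (wc);
}

/* Checks that the sketch agrees with ispunct() on all ASCII values.
   This is expected to hold in the "C" locale, which is the locale in
   effect at program startup.  */
static void
check_ascii_consistency (void)
{
  for (int c = 0; c < 128; c++)
    assert (!!sketch_iswpunct ((wint_t) c) == !!ispunct (c));
}
```

With such a function, the nine characters listed above ('$', '+', '<', '=',
'>', '^', '`', '|', '~') are classified as punctuation regardless of what
the platform's iswpunct() returns for them.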
0001-wctype-h-tests-Add-more-tests.patch
0002-iswpunct-New-module.patch
0003-iswpunct-Add-tests.patch
0004-c32ispunct-Rely-on-module-iswpunct.patch
0005-wctype-Rely-on-module-iswpunct.patch