bug-gnulib

Re: OmniOS CI failure for test-fnmatch-5.sh


From: Bruno Haible
Subject: Re: OmniOS CI failure for test-fnmatch-5.sh
Date: Tue, 21 May 2024 10:39:24 +0200

Hi Collin,

> I was taking a look at the CI stuff you wrote. Nice work, it seems
> really helpful.

Thanks for helping! The goal here being to get all FAILs either fixed
or declared as XFAIL, this really helps.

> The tests pass with that one check #ifdef'd out. Here is the diff I
> used:
> 
> $ diff -u tests/test-fnmatch.c testdir1/gltests/test-fnmatch.c 
> --- tests/test-fnmatch.c        2024-05-20 01:15:09.806829699 -0700
> +++ testdir1/gltests/test-fnmatch.c     2024-05-21 00:57:41.935481891 -0700
> @@ -893,7 +893,7 @@
>          /* U+20000 <CJK Ideograph> */
>          ASSERT (fnmatch ("x[[:print:]]y", "x\225\062\202\066y", 0) == 0);
>          #endif
> -        #if !(defined __FreeBSD__ || defined __DragonFly__)
> +        #if !(defined __FreeBSD__ || defined __DragonFly__ || defined __illumos__)
>          /* U+00D7 MULTIPLICATION SIGN */
>          ASSERT (fnmatch ("x[[:punct:]]y", "x\241\301y", 0) == 0);
>          #endif

Looks good. OK to push.

Different systems define their character classifications differently;
we use the #if here to clarify what Gnulib users can expect and what
expectations are not warranted.

Since this failure is not seen on Solaris 11.4, the 'defined __illumos__'
is correct.

> I think it might be a bug in OmniOS's handling of GB 18030 but I am
> unsure.

On these systems, every locale has its own character classification
tables. It can happen that they treat U+00D7 MULTIPLICATION SIGN
as a punctuation character in the UTF-8 locale but not in the GB18030
locale. (Glibc avoids this by computing the character classification
tables for all encodings from a single source.)
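
One way to observe such a per-locale difference outside the test suite is
sketched below. It is only an illustration, not part of Gnulib: the
function name punct_in_locale is hypothetical, and the locale names
mentioned afterwards are assumptions that vary between systems. The
character is passed as multibyte bytes and converted with mbrtowc, so the
sketch does not assume that wchar_t values are Unicode code points.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: returns 1 if the first multibyte character in
   mb[0..n-1] is classified as punctuation in the named locale, 0 if it
   is not, and -1 if the locale cannot be selected or the bytes do not
   form a complete, non-null character in that locale.
   Note: this changes the process-wide LC_CTYPE setting.  */
int punct_in_locale(const char *locale_name, const char *mb, size_t n)
{
    mbstate_t state;
    wchar_t wc;
    size_t len;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    memset(&state, 0, sizeof state);
    len = mbrtowc(&wc, mb, n, &state);
    if (len == (size_t)-1 || len == (size_t)-2 || len == 0)
        return -1;

    return iswpunct((wint_t)wc) != 0;
}
```

Calling it as punct_in_locale ("zh_CN.GB18030", "\241\301", 2) and as
punct_in_locale ("en_US.UTF-8", "\303\227", 2) would compare the two
classifications of U+00D7, assuming locales with those names are
installed; the results depend on the system's locale data.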

> Maybe I am missing a good English specification or lack the
> ability to learn Mandarin...

You don't need to understand Mandarin in order to work with GB18030.
GB18030 is an ASCII-based encoding of all Unicode, like UTF-8, just with
a different mapping table.
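
To make the byte structure concrete, here is a minimal sketch of how a
GB18030 sequence can be sized from its first two bytes, and how the
four-byte sequences for the supplementary planes map linearly onto
U+10000..U+10FFFF. The function names are hypothetical. Two-byte and
BMP four-byte sequences need the actual mapping table, which is not
reproduced here, and the single byte 0x80 is simply treated as invalid.

```c
#include <stddef.h>

/* Length in bytes of the GB18030 sequence at s (n bytes available),
   or 0 if the bytes are invalid or truncated.  */
size_t gb18030_seq_len(const unsigned char *s, size_t n)
{
    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                       /* ASCII, same as in UTF-8 */
    if (s[0] == 0x80 || s[0] == 0xFF || n < 2)
        return 0;
    if (s[1] >= 0x30 && s[1] <= 0x39) { /* digit: four-byte form */
        if (n < 4)
            return 0;
        if (s[2] < 0x81 || s[2] > 0xFE || s[3] < 0x30 || s[3] > 0x39)
            return 0;
        return 4;
    }
    if (s[1] >= 0x40 && s[1] <= 0xFE && s[1] != 0x7F)
        return 2;                       /* two-byte form */
    return 0;
}

/* Code point of a four-byte sequence with lead byte >= 0x90, or -1 if
   the sequence is outside the supplementary planes.  The caller is
   expected to have checked the sequence with gb18030_seq_len.  */
long gb18030_supplementary(const unsigned char *s)
{
    long linear = (((long)(s[0] - 0x90) * 10 + (s[1] - 0x30)) * 126
                   + (s[2] - 0x81)) * 10 + (s[3] - 0x30);
    long cp = 0x10000 + linear;

    return (s[0] >= 0x90 && cp <= 0x10FFFF) ? cp : -1;
}
```

With the two strings from the test, "\241\301" is a two-byte sequence
and "\225\062\202\066" is a four-byte sequence that decodes to U+20000.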

Bruno





