Re: mbcel module for Gnulib?, incomplete multibyte sequences

bug-gnulib

From:	Paul Eggert
Subject:	Re: mbcel module for Gnulib?, incomplete multibyte sequences
Date:	Thu, 27 Jul 2023 22:16:52 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 2023-07-27 12:19, Paul Eggert wrote:


  --- a/lib/mbcel.h
  +++ b/lib/mbcel.h
  @@ -191,3 +191,3 @@ mbcel_scan (char const *p, char const *lim)
     if (_GL_UNLIKELY ((size_t) -1 / 2 < len))
  -    return (mbcel_t) { .err = *p, .len = 1 };
  +    return (mbcel_t) { .err = *p, .len = len == (size_t) -2 ? lim - p: 1 };

Come to think of it, this would merely make mbcel compatible with mbu?iterf?, by causing mbcel to return a length greater than 1 given an incomplete character at input end. But even with this change, mbcel would still not implement the multi-byte-per-encoding-error interpretation ("MEE") behavior that Kuhn and/or the Unicode standard describe. This is because mbu?iterf? doesn't implement MEE either.
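The length rule in the quoted patch can be looked at in isolation. The following is only a minimal sketch: the helper name error_length is hypothetical and is not part of mbcel.h, and it assumes LEN is the value an mbrtowc-style decoder returned for the bytes in [P, LIM).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of mbcel.h.  LEN is assumed to be the
   failing return value of an mbrtowc-style decoder applied to the
   bytes in [P, LIM).  Return the number of bytes to report for the
   encoding error.  */
static size_t
error_length (size_t len, char const *p, char const *lim)
{
  assert (p < lim);
  /* (size_t) -2 means "incomplete character at end of input", so the
     error covers every remaining byte.  Any other failure, i.e.
     (size_t) -1, covers exactly one byte.  */
  return len == (size_t) -2 ? (size_t) (lim - p) : 1;
}
```

With this rule an incomplete character at input end is the only case where the reported error length exceeds 1; an invalid byte in the middle of the input still yields length 1.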

For MEE, mbiterf would need something like the attached untested patch, and mbiter, mbcel, etc. would all need similar patches. I'm not suggesting that we make this change, though, as it would bloat the code for little benefit to many callers.

It would be better to change mbu?iterf? to use single-byte-per-encoding-error ("SEE") behavior, as this is simpler and is more consistent with how Emacs etc. behave. Any programs that need MEE can implement it themselves, or if the need is common enough we could add a Gnulib API that an app can use to support MEE when mbiter/mbcel etc. indicate an encoding error.
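As a rough illustration of what "implement it themselves" might look like, here is a sketch of a caller-side helper that widens a one-byte (SEE) error into a multi-byte (MEE) one. It rests on several assumptions: the input is UTF-8 (Gnulib itself works with the locale's encoding), MEE is read as the Unicode "maximal subpart" practice, and the name mee_error_length is hypothetical rather than an existing Gnulib API. It is not the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, assuming UTF-8 input.  P must point at a byte
   that the decoder already reported as an encoding error in [P, LIM).
   Return the length of the longest prefix that could still begin a
   well-formed UTF-8 sequence, or 1 if there is none.  */
static size_t
mee_error_length (unsigned char const *p, unsigned char const *lim)
{
  assert (p < lim);
  unsigned char b0 = p[0];
  unsigned char lo = 0x80, hi = 0xBF;  /* allowed range of the next byte */
  size_t need;                         /* length of a complete sequence */

  if (0xC2 <= b0 && b0 <= 0xDF)
    need = 2;
  else if (0xE0 <= b0 && b0 <= 0xEF)
    {
      need = 3;
      if (b0 == 0xE0) lo = 0xA0;       /* exclude overlong forms */
      if (b0 == 0xED) hi = 0x9F;       /* exclude surrogates */
    }
  else if (0xF0 <= b0 && b0 <= 0xF4)
    {
      need = 4;
      if (b0 == 0xF0) lo = 0x90;       /* exclude overlong forms */
      if (b0 == 0xF4) hi = 0x8F;       /* exclude values above U+10FFFF */
    }
  else
    return 1;   /* stray continuation byte or invalid lead byte */

  size_t n = 1;
  while (n < need && p + n < lim && lo <= p[n] && p[n] <= hi)
    {
      n++;
      lo = 0x80, hi = 0xBF;   /* later bytes use the plain range */
    }
  return n;
}
```

A caller that iterates with SEE semantics could invoke such a helper only when an encoding error is reported, so callers that do not care about MEE pay nothing for it.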

mbiterf-mee.diff
Description: Text Data
