diffutils-devel

Re: From wchar_t to char32_t, new module mbszero


From: Bruno Haible
Subject: Re: From wchar_t to char32_t, new module mbszero
Date: Sun, 16 Jul 2023 10:43:34 +0200

Paul Eggert wrote:
> > By reading the source code of FreeBSD, NetBSD, OpenBSD, macOS, Solaris,
> > and so on, I can easily determine
> >    - which parts of the mbstate_t mbsinit() tests,
> >    - which parts of the mbstate_t the various functions use.
> > But in order to understand what interdependencies there are, between
> > the various mbstate_t fields, and what are the assumed invariants,
> > I would need to carefully read each of the mentioned files (one per
> > OS and per locale type).
> 
> Yes, and I did that for mbcel - that is, I looked at the source code for 
> every coding system used by mbrtoc32 on NetBSD, OpenBSD, FreeBSD, 
> Darwin, and DragonFly. The analysis was not as hard as one might think, 
> as mbrtoc32 quickly decides whether the state is initial, and mbrtoc32 
> is all that matters for mbcel.
> 
> I doubt whether other primitives like mbrlen would differ, though I did 
> not check this. Also, it's possible I made a mistake in analyzing 
> mbrtoc32, though I hope that's unlikely.

I did that analysis again, more carefully than before, and found
that for macOS, FreeBSD, NetBSD, OpenBSD, and Solaris, zeroing the first
12 bytes of the mbstate_t should be sufficient. (Like you said.)

However, after implementing mbszero with this data and enabling its use
in many places, I got test failures on NetBSD and Solaris.
  - On NetBSD, the minimum we need to clear is 28 bytes.
  - On Solaris OmniOS and OpenIndiana, the minimum we need to clear is 16 bytes.
  - On proprietary Solaris, the minimum we need to clear is 20 or 28 bytes
    (depending on 32-bit or 64-bit mode).
So, clearly this is fragile stuff. I'm committing it nevertheless, since it
seems that we have good enough test coverage to detect future changes.
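
To illustrate the idea for readers of the archive, here is a minimal sketch
of partial clearing. It is not the code from lib/mbszero.c or lib/wchar.in.h.
The names SKETCH_MBSTATE_ZERO_SIZE, sketch_mbszero, and sketch_check are
hypothetical, the only platform figure used is the 28 bytes reported above
for NetBSD, and every other platform falls back to clearing the whole object.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical macro for this sketch: number of leading bytes to clear.
   28 is the NetBSD minimum reported in this message; elsewhere we
   conservatively clear the entire mbstate_t.  */
#if defined __NetBSD__
# define SKETCH_MBSTATE_ZERO_SIZE 28
#else
# define SKETCH_MBSTATE_ZERO_SIZE sizeof (mbstate_t)
#endif

/* Never clear more bytes than the object contains.  */
_Static_assert (SKETCH_MBSTATE_ZERO_SIZE <= sizeof (mbstate_t),
                "zero size must not exceed sizeof (mbstate_t)");

/* Put *PS into an initial state by zeroing only its leading bytes.  */
static inline void
sketch_mbszero (mbstate_t *ps)
{
  memset (ps, 0, SKETCH_MBSTATE_ZERO_SIZE);
}

/* Hypothetical regression check: fill the state with garbage, clear it
   with sketch_mbszero, then verify that the C library treats it as
   initial and can convert one ASCII byte.  Returns 1 on success.  */
static int
sketch_check (void)
{
  mbstate_t state;
  wchar_t wc;

  memset (&state, 0xAA, sizeof state);
  sketch_mbszero (&state);
  if (!mbsinit (&state))
    return 0;
  return mbrtowc (&wc, "a", 1, &state) == 1 && wc == L'a';
}
```

A check of this kind, run on each platform, is the sort of test that would
flag a prefix size that has become too small after a libc change.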


2023-07-16  Bruno Haible  <bruno@clisp.org>

        dfa: Optimize clearing an mbstate_t.
        * lib/dfa.c (mbszero) [GAWK]: Add fallback definition.
        (mbs_to_wchar, lex, addtok_wc, dfaexec_main): Use mbszero.
        * modules/dfa (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        uchar-c23: Optimize clearing an mbstate_t.
        * lib/lc-charset-unicode.c (locale_encoding_to_unicode,
        unicode_to_locale_encoding): Use mbszero.
        * modules/uchar-c23 (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        quotearg: Optimize clearing an mbstate_t.
        * lib/quotearg.c: Include <wchar.h>.
        (quotearg_buffer_restyled): Use mbszero.
        * modules/quotearg (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.
        * lib/vasnprintf.c (VASNPRINTF): Use mbszero.
        * modules/vasnprintf (Depends-on): Add mbszero.
        * modules/vasnwprintf (Depends-on): Likewise.
        * modules/c-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/u8-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/u8-u8-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/u16-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/u16-u16-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/u32-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/u32-u32-vasnprintf (Depends-on): Likewise.
        * modules/unistdio/ulc-vasnprintf (Depends-on): Likewise.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbmemcasecoll: Optimize clearing an mbstate_t.
        * lib/mbmemcasecoll.c (apply_c32tolower): Use mbszero.
        * modules/mbmemcasecoll (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbswidth: Optimize clearing an mbstate_t.
        * lib/mbswidth.c (mbsnwidth): Use mbszero.
        * modules/mbswidth (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbfile: Optimize clearing an mbstate_t.
        * lib/mbfile.h (mbfile_multi_getc, mbf_init): Use mbszero.
        * modules/mbfile (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbuiter: Optimize clearing an mbstate_t.
        * lib/mbuiter.h: Include <wchar.h>.
        (mbuiter_multi_next, mbuiter_multi_copy, mbui_init): Use mbszero.
        * modules/mbuiter (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbiter: Optimize clearing an mbstate_t.
        * lib/mbiter.h: Include <wchar.h>.
        (mbiter_multi_next, mbiter_multi_copy, mbi_init): Use mbszero.
        * modules/mbiter (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        c32stombs: Optimize clearing an mbstate_t.
        * lib/c32stombs.c (c32stombs): Use mbszero.
        * lib/uchar.in.h (c32stombs): Likewise.
        * modules/c32stombs (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbstoc32s: Optimize clearing an mbstate_t.
        * lib/mbstoc32s.c (mbstoc32s): Use mbszero.
        * lib/uchar.in.h (mbstoc32s): Likewise.
        * modules/mbstoc32s (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbstowcs: Optimize clearing an mbstate_t.
        * lib/mbstowcs.c (mbstowcs): Use mbszero.
        * modules/mbstowcs (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        c32tob: Optimize clearing an mbstate_t.
        * lib/c32tob.c (c32tob): Use mbszero.
        * modules/c32tob (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        wctomb: Optimize clearing an mbstate_t.
        * lib/wctomb-impl.h (wctomb): Use mbszero.
        * modules/wctomb (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        btoc32: Optimize clearing an mbstate_t.
        * lib/btoc32.c: Include <wchar.h>.
        (btoc32): Use mbszero.
        * modules/btoc32 (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        btowc: Optimize clearing an mbstate_t.
        * lib/btowc.c (btowc): Use mbszero.
        * modules/btowc (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbrtoc32: Optimize clearing an mbstate_t.
        * lib/mbrtoc32.c (mbrtoc32): Use mbszero.
        * modules/mbrtoc32 (Depends-on): Add mbsinit, mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbtowc: Optimize clearing an mbstate_t.
        * lib/mbtowc-impl.h (mbtowc): Use mbszero.
        * modules/mbtowc (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

        mbszero: New module.
        * lib/wchar.in.h: Include <string.h>.
        (_GL_MBSTATE_INIT_SIZE, _GL_MBSTATE_ZERO_SIZE): New macros.
        (mbszero): New declaration.
        * lib/mbrtoc16.c: Update comments.
        * lib/mbszero.c: New file.
        * m4/wchar_h.m4 (gl_WCHAR_H_REQUIRE_DEFAULTS): Initialize
        GNULIB_MBSZERO.
        * modules/wchar (Depends-on): Add extern-inline.
        (Makefile.am): Substitute GNULIB_MBSZERO.
        * modules/mbszero: New file.

Attachment: 0001-mbszero-New-module.patch
Attachment: 0002-mbtowc-Optimize-clearing-an-mbstate_t.patch
Attachment: 0003-mbrtoc32-Optimize-clearing-an-mbstate_t.patch
Attachment: 0004-btowc-Optimize-clearing-an-mbstate_t.patch
Attachment: 0005-btoc32-Optimize-clearing-an-mbstate_t.patch
Attachment: 0006-wctomb-Optimize-clearing-an-mbstate_t.patch
Attachment: 0007-c32tob-Optimize-clearing-an-mbstate_t.patch
Attachment: 0008-mbstowcs-Optimize-clearing-an-mbstate_t.patch
Attachment: 0009-mbstoc32s-Optimize-clearing-an-mbstate_t.patch
Attachment: 0010-c32stombs-Optimize-clearing-an-mbstate_t.patch
Attachment: 0011-mbiter-Optimize-clearing-an-mbstate_t.patch
Attachment: 0012-mbuiter-Optimize-clearing-an-mbstate_t.patch
Attachment: 0013-mbfile-Optimize-clearing-an-mbstate_t.patch
Attachment: 0014-mbswidth-Optimize-clearing-an-mbstate_t.patch
Attachment: 0015-mbmemcasecoll-Optimize-clearing-an-mbstate_t.patch
Attachment: 0016-vasnprintf-vasnwprintf-Optimize-clearing-an-mbstate_.patch
Attachment: 0017-quotearg-Optimize-clearing-an-mbstate_t.patch
Attachment: 0018-uchar-c23-Optimize-clearing-an-mbstate_t.patch
Attachment: 0019-dfa-Optimize-clearing-an-mbstate_t.patch
