From: Bruno Haible
Subject: Re: From wchar_t to char32_t, new module mbszero
Date: Sun, 16 Jul 2023 10:43:34 +0200

Paul Eggert wrote:
> > By reading the source code of FreeBSD, NetBSD, OpenBSD, macOS, Solaris,
> > and so on, I can easily determine
> >   - which parts of the mbstate_t mbsinit() tests,
> >   - which parts of the mbstate_t the various functions use.
> > But in order to understand what interdependencies there are, between
> > the various mbstate_t fields, and what are the assumed invariants,
> > I would need to carefully read each of the mentioned files (one per
> > OS and per locale type).
>
> Yes, and I did that for mbcel - that is, I looked at the source code for
> every coding system used by mbrtoc32 on NetBSD, OpenBSD, FreeBSD,
> Darwin, and DragonFly. The analysis was not as hard as one might think,
> as mbrtoc32 quickly decides whether the state is initial, and mbrtoc32
> is all that matters for mbcel.
>
> I doubt whether other primitives like mbrlen would differ, though I did
> not check this. Also, it's possible I made a mistake in analyzing
> mbrtoc32, though I hope that's unlikely.

I did that analysis again, more carefully than previously, and found that
for macOS, FreeBSD, NetBSD, OpenBSD, Solaris, zeroing the first 12 bytes of
the mbstate_t should be sufficient. (Like you said.)

However, after implementing mbszero with this data and enabling its use in
many places, I got test failures on NetBSD and Solaris.
  - On NetBSD, the minimum we need to clear is 28 bytes.
  - On Solaris OmniOS and OpenIndiana, the minimum we need to clear is
    16 bytes.
  - On proprietary Solaris, the minimum we need to clear is 20 or 28 bytes
    (depending on 32-bit or 64-bit mode).

So, clearly this is fragile stuff. I'm committing it nevertheless, since it
seems that we have a good enough test coverage to detect future changes.


2023-07-16  Bruno Haible  <bruno@clisp.org>

	dfa: Optimize clearing an mbstate_t.
	* lib/dfa.c (mbszero) [GAWK]: Add fallback definition.
	(mbs_to_wchar, lex, addtok_wc, dfaexec_main): Use mbszero.
	* modules/dfa (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	uchar-c23: Optimize clearing an mbstate_t.
	* lib/lc-charset-unicode.c (locale_encoding_to_unicode,
	unicode_to_locale_encoding): Use mbszero.
	* modules/uchar-c23 (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	quotearg: Optimize clearing an mbstate_t.
	* lib/quotearg.c: Include <wchar.h>.
	(quotearg_buffer_restyled): Use mbszero.
	* modules/quotearg (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.
	* lib/vasnprintf.c (VASNPRINTF): Use mbszero.
	* modules/vasnprintf (Depends-on): Add mbszero.
	* modules/vasnwprintf (Depends-on): Likewise.
	* modules/c-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u8-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u8-u8-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u16-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u16-u16-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u32-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u32-u32-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/ulc-vasnprintf (Depends-on): Likewise.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbmemcasecoll: Optimize clearing an mbstate_t.
	* lib/mbmemcasecoll.c (apply_c32tolower): Use mbszero.
	* modules/mbmemcasecoll (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbswidth: Optimize clearing an mbstate_t.
	* lib/mbswidth.c (mbsnwidth): Use mbszero.
	* modules/mbswidth (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbfile: Optimize clearing an mbstate_t.
	* lib/mbfile.h (mbfile_multi_getc, mbf_init): Use mbszero.
	* modules/mbfile (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbuiter: Optimize clearing an mbstate_t.
	* lib/mbuiter.h: Include <wchar.h>.
	(mbuiter_multi_next, mbuiter_multi_copy, mbui_init): Use mbszero.
	* modules/mbuiter (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbiter: Optimize clearing an mbstate_t.
	* lib/mbiter.h: Include <wchar.h>.
	(mbiter_multi_next, mbiter_multi_copy, mbi_init): Use mbszero.
	* modules/mbiter (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	c32stombs: Optimize clearing an mbstate_t.
	* lib/c32stombs.c (c32stombs): Use mbszero.
	* lib/uchar.in.h (c32stombs): Likewise.
	* modules/c32stombs (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbstoc32s: Optimize clearing an mbstate_t.
	* lib/mbstoc32s.c (mbstoc32s): Use mbszero.
	* lib/uchar.in.h (mbstoc32s): Likewise.
	* modules/mbstoc32s (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbstowcs: Optimize clearing an mbstate_t.
	* lib/mbstowcs.c (mbstowcs): Use mbszero.
	* modules/mbstowcs (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	c32tob: Optimize clearing an mbstate_t.
	* lib/c32tob.c (c32tob): Use mbszero.
	* modules/c32tob (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	wctomb: Optimize clearing an mbstate_t.
	* lib/wctomb-impl.h (wctomb): Use mbszero.
	* modules/wctomb (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	btoc32: Optimize clearing an mbstate_t.
	* lib/btoc32.c: Include <wchar.h>.
	(btoc32): Use mbszero.
	* modules/btoc32 (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	btowc: Optimize clearing an mbstate_t.
	* lib/btowc.c (btowc): Use mbszero.
	* modules/btowc (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbrtoc32: Optimize clearing an mbstate_t.
	* lib/mbrtoc32.c (mbrtoc32): Use mbszero.
	* modules/mbrtoc32 (Depends-on): Add mbsinit, mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbtowc: Optimize clearing an mbstate_t.
	* lib/mbtowc-impl.h (mbtowc): Use mbszero.
	* modules/mbtowc (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <bruno@clisp.org>

	mbszero: New module.
	* lib/wchar.in.h: Include <string.h>.
	(_GL_MBSTATE_INIT_SIZE, _GL_MBSTATE_ZERO_SIZE): New macros.
	(mbszero): New declaration.
	* lib/mbrtoc16.c: Update comments.
	* lib/mbszero.c: New file.
	* m4/wchar_h.m4 (gl_WCHAR_H_REQUIRE_DEFAULTS): Initialize
	GNULIB_MBSZERO.
	* modules/wchar (Depends-on): Add extern-inline.
	(Makefile.am): Substitute GNULIB_MBSZERO.
	* modules/mbszero: New file.
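
For readers who want to see the idea from the message in code, here is a
minimal sketch of "clear only a prefix of the mbstate_t". It is not the
code from the patches. The names MY_MBSTATE_ZERO_SIZE, my_mbszero and
my_reset_is_initial are hypothetical, and the prefix size defaults to
sizeof (mbstate_t), which is always safe. A smaller per-platform value
(such as the byte counts quoted in the message) would have to be
verified against that platform's libc before use.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical macro: number of leading bytes of an mbstate_t that must
   be zero for the state to count as initial.  Assumption: when nothing
   is known about the platform, clear the whole object.  */
#ifndef MY_MBSTATE_ZERO_SIZE
# define MY_MBSTATE_ZERO_SIZE sizeof (mbstate_t)
#endif

/* Hypothetical helper: put *PS into the initial conversion state by
   zeroing only the first MY_MBSTATE_ZERO_SIZE bytes.  */
static inline void
my_mbszero (mbstate_t *ps)
{
  memset (ps, 0, MY_MBSTATE_ZERO_SIZE);
}

/* Fill a state with garbage, reset it, and report whether mbsinit()
   then considers it initial.  Returns 1 if so, 0 otherwise.  */
static int
my_reset_is_initial (void)
{
  mbstate_t state;

  memset (&state, 0xFF, sizeof state);  /* simulate a dirty state */
  my_mbszero (&state);
  return mbsinit (&state) != 0;
}
```

With the default size, my_reset_is_initial () returns 1 on any conforming
libc, because a fully zeroed mbstate_t is an initial state. Building with
a smaller -DMY_MBSTATE_ZERO_SIZE=N and running the same check in each
locale of interest is one way to observe the kind of failures described
above.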

Attachments (19 patches, text data):
  0001-mbszero-New-module.patch
  0002-mbtowc-Optimize-clearing-an-mbstate_t.patch
  0003-mbrtoc32-Optimize-clearing-an-mbstate_t.patch
  0004-btowc-Optimize-clearing-an-mbstate_t.patch
  0005-btoc32-Optimize-clearing-an-mbstate_t.patch
  0006-wctomb-Optimize-clearing-an-mbstate_t.patch
  0007-c32tob-Optimize-clearing-an-mbstate_t.patch
  0008-mbstowcs-Optimize-clearing-an-mbstate_t.patch
  0009-mbstoc32s-Optimize-clearing-an-mbstate_t.patch
  0010-c32stombs-Optimize-clearing-an-mbstate_t.patch
  0011-mbiter-Optimize-clearing-an-mbstate_t.patch
  0012-mbuiter-Optimize-clearing-an-mbstate_t.patch
  0013-mbfile-Optimize-clearing-an-mbstate_t.patch
  0014-mbswidth-Optimize-clearing-an-mbstate_t.patch
  0015-mbmemcasecoll-Optimize-clearing-an-mbstate_t.patch
  0016-vasnprintf-vasnwprintf-Optimize-clearing-an-mbstate_.patch
  0017-quotearg-Optimize-clearing-an-mbstate_t.patch
  0018-uchar-c23-Optimize-clearing-an-mbstate_t.patch
  0019-dfa-Optimize-clearing-an-mbstate_t.patch