`texindex` output depends on locale settings
From: Werner LEMBERG
Subject: `texindex` output depends on locale settings
Date: Sun, 06 Nov 2022 10:02:44 +0000 (UTC)
[texindex (GNU texinfo) 6.8dev]
[GNU Awk 4.2.1, API: 2.0]
[openSUSE Leap 15.4]
There are two bugs in texindex that make it basically unusable for
documents whose main language is not English. For the report below,
here is an input file (`sort-ca.texi`).
```
\input texinfo.tex
@documentencoding UTF-8
@documentlanguage ca
@findex a
@findex à
@findex u
@findex ù
@printindex fn
@bye
```
* The first, really severe bug is that the resulting output is
completely broken if `texindex` is called with `LANG=C`. Saying
```
LANG=C texi2pdf sort-ca.texi
```
creates the following `.fns` output
```
\initial {0xc3}
\entry{\code {à}}{1}
\entry{\code {ù}}{1}
\initial {A}
\entry{\code {a}}{1}
\initial {U}
\entry{\code {u}}{1}
```
As can be seen, the first `\initial` line contains a single raw byte
(shown here as '0xc3'; in the file it is the actual byte, which is the
first byte of the UTF-8 encoding of both 'à' and 'ù'). Surprisingly,
this doesn't make pdftex abort, but both xetex and luatex stop with
errors. I have to use a UTF-8 locale like `en_US.utf8` to get decent
output.
I consider it very bad that `texindex` is locale-dependent. IMHO
the proper solution is to make `texinfo.tex` emit a document
encoding statement to the (unsorted) index file that in turn gets
acknowledged by `texindex`.
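To illustrate where the lone 0xc3 comes from, here is a minimal Python
sketch. It is not how `texindex` is implemented; the function names are
made up for this example. It only contrasts taking the first byte of a
UTF-8 encoded key with taking its first character.

```python
# Illustration only; this is not texindex's actual code.
# The function names below are hypothetical.
KEYS = ["\u00e0", "\u00f9", "a", "u"]  # à, ù, a, u as in the sample file


def initial_bytewise(key: str) -> bytes:
    """First *byte* of the UTF-8 encoded key (what a byte-oriented tool sees)."""
    return key.encode("utf-8")[:1]


def initial_charwise(key: str) -> str:
    """First *character* of the key, uppercased (what a UTF-8 aware tool sees)."""
    return key[:1].upper()


for key in KEYS:
    # ascii() keeps the printout independent of the terminal encoding.
    print(ascii(key), initial_bytewise(key), ascii(initial_charwise(key)))
```

Both 'à' (bytes c3 a0) and 'ù' (bytes c3 b9) share the lead byte 0xc3,
which matches the single `\initial {0xc3}` group in the output above.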
* While `texindex` is sensitive to the locale regarding the input
encoding, it isn't for collation: any `LANG` or `LC_COLLATE` setting
gets ignored. Similarly, it ignores the `@documentlanguage`
instruction to derive a sorting order. For example, the Catalan
order for the above example should be 'aàuù'; however, in the output
it is sorted as 'àùau'.
The proper fix would be to make `texinfo.tex` emit a document
language statement to the (unsorted) index file that in turn gets
acknowledged by `texindex`.
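As a rough sketch of the expected ordering (again in Python, not
`texindex` code): the key function below is an assumption that only
approximates language-aware collation by comparing base letters first
and using the accented form as a tie-breaker. Real Catalan collation
would need proper locale or CLDR data.

```python
import unicodedata

# Illustration only; the helper names are hypothetical and the key
# function is a crude approximation, not real Catalan collation.


def base_letters(key: str) -> str:
    """Strip combining marks after canonical decomposition (à -> a)."""
    decomposed = unicodedata.normalize("NFD", key)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_index_keys(keys):
    """Sort by base letters first; the original key breaks ties."""
    return sorted(keys, key=lambda k: (base_letters(k).casefold(), k))


KEYS = ["\u00e0", "\u00f9", "a", "u"]  # à, ù, a, u as in the sample file

# ascii() keeps the printout independent of the terminal encoding.
print(ascii(sorted(KEYS)))           # plain code point order
print(ascii(sort_index_keys(KEYS)))  # accent-insensitive primary order
```

With this key the four entries come out as 'aàuù', the order given above
for Catalan, whereas plain code point order puts both accented letters
after 'u'.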
Werner