bug-coreutils

bug#50336: Width format specifier is calculated wrong for nb_NO locale


From: Pádraig Brady
Subject: bug#50336: Width format specifier is calculated wrong for nb_NO locale
Date: Thu, 2 Sep 2021 16:14:54 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

tag 50336 notabug
close 50336
stop

On 02/09/2021 13:49, Carl-Erik Kopseng wrote:
> I just noticed that the width specifier for numeric parameters does some
> weird calculations when the specified locale is `nb_NO.utf8`. For instance,
> the number formatting rules for US and NO both result in the same number of
> characters (with ' ' instead of ','), but the Norwegian version lacks two
> spaces in the padded output. This must be a bug, no?
>
> ```
> $ LC_NUMERIC=en_US.utf8 printf "%'7d%s\n" 1234 XXX
>    1,234XXX
>
> $ LC_NUMERIC=nb_NO.utf8 printf "%'7d%s\n" 1234 XXX
> 1 234XXX
> ```

Note one must be careful with printf, as the shell builtin
is often what actually runs rather than the coreutils binary.
That's not at issue here though, as both the shell builtin
and coreutils call down to the libc printf implementation.

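The particular issue is that the grouping character used
in the nb_NO.utf8 locale is multi-byte.
Specifically: e2 80 af (U+202F NARROW NO-BREAK SPACE in UTF-8).
So that character counts as 3 bytes,
and the printf implementation is counting bytes,
not characters or display cells.
The grouped number is therefore already 7 bytes long
(4 digits plus the 3-byte separator), so %'7d adds no padding.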
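
To make the byte counting concrete, here is a minimal Python sketch.
It does not call libc; pad_by_bytes is a hypothetical helper that only
mimics padding to a field width measured in UTF-8 bytes, and the two
grouped strings are written out by hand rather than taken from a locale.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, encoded in UTF-8 as e2 80 af


def pad_by_bytes(text: str, width: int) -> str:
    """Right-align text in a field whose width is counted in UTF-8 bytes.

    Hypothetical helper for illustration only; it mimics the behaviour
    described above and is not how libc is implemented.
    """
    used = len(text.encode("utf-8"))
    return " " * max(0, width - used) + text


us = "1,234"              # en_US style grouping, 5 characters, 5 bytes
no = "1" + NNBSP + "234"  # nb_NO style grouping, 5 characters, 7 bytes

for label, value in (("en_US", us), ("nb_NO", no)):
    padded = pad_by_bytes(value, 7)
    spaces = len(padded) - len(value)
    print(label, "chars:", len(value),
          "bytes:", len(value.encode("utf-8")),
          "padding added:", spaces)
```

Both strings have 5 characters, but only the en_US one leaves room
for padding when the width of 7 is measured in bytes.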

Given the usual consideration is display width,
it probably should be considering display cells,
but that's an issue for libc, not coreutils.

Note coreutils does need to handle alignment in various places,
and for that it uses the following module,
which handles this more generally:
https://github.com/coreutils/coreutils/blob/master/gl/lib/mbsalign.c
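
As a rough illustration of aligning by display cells instead of bytes,
here is a Python sketch. It is not a port of mbsalign.c: display_width
and pad_by_cells are hypothetical helpers, and the width rule (0 cells
for combining marks, 2 for East Asian wide or fullwidth characters,
1 otherwise) is only an approximation of what wcwidth() would report.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal cells text occupies.

    Assumption: combining marks take 0 cells, East Asian wide and
    fullwidth characters take 2, everything else takes 1.
    """
    cells = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        cells += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cells


def pad_by_cells(text: str, width: int) -> str:
    """Right-align text in a field measured in display cells."""
    return " " * max(0, width - display_width(text)) + text


grouped = "1\u202f234"  # nb_NO style grouping with U+202F as separator
aligned = pad_by_cells(grouped, 7)
print("cells before padding:", display_width(grouped))
print("cells after padding: ", display_width(aligned))
print("spaces added:        ", len(aligned) - len(grouped))
```

Measured this way the Norwegian string gets the same two padding
spaces as the en_US one, which is the result the original report expected.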

Closing this as not a coreutils-specific bug.

cheers,
Pádraig
