Re: UTF-8 printf string formating problem
From: Chris Down
Subject: Re: UTF-8 printf string formating problem
Date: Sun, 6 Apr 2014 13:42:47 +0800
User-agent: Mutt/1.5.23 (2014-03-12)
Jan Novak writes:
> printf string format counts bytes instead of chars, which leads to
> broken output
According to POSIX, printf's field width control is strictly in bytes,
not characters.[0]
> field width:
> An optional string of decimal digits to specify a minimum field
> width. For an output field, if the converted value has fewer
> bytes than the field width, it shall be padded on the left (or
> right, if the left-adjustment flag ( '-' ), described below, has
> been given) to the field width.
By that definition, this is expected behaviour. You will also find this
behaviour in pretty much any POSIX-y tool that uses format strings
(mawk/gawk do it).
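As a rough illustration of the byte/character mismatch, here is a minimal
sketch of a workaround. The helper names byte_len and pad_chars are
hypothetical (they are not part of bash or POSIX), and the sketch assumes
that printf pads by bytes as quoted above and that ${#var} counts
characters in the current locale:

```shell
# byte_len STRING
# Print the length of STRING in bytes, independent of the locale.
byte_len() (
    # wc -c counts bytes; the arithmetic expansion strips any padding
    # that some wc implementations put in front of the number.
    printf '%s\n' "$(( $(printf '%s' "$1" | wc -c) ))"
)

# pad_chars WIDTH STRING
# Left-justify STRING in a field of WIDTH characters (not bytes).
# Assumption: printf's field width is counted in bytes, so the width is
# widened by the number of "extra" bytes in multibyte characters.
pad_chars() (
    width=$1
    str=$2
    bytes=$(byte_len "$str")
    chars=${#str}    # characters in a multibyte-aware shell and locale
    printf '%-*s' "$(( width + bytes - chars ))" "$str"
)

# Example: print a two-column row with a non-ASCII first column.
pad_chars 10 'Dvořák'
printf '| %s\n' 'composer'
```

In a shell or locale where ${#var} already counts bytes, the adjustment is
zero and pad_chars behaves exactly like a plain printf '%-10s'. Note that
this only equalises character counts; it does nothing about double-width
or combining characters.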
I don't have much of an opinion on whether this behaviour is right or
wrong in the context of bash, but if this behaviour is changed, I think
it should be done under another format character, rather than changing
%s (or changing behaviour when not in POSIX-compliance mode).
0: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html