Re: NSString bug with test and really dodgy patch.
From: Jens Ayton
Subject: Re: NSString bug with test and really dodgy patch.
Date: Wed, 3 Oct 2012 11:19:29 +0200

On Oct 3, 2012, at 09:53, Richard Frith-Macdonald <address@hidden> wrote:
>
> So I'm not sure what to do ... the C standards have changed from working with
> characters to working with bytes (which is good),
Well, no. In the C standard, "character" generally means the same thing as
"byte" (i.e., a value that can fit in a char). In point of fact, the standard
provides two conflicting normative definitions of "character" (one marked
<abstract>, the other <C>), but in the specification of [f]printf() it seems
that character = byte is what is meant. Both the C99 and C11 final drafts have
a footnote saying "No special provisions are made for multibyte characters."

The sentence "In no case is a partial multibyte character written." applies
only to the %ls conversion, i.e. when converting a wchar_t* string into a
possibly multibyte sequence of chars.
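
To make the byte-counting behaviour of %s concrete, here is a minimal sketch.
The helper name format_with_precision is made up for illustration; it only
wraps the standard snprintf() call with a "%.*s" conversion. Given a UTF-8
encoded string, the precision counts bytes, so the output can end in the
middle of a multibyte character.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from any library): formats src into dst using
 * "%.*s" and returns the number of bytes stored, excluding the NUL. */
static size_t format_with_precision(char *dst, size_t dst_size,
                                    const char *src, int precision)
{
    assert(dst != NULL && dst_size > 0 && src != NULL && precision >= 0);

    /* For %s in [f]printf, the precision is a count of bytes, not of
     * multibyte characters. */
    int n = snprintf(dst, dst_size, "%.*s", precision, src);
    if (n < 0) {
        dst[0] = '\0';
        return 0;
    }
    /* snprintf reports the untruncated length; clamp to what was stored. */
    return (size_t)n < dst_size ? (size_t)n : dst_size - 1;
}
```

Calling it on the bytes 'a', 0xC3, 0xA9, 'z' (UTF-8 for "a", e-acute, "z")
with a precision of 2 stores 'a' followed by the lone lead byte 0xC3, which is
not valid UTF-8 on its own.
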

The closest analogue to NSString formatting is using %s in [f]wprintf(). In
this case, characters (i.e., bytes) from the string are converted "as if by
repeated calls to the mbrtowc function" (with sane initial state), and the
precision limits the number of wide characters to be written. This is
unproblematic because each wchar_t is required to represent a complete
character, whereas a Foundation unichar can be one half of a UTF-16 surrogate
pair, so this still doesn't resolve the issue.
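
The surrogate problem can be sketched in plain C. Everything named here is an
assumption for illustration: unichar16 stands in for Foundation's unichar, and
safe_truncated_length shows just one possible policy (back up by one code unit
rather than split a pair). It is not a claim about what Cocoa or GNUstep
actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Foundation's unichar: one UTF-16 code unit. */
typedef uint16_t unichar16;

static int is_high_surrogate(unichar16 u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(unichar16 u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Hypothetical policy: keep at most `precision` code units, but never cut
 * between the two halves of a surrogate pair. Returns the number of code
 * units to keep. */
static size_t safe_truncated_length(const unichar16 *s, size_t len,
                                    size_t precision)
{
    assert(s != NULL || len == 0);

    size_t n = precision < len ? precision : len;

    /* If the last kept unit is a high surrogate and the first dropped unit
     * is its low surrogate, drop the high surrogate as well. */
    if (n > 0 && n < len &&
        is_high_surrogate(s[n - 1]) && is_low_surrogate(s[n])) {
        n--;
    }
    return n;
}
```

For the sequence 0x0041, 0xD83D, 0xDE00, 0x0042, a precision of 2 would
otherwise keep 'A' plus a lone high surrogate; under this policy only 'A' is
kept. Whether truncating short like this, or emitting the lone surrogate, is
the right behaviour is exactly the open question.
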
In summary, "figure out what Cocoa does." :-)
--
Jens Ayton