From: David Chisnall
Subject: GSUnicodeString and NSString disagree on rangeOfComposedCharacterSequenceAtIndex:
Date: Sat, 7 Apr 2018 19:51:25 +0100
Hello the list,
I am testing out a new version of the compiler / runtime that is producing
NSConstantString instances with UTF-16 data. I have currently disabled a lot
of the NSConstantString optimisations, on the basis of ‘make it work then make
it fast’ and I’m still seeing quite a lot of test failures. The most recent
ones seem to come from the fact that GSUnicodeString’s implementation of
rangeOfComposedCharacterSequenceAtIndex: calls rangeOfSequence_u(), which
returns a different range to NSString’s implementation.
I have ls (a GSUnicodeString) and indianLong (a UTF-16 NSConstantString) from
NSString/test00.m. If I call -getCharacters:range: on both, I get the same
set of characters for [indianLong length] characters, as expected. However,
searching for indianLong in ls does not find it. After sticking in a lot of
debugging code, I eventually tracked the failure down to this disagreement:
when I comment out GSUnicodeString’s implementation of
rangeOfComposedCharacterSequenceAtIndex: so that it uses the superclass
implementation, the test passes.
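
To illustrate the kind of disagreement described above, here is a minimal
sketch in plain C. It is not taken from the GNUstep sources and makes no claim
about what rangeOfSequence_u() or NSString actually do. The two predicates
(nonbase_narrow, nonbase_wide), the Range struct, and the helper names are all
hypothetical: they only show how two implementations that classify different
code units as "non-base" can return different composed-sequence ranges for the
same UTF-16 buffer, and how to find the first index where they differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NSRange. */
typedef struct { size_t location; size_t length; } Range;

/* A predicate deciding whether a UTF-16 code unit attaches to the
 * preceding base character. Both predicates below are assumptions
 * made for illustration, not the rules used by GNUstep. */
typedef int (*NonBasePredicate)(uint16_t c);

/* Assumption A: only the Combining Diacritical Marks block. */
static int nonbase_narrow(uint16_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Assumption B: as A, plus the Devanagari dependent vowel signs
 * and virama (U+093E..U+094D). */
static int nonbase_wide(uint16_t c)
{
    return nonbase_narrow(c) || (c >= 0x093E && c <= 0x094D);
}

/* Range of the composed sequence containing s[idx]: walk back to the
 * base character, then forward over every following non-base unit. */
static Range composed_range(const uint16_t *s, size_t len, size_t idx,
                            NonBasePredicate is_nonbase)
{
    assert(idx < len);
    size_t start = idx;
    while (start > 0 && is_nonbase(s[start])) {
        start--;
    }
    size_t end = start + 1;
    while (end < len && is_nonbase(s[end])) {
        end++;
    }
    Range r = { start, end - start };
    return r;
}

/* First index at which the two predicates give different ranges,
 * or len if they agree everywhere. */
static size_t first_disagreement(const uint16_t *s, size_t len,
                                 NonBasePredicate a, NonBasePredicate b)
{
    for (size_t i = 0; i < len; i++) {
        Range ra = composed_range(s, len, i, a);
        Range rb = composed_range(s, len, i, b);
        if (ra.location != rb.location || ra.length != rb.length) {
            return i;
        }
    }
    return len;
}
```

For the two-unit buffer { 0x0915, 0x093F }, the narrow predicate gives the
range {0, 1} at index 0 while the wide one gives {0, 2}, so a search that
steps through the string by composed sequence would walk it differently
depending on which implementation it called.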
Please can someone who understands these bits of exciting Unicode logic take a
look and see if there’s any reason for the disagreement?
I’m now hitting a failure in the unichar_const tests, because for some reason a
GSMutableString and a (UTF-16) NSConstantString are not comparing equal, in
spite of having the same hash, the same length, and the same values for both
characters...
David