gnustep-dev

From: David Chisnall
Subject: GSUnicodeString and NSString disagree on rangeOfComposedCharacterSequenceAtIndex:
Date: Sat, 7 Apr 2018 19:51:25 +0100

Hello the list,

I am testing out a new version of the compiler / runtime that is producing 
NSConstantString instances with UTF-16 data.  I have currently disabled a lot 
of the NSConstantString optimisations, on the basis of ‘make it work then make 
it fast’ and I’m still seeing quite a lot of test failures.  The most recent 
ones seem to come from the fact that GSUnicodeString’s implementation of 
rangeOfComposedCharacterSequenceAtIndex: calls rangeOfSequence_u(), which 
returns a different range to NSString’s implementation.

I have ls (a GSUnicodeString) and indianLong (a UTF-16 NSConstantString) from 
NSString/test00.m. If I call -getCharacters:range: on both, I get the same 
characters for the first [indianLong length] characters, as expected.  However, 
searching for indianLong in ls does not find it.  After adding a lot of 
debugging code, I eventually tracked this down to the disagreement described 
above: when I comment out GSUnicodeString’s implementation of 
rangeOfComposedCharacterSequenceAtIndex: so that it uses the superclass 
implementation, the test passes.
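
To illustrate the kind of disagreement I mean, here is a minimal, self-contained C sketch. It is not the GNUstep code: composed_range(), first_disagreement() and both nonbase_* predicates are hypothetical names made up for this example, and the choice of code point ranges is only an assumption about how two implementations could differ in what they treat as a non-base character.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t unichar;                 /* one UTF-16 code unit */
typedef struct { size_t location, length; } Range;

/* Hypothetical predicate type: does this code unit attach to a preceding base? */
typedef int (*NonBaseFn)(unichar);

/* Assumption for illustration: a narrow definition that only knows the
 * Combining Diacritical Marks block (U+0300 to U+036F). */
static int nonbase_narrow(unichar c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Assumption for illustration: a wider definition that also counts the
 * Devanagari dependent vowel signs and virama (U+093E to U+094D). */
static int nonbase_wide(unichar c)
{
    return nonbase_narrow(c) || (c >= 0x093E && c <= 0x094D);
}

/* Range of the composed sequence containing idx: walk back to the base
 * character, then forward over every following non-base character. */
static Range composed_range(const unichar *s, size_t len, size_t idx,
                            NonBaseFn nonbase)
{
    size_t start = idx;
    size_t end = idx + 1;

    while (start > 0 && nonbase(s[start]))
        start--;
    while (end < len && nonbase(s[end]))
        end++;

    Range r = { start, end - start };
    return r;
}

/* Returns the first index at which the two definitions produce different
 * ranges, or -1 if they agree at every index. */
static long first_disagreement(const unichar *s, size_t len,
                               NonBaseFn a, NonBaseFn b)
{
    for (size_t i = 0; i < len; i++) {
        Range ra = composed_range(s, len, i, a);
        Range rb = composed_range(s, len, i, b);
        if (ra.location != rb.location || ra.length != rb.length)
            return (long)i;
    }
    return -1;
}
```

With the two code units U+0915 U+093F, the narrow definition reports a sequence of length 1 at index 0 and the wide one reports length 2, so first_disagreement() returns 0. A search that steps through one string by composed sequences would then see different boundaries depending on which implementation it ends up calling, even though -getCharacters:range: returns identical data.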

Please can someone who understands these bits of exciting unicode logic take a 
look and see if there’s any reason for the disagreement?

I’m now hitting a failure in the unichar_const tests, because for some reason a 
GSMutableString and a (UTF-16) NSConstantString are not comparing equal, in 
spite of having the same hash, the same length, and the same values for both 
characters...

David



