gnustep-dev

Re: New ABI NSConstantString


From: Richard Frith-Macdonald
Subject: Re: New ABI NSConstantString
Date: Sun, 1 Apr 2018 14:06:30 +0100


> On 1 Apr 2018, at 12:21, David Chisnall <address@hidden> wrote:
> 
> On 1 Apr 2018, at 11:36, Fred Kiefer <address@hidden> wrote:
>> 
>> Wouldn’t the most useful structure be the one we already use for GSString?
> 
> That’s certainly a good starting point!
> 
>> 
>> @interface GSString : NSString
>> {
>> @public
>> GSCharPtr _contents;
>> unsigned int _count;
> 
> Is this the number of bytes or the number of characters?  I imagine that both 
> are useful.

That's the character count.

>> struct {
>>   unsigned int       wide: 1;        // 16-bit characters in string?
>>   unsigned int       owned: 1;       // Set if the instance owns the
>>                                      // _contents buffer
> 
> Owned is presumably redundant for constant strings.

Yep.  In a constant string you could just consider it a bit reserved for 
mutable strings.

>>   unsigned int       unused: 2;
>>   unsigned int       hash: 28;
>> } _flags;
>> }
>> @end
>> 
>> Of course constant strings won’t require the hidden reference count that 
>> comes with all ObjC objects. But apart from that it seems to be a more useful 
>> structure. Storing the length with the string should speed up some common 
>> operations and 28 bits of hash should still be enough. There are even two 
>> unused bits in the flags that could encode the specific hash function.
> 
> I’d like to have more than 2 bits spare for future expansion.  The current 
> NXConstantString structure is now 30 years old, and I think there have been 
> several times in the past when it would have been nice to add other things to 
> it if we’d had a good way of maintaining compatibility.
> 
> This structure does have the advantage that it doesn’t need padding on any 
> 32- or 64-bit architectures.


> Do we have any measurements to tell us that 28 bits is enough for the hash?

I don't think so, but with a good hash that gets us over a hundred million 
strings held efficiently in a set/dictionary, which seems plenty for now.
However, if the idea is to future-proof things in the ABI, perhaps 28 bits is 
not enough.
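
For a rough sense of scale, a 28-bit field can hold 2^28 = 268,435,456 distinct values. The sketch below only illustrates that arithmetic and how a wider hash would be truncated to fit; the names HASH_BITS, HASH_MASK, hash_value_count and truncate_hash are made up for this example and are not part of GNUstep-base.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define HASH_BITS 28u
#define HASH_MASK ((UINT32_C(1) << HASH_BITS) - 1u)   /* 0x0FFFFFFF */

static_assert(HASH_MASK == UINT32_C(0x0FFFFFFF), "low 28 bits set");

/* Number of distinct values a 28-bit hash field can represent. */
static inline uint32_t hash_value_count(void)
{
  return UINT32_C(1) << HASH_BITS;
}

/* Truncate a full 32-bit hash to the bits that fit in the field. */
static inline uint32_t truncate_hash(uint32_t full)
{
  return full & HASH_MASK;
}
```

Whatever hash function is used, the top four bits of a 32-bit result are simply lost in this scheme, so two hashes that differ only there would collide.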

> At some point, I’d like to move the hash implementation for NSString to 
> MurmurHash3, which should give better distribution and is very fast on modern 
> hardware.

Yes.  GNUstep-base has MurmurHash3 support, and perhaps it's time it was made 
the default.

> I’m also a bit nervous about using C bitfields in static data structures, 
> because their layout is ABI dependent (and on some platforms can change 
> between compiler versions).

I wasn't aware of that ... it would make sense for your new ABI, when 
individual bits are needed, to have them specified as particular bits rather 
than as a bitfield, avoiding the possibility of problems with different compilers.
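
As a minimal sketch of what "particular bits" could look like, the flags can live in a fixed-width integer with explicit masks and shifts, so the layout does not depend on how a compiler packs bitfields. The bit positions and all STR_* / str_* names below are assumptions made for illustration; nothing here is an agreed ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag word: bit positions are illustrative only. */
#define STR_FLAG_WIDE   (UINT32_C(1) << 0)   /* 16-bit characters in string */
#define STR_FLAG_OWNED  (UINT32_C(1) << 1)   /* instance owns the buffer */
#define STR_HASH_SHIFT  4u                   /* bits 2-3 left unused */
#define STR_HASH_MASK   (UINT32_C(0x0FFFFFFF) << STR_HASH_SHIFT)

static_assert((STR_HASH_MASK & (STR_FLAG_WIDE | STR_FLAG_OWNED)) == 0,
              "hash field must not overlap the flag bits");

/* Test a single flag bit. */
static inline int str_is_wide(uint32_t flags)
{
  return (flags & STR_FLAG_WIDE) != 0;
}

/* Extract the 28-bit hash stored in the upper bits. */
static inline uint32_t str_get_hash(uint32_t flags)
{
  return (flags & STR_HASH_MASK) >> STR_HASH_SHIFT;
}

/* Return flags with the hash field replaced; other bits are preserved. */
static inline uint32_t str_set_hash(uint32_t flags, uint32_t hash)
{
  return (flags & ~STR_HASH_MASK) | ((hash << STR_HASH_SHIFT) & STR_HASH_MASK);
}
```

Because every field is defined by a mask on a uint32_t, a compiler emitting static string data and a runtime reading it only need to agree on these constants.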

I don't think you should feel constrained to follow the current layout ... IMO 
the current one is good for years yet but probably not for decades.
However, I do think that it's more sensible to have pointer, count, hash, and 
flags similar to the current GNUstep layout than to follow Apple (and to bear 
in mind that it's convenient for mutable strings to share a layout with constant 
ones).
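
To make the "pointer, count, hash, and flags" idea concrete, here is one possible shape written as a plain C struct. It is only a sketch: the field names, the choice of a full 32-bit hash, and the reserved word are all assumptions for illustration, not a proposal from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; names and widths are assumptions. */
struct const_string_sketch {
  const void *isa;       /* class pointer, first field of any ObjC object */
  const void *contents;  /* character data (8- or 16-bit units) */
  uint32_t    count;     /* length in characters, not bytes */
  uint32_t    hash;      /* assumed full 32-bit hash instead of 28 bits */
  uint32_t    flags;     /* explicit bits, e.g. wide / owned */
  uint32_t    reserved;  /* spare room for future expansion */
};

static_assert(offsetof(struct const_string_sketch, isa) == 0,
              "isa must be the first field");

/* Returns 1 if the struct is exactly the sum of its fields (no padding). */
static inline int layout_has_no_padding(void)
{
  return sizeof(struct const_string_sketch)
      == 2 * sizeof(void *) + 4 * sizeof(uint32_t);
}
```

On common 32- and 64-bit targets the four 32-bit fields follow the two pointers without gaps, and a mutable string could use the same fields with the owned bit set.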





