Re: New ABI NSConstantString
From: Richard Frith-Macdonald
Subject: Re: New ABI NSConstantString
Date: Sun, 1 Apr 2018 14:06:30 +0100
> On 1 Apr 2018, at 12:21, David Chisnall <address@hidden> wrote:
>
> On 1 Apr 2018, at 11:36, Fred Kiefer <address@hidden> wrote:
>>
>> Wouldn’t the most useful structure be the one we already use for GSString?
>
> That’s certainly a good starting point!
>
>>
>> @interface GSString : NSString
>> {
>> @public
>> GSCharPtr _contents;
>> unsigned int _count;
>
> Is this the number of bytes or the number of characters? I imagine that both
> are useful.
That's the character count.
>> struct {
>> unsigned int wide: 1; // 16-bit characters in string?
>> unsigned int owned: 1; // Set if the instance owns the
>> // _contents buffer
>
> Owned is presumably redundant for constant strings.
Yep. In a constant string you could just consider it a bit reserved for
mutable strings.
>> unsigned int unused: 2;
>> unsigned int hash: 28;
>> } _flags;
>> }
>> @end
>>
>> Of course constant strings won’t require the hidden reference count that
>> comes with all ObjC objects. But apart from that it seems to be a more useful
>> structure. Storing the length with the string should speed up some common
>> operations and 28 bits of hash should still be enough. There are even two
>> unused bits in the flags that could encode the specific hash function.
>
> I’d like to have more than 2 bits spare for future expansion. The current
> NXConstantString structure is now 30 years old, and I think there have been
> several times in the past when it would have been nice to add other things to
> it if we’d had a good way of maintaining compatibility.
>
> This structure does have the advantage that it doesn’t need padding on any
> 32- or 64-bit architectures.
> Do we have any measurements to tell us that 28 bits is enough for the hash?
I don't think so, but with a good hash, 28 bits (about 268 million distinct
values) lets us hold over a hundred million strings efficiently in a
set/dictionary, which seems plenty for now.
However, if the idea is to future-proof things in the ABI, perhaps 28 bits is
not enough.
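
To put a number on that, here is a minimal sketch of what a 28-bit hash field can hold. The names HASH_BITS, HASH_MASK, hash_slots and truncate_hash are made up for illustration and are not GNUstep identifiers; only the 28-bit width comes from the GSString flags quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the width (28) is taken from the quoted struct. */
#define HASH_BITS 28u
#define HASH_MASK ((UINT32_C(1) << HASH_BITS) - 1u)

/* The mask covers exactly the 28 low bits. */
static_assert(HASH_MASK == UINT32_C(0x0FFFFFFF), "28 low bits set");

/* Number of distinct values a 28-bit field can represent: 2^28. */
static inline uint32_t hash_slots(void)
{
  return UINT32_C(1) << HASH_BITS;
}

/* Reduce a full 32-bit hash to the part that fits in the field. */
static inline uint32_t truncate_hash(uint32_t full)
{
  return full & HASH_MASK;
}
```

2^28 is 268,435,456, so collisions forced purely by the field width only become unavoidable beyond that many distinct strings.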
> At some point, I’d like to move the hash implementation for NSString to
> MurmurHash3, which should give better distribution and is very fast on modern
> hardware.
Yes. GNUstep-base has MurmurHash3 support, and perhaps it's time it was made
the default.
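
For reference, a sketch of the 32-bit variant of MurmurHash3 (the x86_32 flavour of the public-domain algorithm) in plain C. This is not the GNUstep-base code; the function and helper names are my own, and which variant, seed and input encoding NSString would use are left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned r)
{
  return (x << r) | (x >> (32u - r));
}

/* Per-block mixing step of MurmurHash3 x86_32. */
static uint32_t scramble(uint32_t k)
{
  k *= UINT32_C(0xcc9e2d51);
  k = rotl32(k, 15);
  k *= UINT32_C(0x1b873593);
  return k;
}

uint32_t murmur3_32(const uint8_t *key, size_t len, uint32_t seed)
{
  assert(key != NULL || len == 0);

  uint32_t h = seed;
  size_t nblocks = len / 4;

  /* Body: consume 4 bytes at a time, read as little-endian words so the
   * result does not depend on host byte order. */
  for (size_t i = 0; i < nblocks; i++)
    {
      const uint8_t *p = key + 4 * i;
      uint32_t k = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
        | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

      h ^= scramble(k);
      h = rotl32(h, 13);
      h = h * 5u + UINT32_C(0xe6546b64);
    }

  /* Tail: the remaining 0 to 3 bytes. */
  const uint8_t *tail = key + 4 * nblocks;
  uint32_t k = 0;
  switch (len & 3u)
    {
      case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
      case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
      case 1: k ^= tail[0];
              h ^= scramble(k);
    }

  /* Finalisation: fold in the length and avalanche the bits. */
  h ^= (uint32_t)len;
  h ^= h >> 16;
  h *= UINT32_C(0x85ebca6b);
  h ^= h >> 13;
  h *= UINT32_C(0xc2b2ae35);
  h ^= h >> 16;
  return h;
}
```

If the result has to fit a narrower field such as the 28 bits discussed above, the caller would mask it down afterwards.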
> I’m also a bit nervous about using C bitfields in static data structures,
> because their layout is ABI dependent (and on some platforms can change
> between compiler versions).
I wasn't aware of that ... it would make sense for your new ABI, when using
individual bits, to have them specified as particular bits rather than as a
bitfield, avoiding the possibility of problems with different compilers.
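
Something along these lines is what I mean by particular bits: a sketch using a plain 32-bit word with explicit masks and shifts. The bit positions (wide in bit 0, owned in bit 1, two reserved bits, hash in the top 28) are an assumption chosen to mirror the order of the quoted bitfield, and all the STR_* names and helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit assignments; the positions are fixed by these constants,
 * not by whatever layout a compiler picks for a C bitfield. */
#define STR_FLAG_WIDE   (UINT32_C(1) << 0)  /* 16-bit characters */
#define STR_FLAG_OWNED  (UINT32_C(1) << 1)  /* instance owns the buffer */
#define STR_HASH_SHIFT  4u                  /* bits 2-3 stay reserved */
#define STR_HASH_MASK   UINT32_C(0x0FFFFFFF)

static_assert((STR_HASH_MASK << STR_HASH_SHIFT) == UINT32_C(0xFFFFFFF0),
              "hash occupies the top 28 bits");

static inline int str_is_wide(uint32_t flags)
{
  return (flags & STR_FLAG_WIDE) != 0;
}

/* Extract the 28-bit hash stored in the flags word. */
static inline uint32_t str_hash(uint32_t flags)
{
  return (flags >> STR_HASH_SHIFT) & STR_HASH_MASK;
}

/* Return flags with the hash field replaced, leaving the low bits alone. */
static inline uint32_t str_set_hash(uint32_t flags, uint32_t hash)
{
  flags &= ~(STR_HASH_MASK << STR_HASH_SHIFT);
  return flags | ((hash & STR_HASH_MASK) << STR_HASH_SHIFT);
}
```

A compiler emitting static string data then only needs to agree on the width and byte order of a 32-bit integer.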
I don't think you should feel constrained to follow the current layout ... IMO
the current one is good for years yet but probably not for decades.
However, I do think that it's more sensible to have pointer, count, hash, and
flags similar to the current GNUstep layout than to follow Apple (and to bear
in mind that it's convenient for mutable strings to share a layout with constant
ones).
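
As a rough illustration of why a shared layout is convenient, here is a sketch in plain C of a record with the same shape as the GSString ivars quoted above (class pointer, contents pointer, count, flags), plus one accessor that would serve constant and mutable strings alike. The type and function names are invented, the contents union is only modelled on GSCharPtr, and putting the "wide" flag in bit 0 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for GSCharPtr: one pointer viewed as 8-bit or 16-bit data. */
typedef union {
  const uint16_t *u;
  const uint8_t  *c;
} chars_ptr;

/* Hypothetical shared record for constant and mutable strings. */
typedef struct {
  const void *isa;       /* class pointer, as in any ObjC object */
  chars_ptr   contents;  /* character buffer */
  uint32_t    count;     /* number of characters, not bytes */
  uint32_t    flags;     /* bit 0: wide (assumed position) */
} string_layout;

#define LAYOUT_FLAG_WIDE UINT32_C(1)

/* Two pointers plus two 32-bit words: no padding on the usual
 * 32- and 64-bit targets. */
static_assert(sizeof(string_layout) == 2 * sizeof(void *) + 8,
              "layout has no padding");

/* Fetch one character; works the same whoever owns the buffer.
 * Returns 0 for an out-of-range index. */
static inline uint16_t layout_char_at(const string_layout *s, uint32_t i)
{
  if (i >= s->count)
    return 0;
  return (s->flags & LAYOUT_FLAG_WIDE) ? s->contents.u[i] : s->contents.c[i];
}
```

Code that only reads characters never has to ask whether the buffer came from the compiler or from the heap.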