lilypond-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Refactor get/set_property to take the item as first argument (issue


From: David Kastrup
Subject: Re: Refactor get/set_property to take the item as first argument (issue 573670043 by address@hidden)
Date: Fri, 10 Apr 2020 13:07:30 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Han-Wen Nienhuys <address@hidden> writes:

> On Thu, Apr 9, 2020 at 7:45 PM <address@hidden> wrote:
>>
>> On 2020/04/09 17:07:46, hanwenn wrote:
>> > I'm curious about these optimizations. Can you say more?
>>
>> Properties are currently stored in alists.  They can be stored in
>> vectors instead.  Give all grob properties a number, then have an array
>> per grob type mapping this number to an index into an array.  The first
>> number can be memoized for literal property names, turning a property
>> lookup into two indexing operations.
>>
>> That's the gist.  For Scheme code, the memoization could be replaced by
>> macro compilation, but Guilev2 byte compilation of course is sort of a
>> roadblock here, too.
>
> I think it is a great idea to store properties in vectors rather than
> alists. They take up less space, and lookups are faster.
>
> Your specific approach sounds very laborious, and I see many
> downsides, for example:
>
> * this will make it hard to add new grob-properties at runtime

It won't.  The likely long-term restriction will be that you cannot set
new properties on items (grobs/music/stream events) that existed already
at the time the new property is being defined.  Our current way of
"adding new grob-properties at runtime" is poking around in undocumented
internals in a manner that bleeds into unrelated source files.  It is
possible to set up a backward-compatibility scheme for this temporarily
(basically, before an error gets thrown, the respective
object-properties are checked, and a proper registration of the property
is performed in case those properties have been set by the user, also
meaning that they can be prevented from bleeding into the next session
as long as they are used at least once).

Adding a proper interface for registering user-added properties in a
documented and dependable manner (rather than poking around in the
internals of what happens to be the current implementation) is a
semi-unrelated project that would be independently desirable.

> * memory use: each SCM value takes 8 bytes, and there are 416 grob
> properties today, for a total of 3328 bytes. Morgenlied is single page
> of music and has 3704 grobs. So storage for the vectors (which will be
> mostly empty) will take up 12M / page. This likely will result in
> doubling the memory use of LilyPond.

You haven't understood the scheme.  A grob only contains the properties
seminal to its type.  That's why there is a double indirection.  A full
array of 416 entries is needed per grob _type_, not per grob.

> * Let's not put another roadblock in front of GUILE v2 adoption.

The Scheme angle can be tackled differently and is not part of the first
phase.

> Have you considered building a custom hash table for the properties?

I did.

> We could construct a table based on probing (so we don't need buckets
> which would reintroduce linked lists). Since symbols have unique
> addresses, we could do the hashing by casting SCM to uintptr and
> hashing a uint64 down to the table size. This means we'll still do the
> uint64 -> index on each lookup, but it's likely much faster than
> walking alists, and would cut memory for properties in half.

Hashing outside of a closed set still requires storing the actual symbol
for corroboration.  Hashing inside of a closed set amounts to the
first-stage lookup in my scheme, except that a small number guaranteed
to be unique comes out.  This lookup can be memoised and will work
permanently as opposed to hashing in a growing set.

> We could experiment with this without requiring a giant rewrite of the
> source code.

What you call a "giant rewrite" is basically this patch.  It's a purely
syntactical patch only extending rather than reducing possible
approaches and perfectly compatible with different implementations.  The
work for that patch is already done, the impact is similar to what
happened when changing around the syntax of unsmob a few times in the
past.  It will allow for prolonged work on possibly different
implementations without having to constantly deal with merge conflicts.

It does not depend on picking a particular implementation.  You
expressed interest in what I was planning to do with the opportunity
this patch opens, I answered.  I do not, however, see the point in
discussing and dissing (partially) unwritten code based on an incomplete
understanding in a handwaving discussion.

It is more efficient to discuss objections of the "you haven't thought
this through" kind on working code.

Nothing in this patch here enforces picking a particular route.  It is
just paving the ground for more work, and it does so in a manner that
isn't so much of an eye sore that it can only be tolerated in
expectation of immense future gains.

-- 
David Kastrup
My replies have a tendency to cause friction.  To help mitigating
damage, feel free to forward problematic posts to me adding a subject
like "timeout 1d" (for a suggested timeout of 1 day) or "offensive".



reply via email to

[Prev in Thread] Current Thread [Next in Thread]