guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: File search progress: database review and question on triggers


From: Pierre Neidhardt
Subject: Re: File search progress: database review and question on triggers
Date: Wed, 12 Aug 2020 22:43:37 +0200

Julien Lepiller <julien@lepiller.eu> writes:

> Have you tried something more structured? I have some code for creating a 
> binary search tree and even compressing/decompressing strings with huffman, 
> as well as code to serialize all that (my deserialization is in Java though, 
> so not very useful to you): https://framagit.org/nani-project/nani-website
>
> See modules/nani/trie.scm for instance.
>
> The idea is to have a binary search tree whose keys are filenames and value 
> is a pointer in the file to a structure that holds data for this filerame 
> (packages and versions of guix for instance). It's super fast to look it up, 
> because you don't load the whole file, you only seek to the right position. 
> It's a lot longer to build, and not so easy to update though.

Assuming we'd have only 1 Guix generation per file, I'm not sure a trie
would help because we _always_ need to search _all_ file paths, so in
all cases we've got some 10,000+ entries to load in memory and loop over
them.

The total number of entries is the bottleneck here, both for the
database load and the individual search queries.

An obvious optimization is to memoize the database load.  This has a
significant memory cost though.

The trie could be helpful for multiple Guix generations in the same
database, but I'm not sure it warrant the increased complexity.

Thoughts?

-- 
Pierre Neidhardt
https://ambrevar.xyz/

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]