emacs-devel


From: Dmitry Gutov
Subject: Re: sqlite3
Date: Tue, 14 Dec 2021 20:32:59 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 14.12.2021 19:43, Lars Ingebrigtsen wrote:
Dmitry Gutov <dgutov@yandex.ru> writes:

But a "proper" database might give other advantages like a faster
search in the loaded data (unless it's already "indexed" by using hash
tables everywhere where they could be used).

Or being able to read the data without loading the whole file into
memory. Which, for certain scenarios and data sets, might be a bigger
advantage than faster writes.
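A minimal sketch of that point, in Python rather than Emacs Lisp so that it can be run standalone. The table name `store`, its columns, and the `lookup` helper are hypothetical and not part of multisession.el. It only illustrates fetching one value through the primary-key index instead of reading every stored value into the program first:

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk store: one row per key (schema is an assumption).
path = os.path.join(tempfile.mkdtemp(), "store.sqlite")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE store (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO store VALUES (?, ?)",
    ((f"key-{i}", f"value-{i}") for i in range(10_000)),
)
conn.commit()

def lookup(conn, key):
    """Return the value stored under KEY, or None if there is no such key.

    The PRIMARY KEY constraint gives sqlite an index on `key`, so only the
    matching row is handed back to Python.
    """
    row = conn.execute(
        "SELECT value FROM store WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None
```

With a single file holding a hash table of all values, the equivalent lookup would first have to read and parse the whole file.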

Here's a matrix of advantages and disadvantages to three approaches:
sqlite, one-file-per-value, and
one-file-with-a-hash-table-with-several/all-values:
                           sqlite  files  hash
Read/write value speed     ⚄       ⚅      ⚀
Read/write value mem       ⚅       ⚄      ⚀
List all values speed      ⚅       ⚀      ⚅
List all values mem        ⚃       ⚁      ⚃
Ease of moving around      ⚄       ⚀      ⚅

I'm not 100% sure how to interpret this (is a higher value for "mem" better or worse?), but it seems that, at least for the original scenario of having large data sets, sqlite might still be optimal.

But it turns out that sqlite3 is actually slower for this particular use
case than just writing the data to a file (i.e., using the file system
as the database; one file per value).  So multisession.el now offers two
backends (`files' and `sqlite'), and defaults to `files'.

Does the latter scenario use as many files as you do 'COMMIT' in the
former scenario?

No, if you (cl-incf (multisession-value foo)) you'll get one COMMIT per
time, but there'll only be one foo.value file (at a time).
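As a rough illustration of the one-file-per-value scheme, here is a sketch in Python (not Emacs Lisp, and not how multisession.el is actually implemented). The helper names `write_value`, `read_value`, and `incf` are hypothetical, and writing through a temporary file followed by a rename is an assumption made for the example. Each increment rewrites the value, yet only one `foo.value` file exists afterwards:

```python
import os
import tempfile

def write_value(directory, name, value):
    """Write VALUE to <directory>/<name>.value via a temp file and a rename."""
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(repr(value))
    # os.replace swaps the new file in over any existing one in a single step.
    os.replace(tmp, os.path.join(directory, name + ".value"))

def read_value(directory, name, default=0):
    """Read an integer value back, or return DEFAULT if the file is missing."""
    try:
        with open(os.path.join(directory, name + ".value")) as f:
            return int(f.read())
    except FileNotFoundError:
        return default

def incf(directory, name):
    """Increment the stored integer: one read and one full rewrite per call."""
    value = read_value(directory, name) + 1
    write_value(directory, name, value)
    return value
```

Calling `incf(d, "foo")` three times in a fresh directory `d` performs three writes but leaves a single file, `foo.value`, behind.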

OK, but it's still the same number of writes, more or less? IO is the slow part of most programs, and an SQL database might have to update multiple places (e.g. the data and the index) rather than do one smooth write.

Might also depend on the size of the write (how big the values are).

Speaking of the latter scheme, I might be missing some details, but sqlite should provide better atomicity guarantees in the case of being interrupted mid-write. Like, if we have one-file-per-value, then the total list of keys must live somewhere, and the two can get desynchronized.
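To make the atomicity concern concrete, here is a small Python sketch (again not Emacs Lisp). The two-table layout `keys`/`vals` and the `store` helper are hypothetical, and the "interruption" is simulated with an exception, which is a weaker event than a real crash or power loss. It shows that when the key list and the value are updated inside one sqlite transaction, a failure between the two steps leaves neither change behind:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keys (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE vals (name TEXT PRIMARY KEY, value TEXT)")
conn.commit()

def store(conn, name, value, fail_midway=False):
    """Record NAME in the key list and store VALUE, as one transaction."""
    # Used as a context manager, the connection commits if the block
    # finishes and rolls back if an exception escapes it.
    with conn:
        conn.execute("INSERT OR REPLACE INTO keys VALUES (?)", (name,))
        if fail_midway:
            raise RuntimeError("simulated interruption")
        conn.execute(
            "INSERT OR REPLACE INTO vals VALUES (?, ?)", (name, value)
        )

store(conn, "foo", "1")
try:
    store(conn, "bar", "2", fail_midway=True)
except RuntimeError:
    pass  # the partial insert of "bar" into `keys` has been rolled back
```

After this runs, both tables contain only `foo`. With one file per value plus a separately maintained key list, the same interruption could leave `bar` in the list with no value file, or the other way round, unless extra care is taken.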


