[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[bug-recutils] Additional issues to think about for the indexes support
From: |
Jose E. Marchesi |
Subject: |
[bug-recutils] Additional issues to think about for the indexes support |
Date: |
Tue, 15 May 2012 22:08:08 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux) |
Hi Michał.
These are some points I would like you to consider for the indexes
support.
- The index file structure must allow to have indexes also for the
"default" (or unnamed) record set, if any.
- Does it make sense to support secondary indexes, additionally to
primary (clustered) indexes?
- The indexes must be dense, because nothing guarantees the contents in
the recfile to be ordered in any way. Sparse indexes could be used
along with %sort:... but it would be too dangerous I think.
- Since most (all?) of the keys will be strings, it is good to consider
to apply some prefix compression to the indexes algorithms to make
them more efficient.
- 1.6 (which will be out in a couple of week) will be shipping support
for composite primary keys, i.e. keys featuring more than one field,
such as:
%rec: Transaction
%key: Provider,Consumer
It follows that the indexes must support composite keys. This is
usually achieved by using a lexicographic order, such as:
(a1, a2, ...) < (b1, b2, ...) IFF a1 < b2 OR (a1 = b1 AND a2 < b2) OR ...
- We must determine which algorithm would be better for our ordered
indexes: b-tree or b+-tree (the second is probably a better
candidate).
- I would not implement hash indexes, because they can't be used with
ranges, only with equality. Ordered indexes are much better for this
application IMO. Also, b-tree indexes can be used with prefixes of
the key, so the index can be used in much more selection expressions.
- The indexes can also be used in sorting operations, where the sorting
criteria involves an index.
--
Jose E. Marchesi http://www.jemarch.net
GNU Project http://www.gnu.org
- [bug-recutils] Additional issues to think about for the indexes support,
Jose E. Marchesi <=