Re: File search progress: database review and question on triggers
From: Pierre Neidhardt
Subject: Re: File search progress: database review and question on triggers
Date: Wed, 12 Aug 2020 21:10:08 +0200
I've done some benchmarking.
1. I tried to fine-tune the SQL a bit:
- Open/close the database only once for the whole indexing.
- Use "insert" instead of "insert or replace".
- Use numeric ID as key instead of path.
Result: Still around 15-20 minutes to build. Switching to numeric
indices shrank the database by half.
2. I've tried with the following naive 1-file-per-line format:
--8<---------------cut here---------------start------------->8---
"/gnu/store/97p5gvb7jglmn9jpgwwf5al1798wi61f-acl-2.2.53//share/man/man5/acl.5.gz"
"/gnu/store/97p5gvb7jglmn9jpgwwf5al1798wi61f-acl-2.2.53//share/man/man3/acl_add_perm.3.gz"
"/gnu/store/97p5gvb7jglmn9jpgwwf5al1798wi61f-acl-2.2.53//share/man/man3/acl_calc_mask.3.gz"
...
--8<---------------cut here---------------end--------------->8---
Result: Takes between 2 and 20 minutes to complete, and the result is
32 MiB in size. (I don't know why the timing varies.)
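For reference, a file in this format could be produced roughly as
follows. This is only a sketch: `write-textual-database' is a
hypothetical name, and it assumes the quoted lines come from `write',
which prints a string together with its double quotes.

```scheme
;; Hypothetical helper: write one quoted path per line to PORT.
(define (write-textual-database paths port)
  (for-each (lambda (path)
              ;; `write' emits the string in double quotes.
              (write path port)
              (newline port))
            paths))
```

It would be called with something like
(call-with-output-file %textual-db
  (lambda (port) (write-textual-database paths port))),
where `paths' is the list of file names collected during indexing.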
A string-contains filter takes less than 1 second.
A string-match (regex) search takes some 3 seconds (Ryzen 5 @ 3.5
GHz). I'm not sure if we can go faster. I need to measure the time
SQL takes for a regexp match.
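The two kinds of search can be sketched like this, assuming the
database is already loaded as a list of strings. The names
`search-substring' and `search-regexp' are made up for illustration;
the regexp is compiled once, outside the filter loop.

```scheme
(use-modules (ice-9 regex))

;; Keep the entries that contain NEEDLE as a plain substring.
(define (search-substring entries needle)
  (filter (lambda (entry) (string-contains entry needle))
          entries))

;; Keep the entries that match the regular expression PATTERN.
(define (search-regexp entries pattern)
  (let ((rx (make-regexp pattern)))     ; compile once, reuse for each entry
    (filter (lambda (entry) (regexp-exec rx entry))
            entries)))
```

For example, (search-regexp entries "man3/acl_.*\\.gz") would keep only
the entries whose path matches that pattern.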
Question: Any idea how to load the database as fast as possible? I
tried the following; it takes 1.5 s on my machine:
--8<---------------cut here---------------start------------->8---
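(use-modules (ice-9 textual-ports))     ; provides `get-line'

(define (load-textual-database)
  ;; Read %textual-db line by line and return the lines as a list.
  ;; Lines are consed onto the front, so the list is in reverse file order.
  (call-with-input-file %textual-db
    (lambda (port)
      (let loop ((line (get-line port))
                 (result '()))
        (if (string? line)              ; `get-line' returns EOF at the end
            (loop (get-line port) (cons line result))
            result)))))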
--8<---------------cut here---------------end--------------->8---
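One variant that may be worth timing against the loop above is to read
the whole file in one call and split it on newlines. This is an
unmeasured sketch, not a claim that it is faster:
`load-textual-database/slurp' is a hypothetical name, and the file is
passed as an argument here instead of using %textual-db.

```scheme
(use-modules (ice-9 textual-ports))

;; Hypothetical variant: slurp FILE and split it into lines.
(define (load-textual-database/slurp file)
  (call-with-input-file file
    (lambda (port)
      ;; Splitting leaves an empty string after the final newline;
      ;; drop empty lines so only real entries remain.
      (filter (lambda (line) (not (string-null? line)))
              (string-split (get-string-all port) #\newline)))))
```

Unlike the loop, this returns the lines in file order.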
Cheers!
--
Pierre Neidhardt
https://ambrevar.xyz/