emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [OFFTOPIC] size of issue tracker


From: Tassilo Horn
Subject: Re: [OFFTOPIC] size of issue tracker
Date: Mon, 13 May 2019 20:59:51 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Eli Zaretskii <address@hidden> writes:

>> > Why does Forge insist downloading the entire DB to the local
>> > machine?  That sounds like something that wouldn't scale well.
>> 
>> I don't know why it does so, but it sounds like a good idea to me, so
>> you can later use them when you're offline.
>
> If the DB is small enough, sure.  It sounds like somehow it isn't the
> case with Forge and GitLab.

My database (~/.emacs.d/forge-database.sqlite) is currently 38 MB, and
it contains the contents of 4 different projects from which gtk with its
1900 issues and 900 MRs is by far the largest one.  There's a repository
table from which you can join your way to the issues, merge/pull
requests, their comments, labels, and assignees.

That's the current schema if you are interested in details.

--8<---------------cut here---------------start------------->8---
sqlite> .schema
CREATE TABLE repository (class NOT NULL, id NOT NULL PRIMARY KEY, forge_id , 
forge , owner , name , apihost , githost , remote , sparse_p , created , 
updated , pushed , parent , description , homepage , default_branch , 
archived_p , fork_p , locked_p , mirror_p , private_p , issues_p , wiki_p , 
stars , watchers , assignees DEFAULT eieio_unbound, forks DEFAULT 
eieio_unbound, issues DEFAULT eieio_unbound, labels DEFAULT eieio_unbound, 
revnotes DEFAULT eieio_unbound, pullreqs DEFAULT eieio_unbound);
CREATE TABLE assignee (repository NOT NULL, id NOT NULL PRIMARY KEY, login , 
name , forge_id , FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE 
CASCADE);
CREATE TABLE fork (parent NOT NULL, id NOT NULL PRIMARY KEY, owner , name , 
FOREIGN KEY (parent) REFERENCES repository (id) ON DELETE CASCADE);
CREATE TABLE issue (class NOT NULL, id NOT NULL PRIMARY KEY, repository , 
number , state , author , title , created , updated , closed , unread_p , 
locked_p , milestone , body , assignees DEFAULT eieio_unbound, cards DEFAULT 
eieio_unbound, edits DEFAULT eieio_unbound, labels DEFAULT eieio_unbound, 
participants DEFAULT eieio_unbound, posts DEFAULT eieio_unbound, reactions 
DEFAULT eieio_unbound, timeline DEFAULT eieio_unbound, FOREIGN KEY (repository) 
REFERENCES repository (id) ON DELETE CASCADE);
CREATE TABLE issue_assignee (issue NOT NULL, id NOT NULL, FOREIGN KEY (issue) 
REFERENCES issue (id) ON DELETE CASCADE);
CREATE TABLE issue_label (issue NOT NULL, id NOT NULL, FOREIGN KEY (issue) 
REFERENCES issue (id) ON DELETE CASCADE, FOREIGN KEY (id) REFERENCES label (id) 
ON DELETE CASCADE);
CREATE TABLE issue_post (class NOT NULL, id NOT NULL PRIMARY KEY, issue , 
number , author , created , updated , body , edits DEFAULT eieio_unbound, 
reactions DEFAULT eieio_unbound, FOREIGN KEY (issue) REFERENCES issue (id) ON 
DELETE CASCADE);
CREATE TABLE label (repository NOT NULL, id NOT NULL PRIMARY KEY, name , color 
, description , FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE 
CASCADE);
CREATE TABLE notification (class NOT NULL, id NOT NULL PRIMARY KEY, repository 
, forge , reason , unread_p , last_read , updated , title , type , topic , url 
, FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE);
CREATE TABLE pullreq (class NOT NULL, id NOT NULL PRIMARY KEY, repository , 
number , state , author , title , created , updated , closed , merged , 
unread_p , locked_p , editable_p , cross_repo_p , base_ref , base_repo , 
head_ref , head_user , head_repo , milestone , body , assignees DEFAULT 
eieio_unbound, cards DEFAULT eieio_unbound, commits DEFAULT eieio_unbound, 
edits DEFAULT eieio_unbound, labels DEFAULT eieio_unbound, participants DEFAULT 
eieio_unbound, posts DEFAULT eieio_unbound, reactions DEFAULT eieio_unbound, 
review_requests DEFAULT eieio_unbound, reviews DEFAULT eieio_unbound, timeline 
DEFAULT eieio_unbound, FOREIGN KEY (repository) REFERENCES repository (id) ON 
DELETE CASCADE);
CREATE TABLE pullreq_assignee (pullreq NOT NULL, id NOT NULL, FOREIGN KEY 
(pullreq) REFERENCES pullreq (id) ON DELETE CASCADE);
CREATE TABLE pullreq_label (pullreq NOT NULL, id NOT NULL, FOREIGN KEY 
(pullreq) REFERENCES pullreq (id) ON DELETE CASCADE, FOREIGN KEY (id) 
REFERENCES label (id) ON DELETE CASCADE);
CREATE TABLE pullreq_post (class NOT NULL, id NOT NULL PRIMARY KEY, pullreq , 
number , author , created , updated , body , edits DEFAULT eieio_unbound, 
reactions DEFAULT eieio_unbound, FOREIGN KEY (pullreq) REFERENCES pullreq (id) 
ON DELETE CASCADE);
CREATE TABLE revnote (class NOT NULL, id NOT NULL PRIMARY KEY, repository , 
"commit" , file , line , author , body , FOREIGN KEY (repository) REFERENCES 
repository (id) ON DELETE CASCADE);
--8<---------------cut here---------------end--------------->8---

Fetching all new 19 issues & 10 MRs since my initial 2-hour fetch
yesterday took about 3 minutes.

I think the bottleneck here is the current GitLab API.  If I find some
time, I can re-run the procedure with an even bigger GitHub project.
I've read GitHub has a more efficient GraphQL API for this stuff, and
AFAIK, GitLab is or will implement something similar, so this slowness
might be resolved in the mid-term future.

If you have a preference for some project hosted on GitHub (e.g.,
because it has about the same numbers of issues as we have), drop me a
mail.

Bye,
Tassilo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]