bug#65720: Guile-Git-managed checkouts grow way too much
From: Ludovic Courtès
Subject: bug#65720: Guile-Git-managed checkouts grow way too much
Date: Tue, 19 Sep 2023 00:35:28 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Ludovic Courtès <ludo@gnu.org> skribis:
> As reported by Tobias on IRC (in the context of ‘hpcguix-web’),
> checkouts managed by Guile-Git appear to grow beyond reason. As an
> example, here’s the same ‘.git’ managed with Guile-Git and with Git:
>
> $ du -hs ~/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
> 6.7G /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
> $ du -hs .git
> 517M .git
More data… The biggest file in that repo is a pack that was created
when that repo was first cloned (Aug. 2021):
--8<---------------cut here---------------start------------->8---
$ du /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/* |sort -k1 -n| tail -3
44272 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-3c2f1857501b01c321bc67ba1f30704deb9e18e9.pack
47272 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-30d5b35ad14a8398464e49e224811b162f673d66.pack
191492 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
$ ls -l /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
-r--r--r-- 1 ludo users 196079671 Aug 9 2021 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
$ ls -ld /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/config
-rw-r--r-- 1 ludo users 266 Aug 9 2021 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/config
--8<---------------cut here---------------end--------------->8---
The pack starts with things from Aug. 2021:
--8<---------------cut here---------------start------------->8---
$ git show-index < pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.idx|sort -k1 -n|head -3
12 30289f4d4638452520f52c1a36240220d0d940ff (852d8cb3)
927 d7ffc535c52f49177a8e5553569cdb1e321b5bc6 (2007c5d0)
1800 0a379de3249d5e9ff66fb404f7e5aa8ce2cb3d24 (b1e69aa4)
$ git show 30289f4d4638452520f52c1a36240220d0d940ff
commit 30289f4d4638452520f52c1a36240220d0d940ff
Author: Milkey Mouse <milkeymouse@meme.institute>
Date: Sun Aug 8 22:15:40 2021 -0700
[…]
--8<---------------cut here---------------end--------------->8---
… and at the bottom (large offsets) it contains very old blobs from the
Nix repo that somehow made it here.
I figured we still had a ‘nix’ branch from the early days, that contains
the history of Nix. I’ve now removed it, which helps a bit:
--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(git)
scheme@(guile-user)> ,t (clone "https://git.savannah.gnu.org/git/guix.git" "/tmp/guix")
$5 = #<git-repository 91a7b0>
;; 600.534529s real time, 435.260926s run time. 0.000000s spent in GC.
scheme@(guile-user)> ,t (clone "https://git.savannah.gnu.org/git/guix.git" "/tmp/guix-after-removing-nix-branch")
$6 = #<git-repository 4465a50>
;; 420.321511s real time, 398.772963s run time. 0.000000s spent in GC.
--8<---------------cut here---------------end--------------->8---
… and more importantly:
--8<---------------cut here---------------start------------->8---
$ du -hs /tmp/guix/.git
373M /tmp/guix/.git
$ du -hs /tmp/guix-after-removing-nix-branch/.git
362M /tmp/guix-after-removing-nix-branch/.git
--8<---------------cut here---------------end--------------->8---
Anyway, what seems to happen is that every pull (every call to
‘remote-fetch’) creates a new pack (see ‘git_fetch_download_pack’ in
libgit2), which becomes inefficient in the long run (lots of small
poorly-compressed packs). That’s at least one possible explanation.
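If that explanation holds, a long-lived checkout should contain many
‘.pack’ files. Here is a rough sketch for checking that. ‘count_packs’
is a hypothetical helper, not part of Guile-Git or Guix, and it
expects the path of a ‘.git’ directory as its argument:

```shell
# Hypothetical helper (not from Guile-Git or Guix): print the number of
# pack files under the given .git directory.  If each fetch leaves a
# new pack behind, this number should grow over time.
count_packs() {
    # Only *.pack files are counted; the matching *.idx files are ignored.
    # A missing directory yields 0 because find's error is discarded.
    find "$1/objects/pack" -maxdepth 1 -type f -name '*.pack' 2>/dev/null \
        | wc -l | tr -d ' '
}

# Possible usage, assuming stock Git is installed and GIT_DIR_PATH is set
# to the .git directory of the checkout under study:
#   count_packs "$GIT_DIR_PATH"
#   git --git-dir="$GIT_DIR_PATH" count-objects -v   # see "packs:" and "size-pack:"
#   git --git-dir="$GIT_DIR_PATH" repack -a -d       # consolidate into a single pack
#   count_packs "$GIT_DIR_PATH"
```

Comparing ‘du -hs’ on the ‘.git’ directory before and after the
‘repack’ step would show how much of the growth comes from pack
fragmentation, as opposed to something else.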
To be continued…
Ludo’.