bug#51787: Disk performance on ci.guix.gnu.org
From: Bengt Richter
Subject: bug#51787: Disk performance on ci.guix.gnu.org
Date: Wed, 22 Dec 2021 00:20:24 +0100
User-agent: Mutt/1.10.1 (2018-07-13)
Hi Ricardo,
TL;DR: re: "Any ideas?" :)
Read this [0], and consider how file systems may be
interacting with SSD wear-leveling algorithms.
Are some file systems dependent on successful speculative
transaction continuations, while others slow down waiting
for signs that an SSD controller has committed one of ITS
transactions, e.g. in special cases where the user or
kernel file system wants to be sure metadata is
written/journaled for fs structural integrity, but maybe
cares less about data?
I guess this difference might show up in copying a large
file over-writing the same target file (slower) vs copying
to a series of new files (faster).
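A rough way to check that guess might be a micro-benchmark like the
following sketch. File names, sizes, and round counts here are
illustrative, not from any real test; the fsync after each write is
meant to force the controller to actually commit.

```python
# Hypothetical micro-benchmark: overwrite one target file repeatedly
# vs. write a fresh file each round. All names/sizes are made up.
import os
import time

SIZE = 4 * 1024 * 1024   # 4 MiB per write (illustrative)
ROUNDS = 4
buf = os.urandom(SIZE)

def timed(label, fn):
    t0 = time.monotonic()
    fn()
    print(f"{label}: {time.monotonic() - t0:.2f}s")

def overwrite_same():
    for _ in range(ROUNDS):
        with open("target.bin", "wb") as f:      # same target each round
            f.write(buf)
            os.fsync(f.fileno())                 # force commit to media

def write_new():
    for i in range(ROUNDS):
        with open(f"target-{i}.bin", "wb") as f:  # fresh file each round
            f.write(buf)
            os.fsync(f.fileno())

timed("overwrite same file", overwrite_same)
timed("series of new files", write_new)
```

On a wear-leveled SSD the two cases may land in quite different
controller code paths, which is the point of comparing them.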
What happens if you use a contiguous file as swap space?
Or, if you use anonymous files as user data space buffers,
passing them to wayland as file handles, per its protocol,
can you do better than ignoring SSD controllers and/or
storage hardware altogether?
Reference [0] is from 2013, so probably much has happened
since then, but the paper mentions the following (which has
probably not gotten better), referring to trade secrets
that give one manufacturer the ability to produce
longer-lasting SSDs more cheaply than others ...
--8<---------------cut here---------------start------------->8---
This means that the SSD controller is dedicated to a
single brand of NAND, and it means that the SSD maker
can’t shop around among NAND suppliers for the best price.
Furthermore, the NAND supplier won’t share this
information unless it believes that there is some compelling
reason to work with the SSD manufacturer. Since there are
hundreds of SSD makers it’s really difficult to get these
companies to pay attention to you! The SSD manufacturers
that have this kind of relationship with their flash
suppliers are very rare and very special.
--8<---------------cut here---------------end--------------->8---
Well, maybe you will have to parameterize your file system
tuning with manufacturer ID and SSD controller firmware
version ;/
Mvh, Bengt
[0]
https://www.snia.org/sites/default/files/SSSITECHNOTES_HowControllersMaximizeSSDLife.pdf
On +2021-12-21 18:26:03 +0100, Ricardo Wurmus wrote:
> Today we discovered a few more things and discussed them on IRC. Here’s
> a summary.
>
> /var/cache sits on the same storage as /gnu. We mounted the 5TB ext4
> file system that’s hosted by the SAN at /mnt_test and started copying
> over /var/cache to /mnt_test/var/cache. Transfer speed was considerably
> faster (not *great*, but reasonably fast) than the copy of
> /gnu/store/trash to the same target.
>
> This confirmed our suspicions that the problem is not with the storage
> array but due to the fact that /gnu/store/trash (and also /gnu/store)
> is an extremely large, flat directory. /var/cache is not.
>
> Here’s what we do now: continue copying /var/cache to the SAN, then
> remount to serve substitutes from there. This removes some pressure
> from the file system as it will only be used for /gnu. We’re
> considering dumping the file system completely (i.e. reinstalling the
> server), thereby emptying /gnu, but leaving the stash of built
> substitutes in /var/cache (hosted from the faster SAN).
>
> We could take this opportunity to reformat /gnu with btrfs, which
> performs quite a bit more poorly than ext4 but would be immune to
> fragmentation. It’s not clear that fragmentation matters here. It
> could just be that the problem is exclusively caused by having these
> incredibly large, flat /gnu/store, /gnu/store/.links, and
> /gnu/store/trash directories.
>
> A possible alternative for this file system might also be XFS, which
> performs well when presented with unreasonably large directories.
>
> It may be a good idea to come up with realistic test scenarios that we
> could test with each of these three file systems at scale.
>
> Any ideas?
>
> --
> Ricardo
>
>
>
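Re "realistic test scenarios": one possible scenario generator for the
flat-directory question is to create the same number of files flat vs.
sharded into subdirectories and time lookups. Counts, names, and the
256-way fan-out below are all illustrative assumptions, not a claim
about what /gnu/store should look like.

```python
# Hypothetical scenario: N files in one flat directory vs. the same N
# sharded across 256 subdirectories, then time per-file stat() lookups.
import os
import tempfile
import time

N = 5000   # illustrative; a real /gnu/store test would need far more

def populate_flat(root):
    for i in range(N):
        open(os.path.join(root, f"f{i:05d}"), "w").close()

def populate_sharded(root):
    for i in range(N):
        shard = os.path.join(root, f"{i % 256:02x}")   # 256-way fan-out
        os.makedirs(shard, exist_ok=True)
        open(os.path.join(shard, f"f{i:05d}"), "w").close()

def time_lookups(root, path_of):
    t0 = time.monotonic()
    for i in range(N):
        os.stat(path_of(root, i))
    return time.monotonic() - t0

with tempfile.TemporaryDirectory() as flat, \
     tempfile.TemporaryDirectory() as sharded:
    populate_flat(flat)
    populate_sharded(sharded)
    t_flat = time_lookups(
        flat, lambda r, i: os.path.join(r, f"f{i:05d}"))
    t_shard = time_lookups(
        sharded,
        lambda r, i: os.path.join(r, f"{i % 256:02x}", f"f{i:05d}"))
    print(f"flat: {t_flat:.3f}s  sharded: {t_shard:.3f}s")
```

Run the same script on ext4, btrfs, and XFS mounts to compare how each
handles the directory sizes in question; at realistic scale the flat
case is where ext4's htree behavior would show up.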
(sorry, the top-post grew)
--
Regards,
Bengt Richter