[lmi] More robust multibooting
From: Greg Chicares
Subject: [lmi] More robust multibooting
Date: Tue, 10 Sep 2019 23:46:03 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
Vadim--I'm writing this mainly as documentation. I do ask a couple
of questions, which searching for the word 'question' will find. But
feel free to comment on anything else if you're so inclined.
For quite a while, I had been managing a multiboot system as follows:
- set up a dedicated boot partition
- mount that as /boot in every installation's /etc/fstab
- when debian issues a new 'stable' release, create a new partition
for it, and do a fresh installation from scratch there
I don't worry about disk space: I've been using this computer for
almost four years, and haven't yet managed to consume even a hundred
gigabytes, which would be seven percent of the 1500 available. I
figured it would be most robust to leave a stable old installation
in place when installing a new one, in case the new one doesn't work.
Then I tried to install fedora (so that I'd have an optional system
that's more similar to the RHEL server in the office). I figured
nothing could go wrong: I've had OpenBSD installed for years, and
fedora's less dissimilar from debian. But in retrospect I can see
that OpenBSD's dissimilarity was a virtue: I had to chainload it in
grub, so it couldn't mess up grub.
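For reference, a chainload stanza of the sort I mean looks roughly like
this (the partition number here is illustrative, not my actual layout):

```
menuentry 'OpenBSD (chainloaded)' {
  insmod part_msdos
  set root='hd0,msdos4'
  chainloader +1
}
```

Because grub merely hands control to the other system's own bootloader,
the two can't interfere with each other.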
You see where this is going. When I install a new GNU/Linux system,
by default it wants to take ownership of grub. Long story short, my
debian system became unbootable. I tried to rescue it with several
different live CDs (debian; 'grub rescue disk'; 'rescatux'), but
none of those actually worked in this case--at worst, they failed
utterly, and at best, they booted into a partly-working system.
Now, since debian had promoted 'buster' to 'stable', I figured it
was time to upgrade anyway, so I installed 'buster' on its own
partition. When I chrooted into the old 'stretch' system, it mostly
worked, but not quite: notably, my trackball didn't work, and it's
rather difficult to use xfce with no pointing device.
Then I added a new installation of 'stretch', figuring that at
worst I could just 'dd' the old 'stretch' system onto it. By this
time, I had gathered that a shared /boot partition was part of the
problem, so I installed this new system without any bootloader.
That worked just fine.
But by now I had dug so deeply into grub that I wanted to find out
how to make it "just work". Here's the answer I came up with:
$ cat /etc/grub.d/40_custom
menuentry 'Debian GNU/Linux 9 (stretch SIMPLE) (on /dev/sda1)' {
  insmod part_msdos
  insmod ext2
  set root='hd0,msdos1'
  search --no-floppy --label --set=root --label stretch
  echo 'Loading stretch (simply) ...'
  linux /vmlinuz root=LABEL=stretch ro intel_iommu=on libata.force=noncqtrim
  echo 'Loading initial ramdisk ...'
  initrd /initrd.img
}
This is starkly different from the menu entries written by
'update-grub'. Most notably, the boot partition isn't mentioned
here at all. This installation (which is the 'stretch' system I've
been using for years) is on /dev/sda1 = (hd0,msdos1), and this
40_custom stanza mentions no other drive or partition at all.
And it has the great virtue of actually working.
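One mechanical detail worth recording: edits to /etc/grub.d/40_custom
take effect only after regenerating grub's configuration, e.g.:

```shell
# Regenerate /boot/grub/grub.cfg so the new 40_custom stanza shows up
# in the boot menu (run as root, from the installation that owns grub):
update-grub
# 'update-grub' is debian's thin wrapper around:
#   grub-mkconfig -o /boot/grub/grub.cfg
```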
Of course, I went back and did some cleanup. First, I commented
out the old /boot entry in this installation's /etc/fstab. Then I
fixed up its swap file (details below) and recreated its initrd.
But now it seems to work perfectly.
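Those cleanup steps, sketched as shell (run as root in the affected
installation; the sed invocation is just one way to comment out a line):

```shell
# 1. Stop sharing /boot: comment out its entry in this system's fstab.
sed -i.bak '/\/boot/s/^/# /' /etc/fstab
# 2. After fixing the swap configuration (details below), rebuild the
#    initrd so the new settings are baked in:
update-initramfs -u
```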
I conjecture that adding a fresh installation of 'stretch' made
recovery more difficult, because versioned files like
vmlinuz-4.9.0-9-amd64
initrd.img-4.9.0-9-amd64
were written to the same /boot by different installations.
Normally, I guess, no one would do what I did, so such collisions
wouldn't occur; but in this case I suspect they did.
Here are some things I've learned.
First of all, UUIDs are really not such a great idea. True, they
were helpful on my old supermicro where I often swapped rotary
hard disks in and out: that is, they're less impermanent than
device names like /dev/sda. But UUIDs can change, for reasons
that I don't necessarily understand. The debian installer, for
instance, reformats any swap partitions it finds, resulting in a
different UUID. I now think labels should be used instead: they're
less likely to change; any software that incidentally alters them
is more likely to erase them altogether, which may be inconvenient
but is easily seen and fixed; and they're easier to read and type.
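Labels are also easy to set after the fact. A sketch, where the device
names and label strings are assumptions to be adapted to one's own layout:

```shell
# Label an ext2/3/4 filesystem (e2fsprogs):
e2label /dev/sda1 stretch
# Label a swap partition in place, without reformatting (util-linux):
swaplabel -L swap0 /dev/sda5
# Then /etc/fstab can refer to labels rather than UUIDs, e.g.:
#   LABEL=stretch  /     ext4  errors=remount-ro  0 1
#   LABEL=swap0    none  swap  sw                 0 0
```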
This is a single-user system with 'hibernate' and 'suspend' both
inhibited, so it should be perfectly fine to share a swap partition
across all installed linuces. However, coping with UUID changes is
not as simple as changing /etc/fstab: there's a swap UUID in
/etc/initramfs-tools/conf.d/resume
which matters at boot time even though I never hibernate or suspend,
and setting its contents to 'RESUME=' or 'RESUME=NONE' doesn't work:
apparently it's necessary to insert the updated UUID, and then of
course 'update-initramfs -u'. Questions:
- Does it even make any sense to use swap, on a 32-hyperthread box
with 64GB of RAM?
- If swap is still useful, wouldn't a swapfile be better than a
swap partition, given that partitions have fragile UUIDs,
while a swapfile can be local?
Searching online yields no clearly definitive answer. Here's one of
the better-written articles:
https://haydenjames.io/linux-performance-almost-always-add-swap-space/
which suggests two benefits:
- Pages that are hardly ever used get swapped out, liberating RAM
for more useful purposes. But the only time this box ever has
a heavy load is during parallel compilation, which never seems
to use even half the RAM available.
- Swap space provides a sort of cushion in case memory is about to
be exhausted: responsiveness degrades more slowly, and perhaps
that provides an opportunity to kill a rogue process before the
OOM handler is triggered. But a process that can eat 64 GB may
just as well eat 164 GB; and without the audible feedback of an
old-fashioned rotary HDD, I'm not sure I'd notice a problem in
time to do anything about it.
So I'm inclined to suppress swapping altogether. Is that unwise?
At any rate, this box has two SSDs, each of which has a dedicated
swap partition (rationale: it should still work if I remove one of
the drives, or if one gets bricked), and the debian installer tries
to use both; I think I should use one at most, if not zero. It seems
silly enough to use a 4 GB swap partition, but using two for a total
of 8 GB is surely much more trouble than it's worth.
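If I do decide to suppress swapping altogether, the mechanics at least
are simple (again, sed is just one way to comment out the fstab lines):

```shell
# Stop using all swap immediately:
swapoff -a
# Keep it off across reboots: comment out the swap entries in fstab.
sed -i.bak '/\bswap\b/s/^/# /' /etc/fstab
```

That still leaves the resume file discussed above to be reconciled with
'update-initramfs -u'.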