From: Ivan Shmakov
Subject: RFC: devmapping cutouts?
Date: Thu, 26 May 2022 03:12:24 +0000
The -cut-out option, as recently amended [1], allows one to
store one or more fragments of data from a block device on
an ECMA 119 [2] filesystem. The obvious use case for that
is storing data that exceeds the capacity of a single
optical disk across several of them.
[1] http://bugs.debian.org/1010098
[2] http://ecma-international.org/publications-and-standards/standards/ecma-119/
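For reference, mapping a single such fragment into the image
might look roughly like the following (the source device path
and byte count are made up for illustration, and the remaining
formatting / burning commands are omitted; per [1], the source
may be a block device):
$ xorriso -outdev stdio:/dev/BACKUP \
    -cut_out /dev/vgfoo/lvfoo 0 399360 \
    /private/backups/lvfoo-z598d01-x6288bb2c/+0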
A somewhat rarer case is avoiding the storage of large spans
of value-zero bytes present in the source data.
This can occur when said data is a regularly-“trimmed”
(e. g., with fstrim(8) [3]) filesystem residing on a
flash-based drive; or when the allocated storage was simply
never fully written over (such as when much more space was
allocated for a filesystem than ever got used.) There, it’s
possible to store on the target filesystem only those ranges
that aren’t entirely filled with zeros.
[3] http://manpages.debian.org/sid/fstrim.8
Suppose that such a filesystem was created. What would be
an efficient way to access its contents as if it were the
original block device?
Arguably, it’d be through the creation of a devmapper block
device, with a table mapping block ranges of the original
device either to file data on the given ECMA 119 filesystem,
or to the ‘zero’ target, as appropriate.
Note that it’s possible to bypass the filesystem layer here
(assuming that the ECMA 119 filesystem resides on a block
device, such as an optical disk) by consulting the output of
the xorriso(1) -find -exec report_lba command, like:
$ xorriso \
-indev stdio:/dev/BACKUP \
-find / -exec report_lba -- \
-rollback-end
xorriso 1.5.0 : RockRidge filesystem manipulator, libburnia project.
xorriso : NOTE : Loading ISO image tree from LBA 0
xorriso : UPDATE : 1218 nodes read in 1 seconds
Drive current: -indev 'stdio:/dev/BACKUP'
Media current: stdio file, overwriteable
Media status : is written , is appendable
Media summary: 1 session, 2293183 data blocks, 4479m data, 0 free
Volume id : '3FD2B6132D3543A196F288D2A82A7A42'
Report layout: xt , Startlba , Blocks , Filesize , ISO image path
File data lba: 0 , 2293070 , 84 , 171690 ,
'/private/backups/.mtree/2022-05-21'
File data lba: 0 , 2293154 , 61 , 123688 ,
'/private/backups/.sha256/2022-05-21'
File data lba: 0 , 135 , 195 , 399360 ,
'/private/backups/lvfoo-z598d01-x6288bb2c/+0'
File data lba: 0 , 330 , 36 , 73728 ,
'/private/backups/lvfoo-z598d01-x6288bb2c/+10000000'
File data lba: 0 , 366 , 5979 , 12244992 ,
'/private/backups/lvfoo-z598d01-x6288bb2c/+100202000'
File data lba: 0 , 6345 , 1466 , 3002368 ,
'/private/backups/lvfoo-z598d01-x6288bb2c/+100dd0000'
File data lba: 0 , 7811 , 101 , 206848 ,
'/private/backups/lvfoo-z598d01-x6288bb2c/+1010d0000'
…
Here, the filenames record the byte offset of the fragment
in hexadecimal; so that e. g. +10000000 is 256 MiB from the
start of the image, +100dd0000 is 4208448 KiB, etc.
(Arguably it makes even more sense to use Crockford Base32
encoding for the offsets, so the aforementioned would end up
being the rather concise 800000 and 40DT000, respectively;
with only 8 Base32 digits being necessary to encode offsets
within 1 TiB.)
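For illustration, encoding such an offset in Crockford Base32
could be as simple as the sketch below (the digit alphabet
omits I, L, O and U per Crockford’s specification; the leading
+ merely mirrors the file naming convention above.)
#!/usr/bin/perl
use common::sense;
## Encode a non-negative byte offset in Crockford Base32.
sub b32 {
    my ($n) = @_;
    my @digit
        = split (//, "0123456789ABCDEFGHJKMNPQRSTVWXYZ");
    my $r = "";
    do {
        $r = $digit[$n & 0x1f] . $r;
        $n >>= 5;
    } while ($n > 0);
    $r;
}
## Prints +800000 and +40DT000, matching the values above.
printf ("+%s\n", b32 (hex ($_)))
    foreach (qw (10000000 100dd0000));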
It’s easy to transform the report_lba output above into a
table suitable for passing to the # dmsetup create command
(note that devmapper operates on 512-byte blocks rather than
bytes):
0 780 linear /dev/BACKUP 540
780 3356 zero
4136 23820 linear /dev/BACKUP 2562196
27956 260 zero
28216 101100 linear /dev/BACKUP 7999556
129316 348 zero
…
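With the table above saved to a file (the lvfoo.table name
below is just a placeholder), the mapping can then be
assembled and the snapshot mounted read-only, e. g.:
# dmsetup create --readonly lvfoo-z598d01 < lvfoo.table
# mount -o ro /dev/mapper/lvfoo-z598d01 /mnt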
Or, what I’ve actually used is a list of arguments to my
dmmontage [4] convenience wrapper (albeit largely superfluous
in this case.)
--target=lvfoo-z598d01-x6288bb2c --
@0 /dev/BACKUP:540-1320
@780 /dev/zero:0-3356
@4136 /dev/BACKUP:2562196-2586016
@27956 /dev/zero:0-260
@28216 /dev/BACKUP:7999556-8100656
@129316 /dev/zero:0-348
…
(Line broken for readability.)
[4] http://am-1.org/~ivan/src/blkutils-2022/dmmontage.sh
The conversion code is as follows.
#!/usr/bin/perl
### Ivan Shmakov, 2022
## To the extent possible under law, the author(s) have dedicated
## all copyright and related and neighboring rights to this software
## to the public domain worldwide. This software is distributed
## without any warranty.
## You should have received a copy of the CC0 Public Domain Dedication
## along with this software. If not, see
## <http://creativecommons.org/publicdomain/zero/1.0/>.
### Code:
use common::sense;
my $orig
= shift (@ARGV);
my ($prev, %acc)
= ();
## Emit one dmmontage command line for the fragments collected
## in %acc (keyed by their target offset, in 512-byte sectors.)
sub print_out {
## .
return
unless (defined ($prev));
print ("--target=",
$prev =~ s/-([0-9]+)$/${ \sprintf ("-x%x", $1); }/r,
".thin --");
my $pos
= 0;
foreach my $o (sort { $a <=> $b; } (keys (%acc))) {
printf (" @%d /dev/zero:0-%d", $pos, $o - $pos)
if ($o != $pos);
print (" @", $o, ($o != $acc{$o}->[1] ? (" ", $acc{$o}->[0]) : ()));
$pos
= $acc{$o}->[1];
}
print ("\n");
}
## Parse the report_lba lines; skip entries whose names lack
## a trailing +offset component.
while (<STDIN>) {
my ($so, $z, $ta, $to) = m {
^ File\sdata\slba:
\s+ [0-9]+ [ ,]+ ([0-9]+) .*\b ([0-9]+)
\s* , .*? / ([^/]+-x?[0-9a-f]+)
/ [0-9a-f]* [+] ([0-9a-f]+) \b
}x or next;
if ($prev ne $ta) {
print_out ();
($prev, %acc)
= ($ta);
}
## Convert everything to 512-byte sectors: the start LBA is in
## 2048-byte ISO blocks, the size and the +offset are in bytes.
$so <<= 2;
$z >>= 9;
$to = do {
no warnings;
(hex ("0x" . $to) >> 9);
};
# warn ("D: ", join (" ", $so, $z, $ta, $to), "\n") if (0);
$acc{$to}
= [ sprintf ("%s:%d-%d", $orig, $so, $z + $so), $z + $to ];
}
print_out ()
if (defined ($prev));
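To use the script above, pass it the device underlying the
ECMA 119 filesystem as its sole argument and feed it the
report_lba output on standard input; e. g. (the script file
name here is arbitrary):
$ xorriso -indev stdio:/dev/BACKUP \
      -find / -exec report_lba -- \
      -rollback-end \
  | perl report-lba-to-dmmontage /dev/BACKUP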
Two more issues to consider are: a. locating the 0-filled
ranges in the source (a naive sketch follows below), and
balancing the amount of data /not/ stored against the number
of files / ranges needed; and b. extending this simple ‘bunch
of files whose names are offsets’ convention so as to allow
for incremental backups of filesystem snapshots (or of other
large datasets, where that might make sense.)
For instance, in this particular case, 105568 individual
ranges of 0-filled 512-byte blocks (2411608 in total) were
identified on the filesystem. In order to save on the
filesystem and dmsetup overhead, 1097588 of such 512-byte
0-filled blocks (across 104359 ranges) were disregarded
(as in: kept in the resulting archive), while 1314020
512-byte blocks across 1209 ranges were “cut out” and not
archived. Yet this still allowed me to fit a 5 GiB FS
snapshot on a single 4482 MiB DVD+R blank.
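As for issue a., a naive first cut is to read the source
sequentially, drop only those runs of all-zero 512-byte blocks
that exceed some threshold, and emit the remaining extents for
a -cut-out (or dmmontage) driver script to consume. A sketch,
with an arbitrary 256-block (128 KiB) threshold:
#!/usr/bin/perl
use common::sense;
my $min_zero
    = 256;   ## example threshold, in 512-byte blocks
binmode (*STDIN);
my ($pos, $start, $zrun)
    = (0, undef, 0);
my $buf;
while (read (*STDIN, $buf, 512)) {
    if ($buf !~ m/[^\0]/) {
        ## an all-zero block: extend the current zero run
        $zrun++;
    } elsif (! defined ($start)) {
        ## first non-zero block: open the first extent
        ($start, $zrun) = ($pos, 0);
    } elsif ($zrun >= $min_zero) {
        ## a zero run long enough to cut out: close the current
        ## extent (start and length, in blocks) and open a new one
        printf ("%d %d\n", $start, $pos - $zrun - $start);
        ($start, $zrun) = ($pos, 0);
    } else {
        ## a short zero run; absorb it into the current extent
        $zrun = 0;
    }
    $pos++;
}
printf ("%d %d\n", $start, $pos - $zrun - $start)
    if (defined ($start));
The lower the threshold, the more zeros get cut out, at the
cost of more files / table entries; choosing it is precisely
the balancing act mentioned in issue a. above.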
Thoughts?
--
FSF associate member #7257 http://am-1.org/~ivan/