coreutils

Re: cp --reflink=auto by default


From: Dave Chinner
Subject: Re: cp --reflink=auto by default
Date: Thu, 21 May 2015 07:56:29 +1000
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, May 20, 2015 at 02:25:37PM +0200, Lennart Poettering wrote:
> On Wed, 20.05.15 12:41, Pádraig Brady (address@hidden) wrote:
> 
> > On 20/05/15 11:48, Lennart Poettering wrote:
> > > On Tue, 19.05.15 02:33, Pádraig Brady (address@hidden) wrote:
> > > 
> > >> FYI...
> > >>
> > >> mv reflinks by default, but only in the unreleased V8.24 (Fedora 23).
> > >>
> > >> cp doesn't default to --reflink=auto as that would break the case where
> > >> one uses copy for durability reasons to have a second copy of the data.
> > >> Also for performance reasons you may want the writes to happen at copy
> > >> time rather than some latency sensitive process working on a CoW file
> > >> and being delayed by the writes possibly to a different part of a
> > >> mechanical disk.
> > > 
> > > I am pretty sure that both those usecases are of the more exotic kind,
> > > and that reflinks should hence be the default, and people who want the
> > > byte-by-byte kind of copy should request it explicitly with
> > > --reflink=no or dd.
> > > 
> > > I think a good user interface makes the common operations easy (and
> > > hence default) and the exotic ones possible.
> > 
> > Well I certainly agree on that generic point:
> > http://www.pixelbeat.org/docs/power_of_the_default.html

If a distro wants to change the default, then they can just dump

alias cp='cp --reflink=auto'

into their /etc/bash.bashrc (or equivalent global shell config
files) and nothing in cp needs to change.
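
A rough sketch of that approach, assuming GNU cp and bash. Bash only
expands aliases in interactive shells by default, so the runnable part
below uses a shell function as a stand-in; the name cp_reflink_auto is
made up for illustration and is not part of coreutils.

```shell
# Interactive shells: the alias itself, e.g. in /etc/bash.bashrc:
#   alias cp='cp --reflink=auto'
# A user can still bypass the alias for a single invocation with
# '\cp' or 'command cp', which runs cp with its built-in default.

# Stand-in for scripts (hypothetical name): same effect as the alias.
cp_reflink_auto() {
    command cp --reflink=auto "$@"
}

# Demo on a scratch directory: reflink where the filesystem supports
# it, otherwise cp silently falls back to an ordinary data copy.
demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/src"
cp_reflink_auto "$demo_dir/src" "$demo_dir/dst"
```

Either way the policy lives in distro shell configuration, so scripts
and users who call cp directly keep whatever behaviour cp ships with.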

> > > For me that clearly means
> > > that --reflink=auto should be the default, and --reflink=no the
> > > option, and *not* the other way round...
> > 
> > This is something we may consider changing in coreutils >= 9.
> > Especially considering data deduplication is being added
> > to more and lower layers, which makes the first point
> > about implicit bit duplication less valid.
> >
> > The performance concern is still valid, though again less so with
> > SSDs.
> 
> Well, sure, but it's a balance. It might make the later access a bit
> slower due to fragmentation, but cp itself would become a *ton* faster...

Ok, so instant gratification strikes again. We have to pay for that
copy somewhere. Use reflink like this and you will effectively turn
every reflink-capable filesystem into a COW filesystem.

This is *not* a good thing, because all the drawbacks to COW-based
filesystems (fragmentation, poor aging characteristics and bad read
performance) would be brought to filesystems that don't use COW but
support reflink for specific purposes (e.g. so gluster and ceph can
offload file snapshots).

Many applications depend on file layout and exclusivity of their
data sets for performance and reliability, and reflink breaks those
assumptions.  That is, excessive use of reflink will lead to "why is my
filesystem slow" questions directed at filesystem mailing lists.
e.g. "I copy a file, and now the test program only runs at 50%
speed". That's not a problem cp devs will likely hear directly
about, that's something the filesystem developers will end up having
to deal with...
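
For anyone wanting to check the layout of their own files, one rough way
is to compare extent counts before and after a reflink copy has been
modified. This is only a sketch: it assumes filefrag from e2fsprogs is
installed and that the filesystem supports extent mapping, and the
helper name extent_count is made up for illustration.

```shell
# Hypothetical helper: print the number of extents filefrag reports.
# filefrag prints a summary line of the form "FILE: N extents found"
# (or "1 extent found"); sed pulls out N.
extent_count() {
    filefrag "$1" | sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found.*/\1/p'
}

# Possible usage on a reflink-capable filesystem (file names are
# placeholders, not taken from this thread):
#   extent_count data.db                   # layout of the original
#   cp --reflink=always data.db copy.db
#   ... application rewrites parts of copy.db ...
#   extent_count copy.db                   # layout after the COW writes
```

A rising extent count after the writes is the fragmentation described
above; "filefrag -v" additionally lists each extent with its flags.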

FWIW, if cp starts using reflink by default, the next thing I expect
to be asked for is xfs_fsr being able to break up reflink-copied
files and lay them out sanely so that performance can be restored.
:/

Cheers,

Dave.
-- 
Dave Chinner
address@hidden
