coreutils

Re: [PATCH] dd: add punchhole feature


From: Pádraig Brady
Subject: Re: [PATCH] dd: add punchhole feature
Date: Mon, 6 Feb 2017 20:19:45 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

On 03/02/17 04:58, address@hidden wrote:
> Hello,
> 
> I sometimes encounter machines where a big log file takes 90% of a
> partition's space.
> If those logs are important I can't just remove them to free space, and
> have to archive them instead (with gzip, usually).
> But the log file plus its archive doesn't fit in the partition, so I
> can't simply `gzip my.log`.
> In situations like these I usually do:
> 
>     $ gzip -c my.log | dd of=my.log conv=notrunc
>
>     X bytes (…) copied, …
>     $ truncate -s X my.log
> 
> But when my.log is open in another process this isn't recommended,
> as I would end up with my.log containing a gzip stream with new
> (uncompressed) log lines appended to it.
> 
> I ended up developing: https://github.com/tchernomax/dump-deallocate
> A small utility that outputs and deallocates (fallocate punch-hole) a
> file at the same time.
> 
> I think it would be interesting to include this feature in dd, so it
> would be possible to do:
> 
>     $ dd if=my.log conv=punchhole | gzip > my.log.gzip

That's not a robust operation: if gzip fails for any reason
(disk full, etc.), some data will be lost.
So while punchhole functionality might be useful,
I'm not so sure about coupling it with just read()?
BTW there is already a punch_hole() function in copy.c
that should be reused if we were to add this.
The reason we haven't added plain punchhole functionality to dd
is that it's already available from fallocate(1).

It seems a specialized tool coupling the following operations would be required:

  while (read(chunk))
    compress
    write
    if (sync() == 0)
      collapse_range(chunk)

Note I used collapse_range rather than punch_hole there
as that would probably simplify restarts for partial completions,
as only the unprocessed data would be left in the file.

We were talking about an inplace(1) tool previously,
that would take a filter to process a file with.
This could be a possible mode of operation for it (with an option).

cheers,
Pádraig


