coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: sparse file processing improvements


From: Pádraig Brady
Subject: Re: sparse file processing improvements
Date: Wed, 08 Oct 2014 00:08:28 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 10/07/2014 07:56 PM, Pádraig Brady wrote:
> On 10/07/2014 02:50 AM, Bernhard Voelker wrote:
>> On 10/06/2014 04:40 PM, Pádraig Brady wrote:
>>> I've attached 3 patches to improve various sparse file handling.
>>> 1. supports detecting holes < internal buffer size (currently 128KiB)
>>> 2. employs the punch hole scheme demonstrated above
>>> 3. reads sparse files efficiently even if not writing to a regular file
>>>
>>> thanks,
>>> Pádraig.
>>
>> Hi Padraig,
>>
>> cool work, thanks!
>>
>> At a first glance, the changes in copy.c seem to be correct.
>> Without having a deeper look yet, the new inner loop to detect
>> holes in the just-read buffer seems to do the lseek() twice if
>> a hole would span adjacent, outer-loop read()s.
>> The ls(1) output of the result in the new test seems to second
>> my guess:
>>
>>    + ls -lsh sparse.in sparse.out sparse.out2
>>   512K -rw-r--r-- 1 voelkerb users 512K Oct  6 20:03 sparse.in
>>   260K -rw-r--r-- 1 voelkerb users 512K Oct  6 20:03 sparse.out
>>   256K -rw-r--r-- 1 voelkerb users 512K Oct  6 20:03 sparse.out2
>>
>> This means the copying could be even more effective in the first
>> run, right?
> 
> Yes the patch would do at least 1 lseek per outer read loop.
> The extra 4K above I think is just a side effect on ext4 of the trailing 
> lseek,
> rather than just leaving it to truncate, or maybe just alignment slop.
> But you're right, both could be improved by hoisting the hole control
> vars above the read loop and only doing 1 lseek for each hole,
> and only a truncate for the last hole. The fiemap based copy
> also avoids the last lseek, so this would be more consistent as well.
> 
> The incremental diff is simple enough:
> 
> diff --git a/src/copy.c b/src/copy.c
> index 12af6db..d32a131 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -165,6 +165,8 @@ sparse_copy (int src_fd, int dest_fd, char *buf, size_t 
> buf_size,
>  {
>    *last_write_made_hole = false;
>    *total_n_read = 0;
> +  bool make_hole = false;
> +  off_t psize = 0;
> 
>    while (max_n_read)
>      {
> @@ -182,10 +184,8 @@ sparse_copy (int src_fd, int dest_fd, char *buf, size_t 
> buf_size,
>        *total_n_read += n_read;
> 
>        /* Loop over the input buffer in chunks of hole_size.  */
> -      bool make_hole = false;
>        size_t csize = make_holes ? hole_size : buf_size;
>        char *cbuf = buf;
> -      size_t psize = 0;
>        char *pbuf = buf;
> 
>        while (n_read)
> @@ -207,7 +207,7 @@ sparse_copy (int src_fd, int dest_fd, char *buf, size_t 
> buf_size,
>              }
> 
>            bool transition = (make_hole != prev_hole) && psize;
> -          bool last_chunk = (n_read == csize) || ! csize;
> +          bool last_chunk = (n_read == csize && ! make_hole) || ! csize;
> 
>            if (transition || last_chunk)
>              {
> @@ -231,16 +231,29 @@ sparse_copy (int src_fd, int dest_fd, char *buf, size_t 
> buf_size,
>                      }
>                  }
> 
> -              pbuf += psize;
> +              pbuf = cbuf;
>                psize = csize;
> 
> -              if (transition && last_chunk)
> -                csize = 0;
> +              if (last_chunk)
> +                {
> +                  if (transition)
> +                    csize = 0; /* Loop again to deal with last chunk.  */
> +                  else
> +                    psize = 0; /* Reset for next read loop.  */
> +                }
>                else if (! csize)
> -                n_read = 0;
> +                n_read = 0;    /* Finished processing buffer.  */
>              }
>            else  /* Coalesce writes/seeks.  */
> -            psize += csize;
> +            {
> +              if (psize <= OFF_T_MAX - csize)
> +                psize += csize;
> +              else
> +                {
> +                  error (0, 0, _("overflow reading %s"), quote (src_name));
> +                  return false;
> +                }
> +            }
> 
>            n_read -= csize;
>            cbuf += csize;
> 
>>> diff --git a/NEWS b/NEWS
>>> index 1811ae4..785773f 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -30,6 +30,9 @@ GNU coreutils NEWS                                    -*- 
>>> outline -*-
>>>
>>>  ** Improvements
>>>
>>> +  cp will convert smaller runs of NULs in the input to holes,
>>> +  to reduce allocation in the copy.
>>> +
>>
>> Aren't ginstall(1) and mv(1) also somehow affected?  If so, then
>> the NEWS should mention this.
> 
> done
> 
>>> diff --git a/tests/cp/sparse.sh b/tests/cp/sparse.sh
>>> index d6cc4c4..1414d35 100755
>>> --- a/tests/cp/sparse.sh
>>> +++ b/tests/cp/sparse.sh
>>> @@ -37,4 +37,32 @@ test $(stat --printf %b copy) -le $(stat --printf %b 
>>> sparse) || fail=1
>>>  cp --sparse=always --reflink sparse copy && fail=1
>>>  cp --sparse=never --reflink sparse copy && fail=1
>>>
>>> +
>>> +# Ensure we handle sparse/non-sparse transitions correctly
>>> +maxn=128 # how many $hole_size chunks per file
>>> +hole_size=$(stat -c %o copy)
>>> +dd if=/dev/zero bs=$hole_size count=$maxn of=zeros
>>> +tr '\0' '\1' < zeros > ones
>>
>> Test framework failures during the above commands (and below)
>> should be caught.
> 
> done
> 
>>> +for n in 1 2 3 4 32 $maxn; do
>>> +  parts=$(expr $maxn / $n)
>>> +
>>> +  rm -f sparse.in
>>> +
>>> +  # Generate sparse file for copying with alternating
>>> +  # hole/data patterns of size n * $hole_size
>>> +  for i in $(yes zeros | sed 1~2s/zeros/ones/ | head -n$parts); do
>>> +    dd iflag=fullblock if=$i of=sparse.in conv=notrunc oflag=append \
>>> +       bs=$hole_size count=$n status=none || framework_failure_
>>> +  done
>>
>> a)
>> Looking at the ls(1) output above, the generated sparse.in files are
>> not sparse, are they? I think it doesn't matter, and in fact the file
>> to copy first does not have to be sparse. This is what the second
>> copying below is for. So just s/sparse/sample/ in the comment maybe.
> 
> Right the first file should not be sparse.
> Otherwise we get only the fiemap code for handling holes.
> Comment fixed.
> 
>> b)
>> Instead of appending to the sample file with "of=sparse.in conv=notrunc"
>> " oflag=append", you could just redirect the for-loop's output to that
>> file.
> 
> maybe, though more data copying in that case
> 
>> c)
>> I'm not sure if the sed command "1~2s" is available everywhere
>> (TBH I've seen it the first time myself).
>> What about avoiding the sed(1) call completely by moving the
>> alternating logic into the yes(1) call?
>>
>>   s="$(printf "%s\n%s" zeros ones)"
>>   for i in $(yes "$s" |  head -n$parts); do
>>     ...
> 
> Good point I need to make that portable.
> 
>>> +
>>> +  cp --sparse=always sparse.in sparse.out   || fail=1 # non sparse input
>>> +  cp --sparse=always sparse.out sparse.out2 || fail=1 # sparse input
>>> +
>>> +  cmp sparse.in sparse.out || fail=1
>>> +  cmp sparse.in sparse.out2 || fail=1
>>> +
>>> +  ls -lsh sparse.*
>>
>> Given the sparse-detection and -punching is optimal, then both
>> output file should always have the same size, shouldn't they?
>> Therefore, the test should check those numbers.
> 
> I'm very wary of checking the numbers as they could vary
> due to alignment, or async reclamation on the file system.
> 
>>> +done
>>> +
>>>  Exit $fail
>>
>> Although the test data seems to be quite good, I somehow
>> have the feeling that it should be even more random.
>> In the above, the blocks of NULs and 1s are always a multiple
>> of $holesize ... which in turn makes it all somehow related
>> to the buffer size and read size in sparse_copy().
>> I think we should use varying block sizes around the block size
>> for the NULs and 1s blocks to be closer to real-world data.
>> WDYT?
> 
> Well going below the $holesize would only test is_nul(),
> and that's implicitly tested with the data comparisons.
> Oh I should change from \1 to 'U' since \1 is used as the
> sentinel flag within the code.
> 
> I also noticed small issue in copy where an erroneous
> error about extending the file would be output when
> reporting errors when sparse_copy fails. That's fixed
> up is a separate patch. All 4 patches are attached here.

Actually I'll need to rework this set slightly to
keep the last lseek in sparse_copy() as it's needed
for the fiemap case, when there are non empty adjacent
extents that end in NULs.

Pádraig.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]