bug-coreutils

Re: Degraded performance in cat + patch


From: Pádraig Brady
Subject: Re: Degraded performance in cat + patch
Date: Fri, 6 Mar 2009 12:40:20 +0000
User-agent: Thunderbird 2.0.0.6 (X11/20071008)

Pádraig Brady wrote:
> Pádraig Brady wrote:
>> Jim Meyering wrote:
>>> From 6dd9c564a0cba6eec95102f091c6692a5ab48876 Mon Sep 17 00:00:00 2001
>>> From: Jim Meyering <address@hidden>
>>> Date: Fri, 6 Mar 2009 10:27:43 +0100
>>> Subject: [PATCH] cat: use larger buffer sizes to reduce read/write-syscall 
>>> overhead
>>>
>>> * src/cat.c (max): Remove definition.  Use MAX from system.h instead.
>>> (compute_buffer_size): New function.
>>> (main): Use it, to compute larger input and output buffer sizes
>>> derived from st_blksize, now typically 32KiB rather than 4KiB.
>>> Suggestion from Tzvi Rotshtein.
>> That sounds like previously cat did not derive from st_blksize
>> and that st_blksize is typically 32KiB :) Suggested log message:
>>
>> * src/cat.c (max): Remove definition.  Use MAX from system.h instead.
>> (compute_buffer_size): New function to compute the input and output
>> buffer sizes, which are now set at 8 times st_blksize with
>> a minimum of 32KiB. Previously the typical block sizes used were
>> 1KiB for pipes and 4KiB for files.
>> (main): Use it.
>> This was seen to increase throughput by up to 50%.
>> Suggestion from Tzvi Rotshtein.
> 
> Oops :) accurate one below I think:
> 
> * src/cat.c (max): Remove definition.  Use MAX from system.h instead.
> (compute_buffer_size): New function to compute the input and output
> buffer sizes, which are now set at 8 times st_blksize with a maximum
> of 32KiB. Previously the typical block sizes used were 1KiB for pipes
> and 4KiB for files, and now will be 8KiB and 32KiB respectively.
> (main): Use it.
> This was seen to increase throughput by up to 50%.
> Suggestion from Tzvi Rotshtein.
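
For reference, the sizing rule as that last log message words it (8 times
st_blksize, capped at 32KiB) can be sketched as below. This is only an
illustration of the stated rule, not the code from the patch: the function
name sketch_buffer_size, its signature, and the constant buffer_cap are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cap, taken from the log message wording: 32 KiB. */
static const size_t buffer_cap = 32 * 1024;

/* Sketch only: return 8 times the reported block size, capped at
   buffer_cap.  The name and signature are hypothetical; the real
   compute_buffer_size in the patch may differ.  */
static size_t
sketch_buffer_size (size_t st_blksize)
{
  assert (st_blksize > 0);
  size_t want = 8 * st_blksize;
  return want < buffer_cap ? want : buffer_cap;
}
```

With the typical block sizes quoted above, sketch_buffer_size (1024) gives
8KiB for pipes and sketch_buffer_size (4096) gives 32KiB for files.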

Actually, reading your performance results more closely, they show
the throughput doubled? That surprises me.
Why such a huge syscall overhead? Testing with dd on a
1.7GHz pentium-m with 2.6.24.5-85.fc8 shows much less:

$ truncate -s2G test.cat

$ dd bs=4x1024 if=test.cat of=/dev/null
2147483648 bytes (2.1 GB) copied, 6.57765 s, 326 MB/s
$ dd bs=32x1024 if=test.cat of=/dev/null
2147483648 bytes (2.1 GB) copied, 5.74548 s, 374 MB/s

So trying with cat...

$ /usr/bin/time ./cat test.cat >/dev/null
0.29user 5.28system 0:06.55elapsed 85%CPU (0avgtext+0avgdata 0maxresident)k
472inputs+0outputs (2major+149minor)pagefaults 0swaps

Notice that we're waiting for something even though there is no data on disk!
That seems like a separate kernel issue. Anyway, reducing the file size
to fit in the page cache on this 2GB RAM machine and repeating...

$ truncate -s1G test.cat

$ /usr/bin/time ./cat-4K test.cat >/dev/null
0.13user 1.30system 0:01.44elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+150minor)pagefaults 0swaps

$ /usr/bin/time ./cat-32K test.cat >/dev/null
0.01user 1.17system 0:01.19elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+157minor)pagefaults 0swaps

Again that's a modest improvement, though worth changing the buffer size for.
Is there some massive syscall overhead in rawhide kernels or something at the 
moment?
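
As a back-of-the-envelope check on the cat timings above, the number of
read calls needed for the 1GiB file at each buffer size can be worked out
as below. This is a sketch under the assumption that every read returns a
full buffer; the helper name reads_needed is made up for the example.

```c
#include <assert.h>

/* Sketch only: number of full-or-partial reads needed to consume
   file_size bytes with a buffer of buf_size bytes, assuming each
   read fills the buffer (ceiling division).  Hypothetical helper.  */
static unsigned long long
reads_needed (unsigned long long file_size, unsigned long long buf_size)
{
  assert (buf_size > 0);
  return (file_size + buf_size - 1) / buf_size;
}
```

For the 1GiB test file, reads_needed (1ULL << 30, 4096) is 262144 and
reads_needed (1ULL << 30, 32768) is 32768, so the larger buffer makes one
eighth as many read calls, while elapsed time only drops from 1.44s to
1.19s (roughly 1.2x).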

cheers,
Pádraig.



