From: Pádraig Brady
Subject: Re: [PATCH] cksum: Use pclmul hardware instruction for CRC32 calculation
Date: Sun, 14 Mar 2021 23:42:00 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 13/03/2021 20:37, Jim Meyering wrote:
> On Sat, Mar 13, 2021 at 10:19 AM Pádraig Brady <P@draigbrady.com> wrote:
>> On 13/03/2021 16:13, Pádraig Brady wrote:
>>> FYI testing on an older i3-2310M system shows
>>> the bottleneck is not near I/O (cat is much faster).
>>> A 500MiB file improves from 1.40s to 0.67s on the i3-2310M.
>>>
>>>   $ time src/cksum file.in
>>>   3404199294 524288000 file.in
>>>
>>>   real    0m0.672s
>>>   user    0m0.584s
>>>   sys     0m0.084s
>>
>> I'm also considering applying the attached to add a --debug option
>> (present on a few other coreutils), which will diagnose the
>> implementation used (since it's build time and run time variable).
>
> I like the new option, and the patch looks fine.
> I assume you'll mention the addition in NEWS.

Thanks for the reminder.
I've adjusted like:

diff --git a/NEWS b/NEWS
index aad05df6d..5368a3eed 100644
--- a/NEWS
+++ b/NEWS
@@ -70,7 +70,9 @@ GNU coreutils NEWS -*- outline -*-
   cat --show-ends will now show \r\n as ^M$.  Previously the \r was taken
   literally, thus overwriting the first character in the line with '$'.

-  cksum is now up to 4 times faster by using a slice by 8 algorithm.
+  cksum is now up to 4 times faster by using a slice by 8 algorithm,
+  and up to 8 times faster where pclmul instructions are supported.
+  A new --debug option will indicate if pclmul is being used.

   df now recognizes these file systems as remote:
   acfs, coda, fhgfs, gpfs, ibrix, ocfs2, and vxfs.