patches for multi-threaded cp and md5sum (along with other features)
From: Paul Kolano (ARC-TN)[InuTeq, LLC]
Subject: patches for multi-threaded cp and md5sum (along with other features)
Date: Mon, 3 Jun 2019 21:29:20 +0000
User-agent: Microsoft-MacOutlook/10.10.9.190412
Greetings,
Many years ago, I developed a set of patches that add a number of features to cp
and md5sum, including multi-threading, partial copies, direct I/O, asynchronous
reads/writes, checksum during copy, multi-host ssh-/MPI-based copies, Lustre
support, preallocation, files over stdin, and stats output. These offer
significant performance benefits along with greater flexibility for other
uses (in particular, the partial copy and files over stdin features).
You can see details here:
https://pkolano.github.io/projects/mutil.html
The code is stable and has been used for almost 10 years in production at the
NASA Advanced Supercomputing division to transfer many petabytes of scientific
data. It is also used as one of the underlying transports in a separate
project (https://pkolano.github.io/projects/shift.html) to provide
high-performance tar creation/extraction and integrity verification/rectification.
I do not have time to keep it in sync with every coreutils release, so it is
still based on 8.22, but it is usually straightforward to bring up to date.
I wanted to ask whether there is any interest in incorporating some or all of
these patches into the mainline cp/md5sum code so that the wider base of
coreutils users can benefit from them. I can assist in updating the code to the
latest coreutils, paring it down to the features of interest, etc. Please let
me know if there is any interest.
Thanks,
--Paul