
[Savannah-register-public] [task #7250] Submission of parallel


From: Ole Tange
Subject: [Savannah-register-public] [task #7250] Submission of parallel
Date: Mon, 27 Aug 2007 14:20:48 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20070723 Iceweasel/2.0.0.6 (Debian-2.0.0.6-0etch1+lenny1)

URL:
  <http://savannah.gnu.org/task/?7250>

                 Summary: Submission of parallel
                 Project: Savannah Administration
            Submitted by: tange
            Submitted on: Monday 08/27/2007 at 16:20
         Should Start On: Monday 08/27/2007 at 00:00
   Should be Finished on: Thursday 09/06/2007 at 00:00
                Category: Project Approval
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
        Percent Complete: 0%
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                  Effort: 0.00

    _______________________________________________________

Details:

A new project has been registered at Savannah.
This project account will remain inactive until a site admin approves or
discards the registration.


= Registration Administration =

While this item will be useful to track the registration process, *approving
or discarding the registration must be done using the specific Group
Administration
<https://savannah.gnu.org/siteadmin/groupedit.php?group_id=9478> page*,
accessible only to site administrators, effectively *logged as site
administrators* (superuser):

* Group Administration
<https://savannah.gnu.org/siteadmin/groupedit.php?group_id=9478>


= Registration Details =

* Name: *parallel*
* System Name:  *parallel*
* Type: non-GNU software & documentation
* License: GNU General Public License v3 or later

----

==== Description: ====
NAME
       parallel - run jobs in parallel

SYNOPSIS
       parallel [-g] [-j N] [-s] [-x] [command] < list_of_arguments

DESCRIPTION
       For each line of input parallel will execute command with the line as
       arguments. If no command is given the line is executed.

       Several lines will be run in parallel.

       command  If command contains {} every instance will be substituted
                with the arguments.

       -g       Group output. Avoid output from each job running together
                with other jobs. Will only print output when the job is done.
                stderr is merged with stdout.

       -j N     Run N jobs in parallel. Default is 10.

       -j +N    Add N to the number of CPUs. Run this many jobs in parallel.
                For compute intensive jobs -j +0 is useful.

       -j -N    Subtract N from the number of CPUs. Run this many jobs in
                parallel.  If the evaluated number is less than 1 then 1 will
                be used.

       -j N%    Multiply N% by the number of CPUs. Run this many jobs in
                parallel.  If the evaluated number is less than 1 then 1
                will be used.

       -s       Silent. Do not print the job to be run.

       -x       eXact. Treat each line as exactly one argument. If the lines
                are filenames that may contain shell special characters
                (such as space or *) then this will protect the characters
                from being interpreted by the shell.
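
       The four forms of -j described above can be summarised as a small
       shell function. This is only a sketch of the documented rules, not
       the code parallel itself uses; the name jobs_for is hypothetical,
       the CPU count is passed in by hand, and rounding N% down to a whole
       number is an assumption, since the text does not say how fractions
       are handled.

```shell
# jobs_for SPEC NCPU: print the job count implied by a -j argument.
# SPEC is N, +N, -N or N%; NCPU is the number of CPUs.
jobs_for() {
    spec=$1
    ncpu=$2
    case $spec in
        +*) n=$(( ncpu + ${spec#+} )) ;;          # -j +N: CPUs plus N
        -*) n=$(( ncpu - ${spec#-} )) ;;          # -j -N: CPUs minus N
        *%) n=$(( ncpu * ${spec%\%} / 100 )) ;;   # -j N%: rounded down (assumption)
        *)  n=$spec ;;                            # -j N: taken as is
    esac
    # Documented rule: a result below 1 is replaced by 1.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}
```

       For instance, jobs_for +0 4 prints 4 and jobs_for -8 4 prints 1.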

EXAMPLE 1: Resource inexpensive jobs and grouping
       A resource inexpensive job is a job that takes very little CPU, disk
       I/O and network I/O. Ping is an example of a resource inexpensive
       job. wget is too, if the webpages are small.

       The content of the file jobs_to_run:

         ping -c 1 10.0.0.1
         wget http://status-server/status.cgi?ip=10.0.0.1
         ping -c 1 10.0.0.2
         wget http://status-server/status.cgi?ip=10.0.0.2
         ...
         ping -c 1 10.0.0.255
         wget http://status-server/status.cgi?ip=10.0.0.255

       To run 100 processes simultaneously do:

         parallel -j 100 < jobs_to_run

       The output of the commands will run together. If it is important to
       keep the outputs separated use -g (grouping):

         parallel -gj 100 < jobs_to_run

       This will print the output of each job only when the job is finished.
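
       What grouping means can be illustrated without parallel at all. The
       sketch below is not how parallel is implemented (see BUGS); it is a
       hypothetical function, run_grouped, that runs every input line as a
       background job, sends each job's stdout and stderr to its own
       temporary file, and prints the files one after another. As a
       simplification it has no job limit and prints only after all jobs
       have finished, whereas -g prints each job's output when that job is
       done.

```shell
# run_grouped: read one job per line from stdin, run all jobs concurrently,
# and print each job's combined stdout+stderr as one unbroken block.
run_grouped() {
    tmpdir=$(mktemp -d) || return 1
    count=0
    while IFS= read -r job; do
        count=$((count + 1))
        # Each job writes to its own file, so outputs cannot interleave.
        sh -c "$job" > "$tmpdir/$count.out" 2>&1 < /dev/null &
    done
    wait    # simplification: print only after every job has finished
    k=1
    while [ "$k" -le "$count" ]; do
        cat "$tmpdir/$k.out"
        k=$((k + 1))
    done
    rm -rf "$tmpdir"
}
```

       It reads jobs the same way as the examples above, e.g.
       run_grouped < jobs_to_run.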

EXAMPLE 2: Argument appending, grouping, silent, and exact
       parallel can work similarly to 'xargs -n1'.

       To output all html files run:

         find . -name '*.html' | parallel cat

       As the output here will run together, grouping is advised:

         find . -name '*.html' | parallel -g cat

       If the output is to be used as input for another program it may be a
       good idea not to print the command being run using -s (silent):

         find . -name '*.html' | parallel -sg cat

       If some of the filenames have special characters (e.g. a file called
       '**foo & bar*.html') then force the lines to be interpreted exactly
       with -x:

         find . -name '*.html' | parallel -xsg cat
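
       To see why such filenames need protection, consider what a shell
       does with an unquoted line: the space splits it into several words,
       & starts a background job and * is expanded. One common way to
       protect a line is to wrap it in single quotes. The function below
       is a sketch of that technique; shell_quote is a hypothetical name,
       and whether -x works this way internally is an assumption.

```shell
# shell_quote STRING: print STRING wrapped in single quotes so that a shell
# reads it back as exactly one word. Embedded single quotes become '\''.
shell_quote() {
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

# The awkward filename from the example survives a round trip through eval.
name="**foo & bar*.html"
eval "set -- $(shell_quote "$name")"
[ "$1" = "$name" ] && echo "protected: $1"
```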

EXAMPLE 3: Compute intensive jobs and substitution
       If ImageMagick is installed this will generate a thumbnail of a jpg
       file:

         convert -geometry 120 foo.jpg thumb_foo.jpg

       If the system has more than 1 CPU it can be run with number-of-cpus
       jobs in parallel (-j +0). This will do that for all jpg files in a
       directory:

         ls *.jpg | parallel -j +0 convert -geometry 120 {} thumb_{}

       To do it recursively:

         find . -name '*.jpg' | parallel -j +0 convert -geometry 120 {} {}_thumb.jpg

       Notice how the argument has to start with {} as {} will include the
       path (e.g. running "convert -geometry 120 ./foo/bar.jpg
       thumb_./foo/bar.jpg" would clearly be wrong). It will result in files
       like ./foo/bar.jpg_thumb.jpg. If that is not wanted this can fix it:

         find . -name '*.jpg' | \
         perl -pe 'chomp; $a=$_; s:/([^/]+)$:/thumb_$1:; $_="convert -geometry 120 $a $_\n"' | \
         parallel -j +0

       Unfortunately this will not work if the filenames contain special
       characters (such as space or quotes). If you have ren installed this
       is a better solution:

         find . -name '*.jpg' | parallel -j +0 convert -geometry 120 {} {}_thumb.jpg
         find . -name '*_thumb.jpg' | ren 's/_thumb.jpg//;s/^/thumb_/'
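
       The naming problem in this example comes from {} standing for the
       whole input line, directory included. The thumb_ prefix has to go in
       front of the last path component only, which dirname and basename
       can do safely even when the name contains spaces. This is a sketch;
       thumb_name is a hypothetical helper, not part of parallel.

```shell
# thumb_name PATH: print PATH with "thumb_" put in front of the file name,
# leaving the directory part untouched.
thumb_name() {
    printf '%s/thumb_%s\n' "$(dirname "$1")" "$(basename "$1")"
}

thumb_name ./foo/bar.jpg        # prints ./foo/thumb_bar.jpg
```

       A helper like this could live in a small script that also runs
       convert, in line with the advice in QUOTING to have parallel call a
       script.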

EXAMPLE 4: Substitution and redirection
       This will compare all files in the dir to the file foo and save the
       diffs in corresponding .diff files:

         ls | parallel diff {} foo ">"{}.diff

       Quoting of > is necessary to postpone the redirection. Another
       solution is to quote the whole command:

         ls | parallel "diff {} foo >{}.diff"
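
       The effect of quoting > can be seen with plain sh. While the > sits
       inside a quoted string it is ordinary text; the redirection happens
       only when a shell later runs that string as a command. The sketch
       below uses a temporary directory and hypothetical variable names.

```shell
demo_dir=$(mktemp -d)

# Quoted: ">" is just a character in the string, nothing is redirected yet.
job="echo hello >$demo_dir/out.txt"

# The redirection takes place only now, when a shell runs the job string.
sh -c "$job"

result=$(cat "$demo_dir/out.txt")
rm -rf "$demo_dir"
echo "$result"
```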

EXAMPLE 5: Composed commands
       A job can consist of several commands. This will print the number of
       files in each directory:

         ls | parallel -sg 'echo -n {}" "; ls {}|wc -l'

QUOTING
       For more advanced use quoting may be an issue. The following will
       print the filename for each line that has exactly 2 columns:

         perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file

       To do that using parallel you will do something like this:

         ls | parallel -sg "perl -ne '/^\\S+\\s+\\S+$/ and print \$ARGV,\"\\n\"'"

       Notice how you need to quote \'s, "'s, and $'s.

       To avoid dealing with the quoting problems it may be easier just to
       write a small script and have parallel call that script.
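
       As a sketch of that approach, the perl one-liner from above can be
       wrapped once, with a single level of quoting. It is shown here as a
       shell function so it can be tried directly; the name two_cols is
       hypothetical. Saved as an executable script whose body is the perl
       line, it could be given to parallel as the command, which according
       to DESCRIPTION is then run with each input line as arguments.

```shell
# two_cols FILE...: print the file name once for every line in FILE that
# has exactly two whitespace-separated columns.
two_cols() {
    perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' "$@"
}
```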

BUGS
       As parallel (ab)uses make to run the jobs in parallel, limitations
       from make apply. For old versions of make (before 3.81) this means
       that the initialization will take O(n*n) time, where n is the number
       of jobs to be executed. To have a fair compromise in initialization I
       have picked a chunk size of 5000. When 3.81 has been standard for a
       while this chunk size should probably be removed. The cost of having
       the chunk size, however, seems negligible.
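
       Why chunking helps can be shown with a little arithmetic. Assuming
       that each chunk is initialized on its own and that initializing c
       jobs costs c*c abstract units (both are assumptions drawn from the
       O(n*n) statement above, not measurements), n jobs in chunks of c
       cost roughly (n/c)*c*c = n*c units, which grows linearly in n. The
       function name init_cost is hypothetical.

```shell
# init_cost N C: abstract initialization cost of N jobs split into chunks
# of at most C jobs, assuming a chunk of size k costs k*k units.
init_cost() {
    n=$1
    c=$2
    full=$(( n / c ))      # number of full chunks
    rest=$(( n % c ))      # size of the last, partial chunk
    echo $(( full * c * c + rest * rest ))
}

init_cost 20000 5000       # four chunks of 5000: prints 100000000
init_cost 20000 20000      # one chunk of 20000:  prints 400000000
```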

AUTHOR
       2007-07-23,2007-08-09 Ole Tange, http://ole.tange.dk

LICENSE
       This program is free software; you can redistribute it and/or modify
       it under the terms of the GNU General Public License as published by
       the Free Software Foundation; either version 3 of the License, or (at
       your option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License
       along with this program.  If not, see <http://www.gnu.org/licenses/>.

DEPENDENCIES
       parallel uses GNU Make, Perl, and the Perl module Getopt::Std.

SEE ALSO
       make(1), xargs(1)



==== Other Software Required: ====
DEPENDENCIES
       parallel uses GNU Make, Perl, and the Perl module Getopt::Std.






    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/task/?7250>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




