
GNU Parallel Bug Reports feature suggestion: approximate scheduler


From: Håkan Johansson
Subject: GNU Parallel Bug Reports feature suggestion: approximate scheduler
Date: Thu, 3 Sep 2015 00:56:09 +0200
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)


Hi,

We are happily using GNU parallel to process data files on a diverse set of workstations. What would make it even nicer is if parallel could do some sort of scheduling to reduce the waiting time for the last few files to finish processing.

I did not find any options to that effect; please correct me if I am wrong.

Otherwise, here is a description of a typical invocation:

- ~200 files of varying size (between 10 MB and 1 GB)
- the command to execute
- a list of 2-15 machines of varying speed (within a factor of ~4)
  and number of cores

The processing is single-threaded, so the list of machines also tells how many jobs to run on each box.
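
For concreteness, a minimal sketch (in Python, with hypothetical host
names and a hypothetical 'process' command) of building such an
invocation; GNU parallel's --sshlogin accepts an 'n/host' prefix to set
the number of jobs run on each host:

  # Hypothetical machine list: host -> number of single-threaded jobs.
  machines = {"fast1": 8, "fast2": 4, "slow1": 2}
  sshlogin = ",".join("%d/%s" % (n, host) for host, n in machines.items())
  # Prints e.g.: parallel --sshlogin 8/fast1,4/fast2,2/slow1 'process {}' ::: *.dat
  print("parallel --sshlogin %s 'process {}' ::: *.dat" % sshlogin)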

We often reanalyse the same set of files, or sets with partial overlap of files. The command changes a bit, but execution time/work per file is rather similar. Execution time is roughly proportional to file size, but depending on the actual content, some files may involve more work.

What I would imagine is something like this:

- an option to request parallel to try to schedule the jobs such that they
  end more simultaneously (and thus sooner).  Given no other information,
  the estimate would be that the time to complete a job is proportional
  to the size of the input {} file, i.e. start by scheduling the largest
  files and end with the smallest ones (see the sketch after this list).

- an option to read a machine-performance file, which gives the relative
  performance of the machines.  (If a machine is missing, assume a
  default value of e.g. 1.)

  Especially as the end is being reached, completion may be quicker
  if parallel did not use the slowest machines at all, or used them
  only for small enough files.

- an option to read a file of input-file difficulty corrections, i.e.
  one telling (relatively) how 'large' each file effectively is.
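
A minimal sketch of the heuristic described by the first and third
options, assuming a hypothetical file-difficulty.txt with one
"<filename> <factor>" per line (missing files default to 1): compute an
effective size per input file and hand the files to parallel
largest-first:

  import os, sys

  # Hypothetical correction file: one "<filename> <factor>" per line.
  difficulty = {}
  with open("file-difficulty.txt") as fh:
      for line in fh:
          name, factor = line.split()
          difficulty[name] = float(factor)

  def effective_size(path):
      # Estimated work ~ file size times difficulty correction.
      return os.path.getsize(path) * difficulty.get(path, 1.0)

  # Emit files largest-effective-size first; parallel starts jobs in
  # the order the arguments arrive, e.g.:
  #   python schedule.py *.dat | parallel 'process {}'
  for path in sorted(sys.argv[1:], key=effective_size, reverse=True):
      print(path)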

The reason for keeping the relative measurements in external files is that parallel cannot determine these values within one run; they have to be learnt between invocations. And as input files and machines may be used in different ways, i.e. with different commands, it is better to keep them in files that the user can (and has to) maintain. In order to produce those two files:

- an option to produce a log file listing for each job:

  date/time of job,  machine,  input-file,  size-of-input-file,  time used
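
Pending such an option, a minimal sketch of a wrapper producing one
such log line per job (tab-separated; 'joblog.txt' is a hypothetical
path):

  import os, socket, subprocess, sys, time

  infile = sys.argv[1]            # input file; the rest is the command to run
  t0 = time.time()
  subprocess.run(sys.argv[2:])    # run the actual job
  elapsed = time.time() - t0
  with open("joblog.txt", "a") as log:
      log.write("%s\t%s\t%s\t%d\t%.1f\n" % (
          time.strftime("%Y-%m-%d %H:%M:%S"), socket.gethostname(),
          infile, os.path.getsize(infile), elapsed))

Invoked e.g. as: parallel 'python wrap.py {} process {}' ::: *.dat
(remote runs produce one log per machine, to be collected afterwards).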

A separate program could then process (multiple) such files to produce the machine and input-file correction files. From just one invocation this will not be possible, but as soon as the same file has been processed on multiple machines, those machines can be assigned relative numbers; likewise, once different files have been processed on the same machine, those files can be assigned effective sizes. It will not be necessary to have executed all combinations, as the assignment can be treated as an overdetermined sparse least-squares problem, with one equation from each job invocation, where one expects:

file_size / time_used = machine_performance / file_difficulty

I.e. effectively measured in bytes/second.  After taking logarithms:

log(machine_performance) - log(file_difficulty) =
  log(file_size) - log(time_used)

where log(machine_performance) and log(file_difficulty) are the unknowns
to be solved for. One such equation is set up for each completed machine-file job combination. (Note that the overall scale is arbitrary: adding the same constant to all machine logs and all file logs leaves every equation unchanged, so one machine can be fixed to performance 1 as a reference, or a minimum-norm solution taken.)
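
A minimal sketch of setting up and solving this system with numpy,
assuming the hypothetical tab-separated log format from above (a dense
matrix for brevity; numpy.linalg.lstsq returns a minimum-norm solution
for the rank-deficient system, which fixes the arbitrary scale):

  import math
  import numpy as np

  # Hypothetical log: date<TAB>machine<TAB>file<TAB>size<TAB>seconds
  jobs = []
  with open("joblog.txt") as fh:
      for line in fh:
          _, machine, fname, size, seconds = line.rstrip("\n").split("\t")
          jobs.append((machine, fname, float(size), float(seconds)))

  machines = sorted({m for m, f, s, t in jobs})
  files    = sorted({f for m, f, s, t in jobs})
  mi = {m: i for i, m in enumerate(machines)}
  fi = {f: len(machines) + i for i, f in enumerate(files)}

  # One row per job: log(perf_m) - log(diff_f) = log(size) - log(time).
  A = np.zeros((len(jobs), len(machines) + len(files)))
  b = np.zeros(len(jobs))
  for row, (m, f, size, seconds) in enumerate(jobs):
      A[row, mi[m]] = 1.0
      A[row, fi[f]] = -1.0
      b[row] = math.log(size) - math.log(seconds)

  x = np.linalg.lstsq(A, b, rcond=None)[0]
  for m in machines:
      print("machine %-16s performance %.3g" % (m, math.exp(x[mi[m]])))
  for f in files:
      print("file    %-16s difficulty  %.3g" % (f, math.exp(x[fi[f]])))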

I have made a mock-up test of the equation solving: the log file is produced by wrapping each job in a script that measures the time etc., and a small perl script then produces the equation system for solving with e.g. octave. It produces machine-performance and file-difficulty values which look reasonable. If there is interest, I'll be happy to share the scripts.

Cheers,
Håkan
