Re: parallel 4.0.1 released
Tue, 30 Mar 2021 17:38:17 +0200
On Tue, Mar 23, 2021 at 02:28:01PM -0500, sshah wrote:
> Sure Olaf.
> I run code that calls parrayfun(ncpu, ..) about 750 times, so the first time
> effect is greatly reduced. ncpu is picked up from nproc(). I normally run
> this code on my in-house mac OS machines to test out integrity of changes
> after any updates to octave and its packages. I also run this to test out
> Amazon AWS instance configurations. The underlying functions that run in
> parallel use relatively low memory, so the memory pressure is low.
> I can also see cpu / core usage in top or Activity monitor.
> Here are some numbers.
> Apple M1, Big Sur, under Rosetta2, my test code ran in about 14 seconds, in
> Octave 6.2 and parallel 4.0.0.
> On Apple MacBook Pro 2017 2.3 GHz Big Sur, Octave 6.2 and parallel 4.0.0
> it runs in about 26 seconds.
> On m5zn.6xlarge, ubuntu 20.04 LTS, 12 cores second Gen Cascade lake, Octave
> 5.2 and parallel 3.1.3, it ran in 15 seconds.
> After update of parallel 4.0.0 to 4.0.1 on Apple M1 Big Sur, under Rosetta2,
> in Octave 6.2 it ran in 48 seconds.
> However, I did not have a run just before I upgraded the parallel package on
> M1. I could downgrade parallel to 4.0.0 and retest on M1. Perhaps you can
> help me through the steps. For some reason, I have lost the link to
> instructions for creating local tar ball from it so I can do a local install
> from it on M1. Can you please post those instructions?
Sorry, I couldn't attend to this for the last few days.
As for the instructions:
After changing the current directory to the one into which you cloned
'parallel':
  rm -r target # but make sure you have nothing valuable there
  export PREBUILD=no # makes the next step faster
Then, e.g., change directory into 'target', start Octave, and run
  pkg install parallel-4.0.0.tar.gz
But if you have installed parallel-4.0.1.tar.gz in a different way,
you should probably uninstall it first...
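
A minimal sketch of the uninstall-then-install sequence at the Octave
prompt, assuming the tarball sits in the current directory (assumed here
to be 'target', as described above) and that no other installed package
depends on 'parallel':

```octave
% Sketch only: the location of the tarball is an assumption.
% Remove whichever version of parallel is currently installed.
pkg uninstall parallel
% Install the locally built 4.0.0 tarball from the current directory.
pkg install parallel-4.0.0.tar.gz
% Load the package and list installed packages to check the version.
pkg load parallel
pkg list
```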
As for the original issue:
Execution speed depends on many things; I can't help much without
having a reproducible example (and I have no macOS). There were changes
from 4.0.0 to 4.0.1 (passing of the current search path) which might
make a small difference in some cases, but only at the start of
parallel execution (i.e. at the start of a call of parcellfun()).
Maybe seeing your test code would help. You should call parcellfun
with 'ncpu' before performing your test to get rid of the 'first time
effect' in your test. Generally, parcellfun is probably not very good
for problems which involve repetitive calls (750 in your test) of
which each one lasts only a short time (0.064 s in your test with
4.0.1). In such a case, the small difference between 4.0.0 and 4.0.1
might have a noticeable effect.
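
To make the warm-up suggestion concrete, here is a rough sketch of how
the per-call cost could be measured. It is not the test code from this
thread: the workload (squaring a number), the variable names, and the
loop count are placeholders chosen only for illustration.

```octave
% Sketch only: workload and loop count are hypothetical placeholders.
pkg load parallel

ncpu = nproc ();
% One trivial job per core.
args = num2cell (1:ncpu);

% Warm-up call, so the 'first time effect' is not part of the timing.
parcellfun (ncpu, @(x) x, args);

% Many short calls: any start-up cost is paid again on every call.
% The test discussed in this thread used about 750 calls.
ncalls = 20;
tic;
for k = 1:ncalls
  r = parcellfun (ncpu, @(x) x .^ 2, args);
end
t_total = toc;

printf ("mean time per parcellfun call: %.3f s\n", t_total / ncalls);
```

Running the same script under 4.0.0 and 4.0.1 on the same machine would
show whether the per-call time differs between the two versions.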
One further thought: Have you possibly made the mistake of calling
parcellfun_set_nproc(0) before or after each call to parcellfun? (This
shouldn't be done, normally.)
public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net