help-octave

Re: Multicore process - rewriting a for-loop


From: c.
Subject: Re: Multicore process - rewriting a for-loop
Date: Mon, 21 Jan 2013 12:41:01 +0100

lyvic,

A few small additions to Ismael's example:

On 21 Jan 2013, at 11:37, Ismael Diego Nunez-Riboni <address@hidden> wrote:

> Hi Max, I also tried once the multicore package but could not really get too 
> far with it and changed to the general package (general-1.3.0), which works 
> fine for me. I use the function PARCELLFUN. So, unless somebody else in the 
> list helps you with the multicore package, I suggest to use parcellfun from 
> the general package. Here you just have to divide your input vector in 
> various vectors, equally long (as much vectors as processes you want to use) 
> and then call parcellfun. Here is an example (you must save your input 
> vectors in an cell array, in the example is called INPUTVECT):
> 
> ------------------
>      % Dividing the input vectors in NOP *equally* long vectors:
> 
>      le = floor(length(YOURINPUTDATA)./NOP);
> 

The following section of your example:

>      % Building the parallel input vectors:
>      for aa = 1:NOP
>         INPUTVECT{aa}= YOURINPUTDATA([(aa-1)*le+1 : aa*le]);
>      end

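can be vectorized using "mat2cell" (assuming "YOURINPUTDATA" is a row vector
whose length is an exact multiple of NOP):

 INPUTVECT = mat2cell (YOURINPUTDATA, 1, le*ones(1,NOP));

For a column vector, use "mat2cell (YOURINPUTDATA, le*ones(NOP,1), 1)" instead.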
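
A minimal sketch to check that the loop and "mat2cell" build the same cell array for a row vector. The data and the value of NOP are made up for illustration, and the input is trimmed to NOP*le elements because "mat2cell" requires the block sizes to add up to the array size:

```octave
% Hypothetical stand-in data: 12 elements, NOP = 3 processes.
YOURINPUTDATA = 1:12;
NOP = 3;
le = floor (numel (YOURINPUTDATA) / NOP);

% Loop version from the quoted example.
loop_cells = cell (1, NOP);
for aa = 1:NOP
  loop_cells{aa} = YOURINPUTDATA((aa-1)*le+1 : aa*le);
end

% mat2cell version: one row block, NOP column blocks of width le.
INPUTVECT = mat2cell (YOURINPUTDATA(1:NOP*le), 1, le * ones (1, NOP));

% Both are 1-by-NOP cell arrays holding the same chunks.
disp (isequal (loop_cells, INPUTVECT))
```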

>      % Calling the function in parallel! :
> 
>      YOUROUTPUT  = parcellfun (NOP, @YOURFUNCTION, INPUTVECT);

If "YOURINPUTDATA" is an array and "YOURFUNCTION" acts on one element at a
time, the whole example above is actually equivalent to passing the array itself:

 YOUROUTPUT  = pararrayfun (NOP, @YOURFUNCTION, YOURINPUTDATA);

> -------------------
> 
> NOP=Number of processes to use. Do not forget the @ next to the function's 
> name! It's easy and works like a charm, the hardest part is to divide your 
> input data in equally length vectors... Some notes:
> 
> 1) After calling the function you will have to join the various output 
> vectors in one.

Most of the time if you want to "rejoin" the result it will be sufficient to 
add the option: 
 
   parcellfun (…, "uniformoutput", true);
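
To illustrate what the option does, here is a small sketch that uses the serial "cellfun" as a stand-in. I am assuming "parcellfun" treats "UniformOutput" the same way "cellfun" does; the chunks and the functions are made up:

```octave
% Hypothetical chunks, as they would be passed to parcellfun.
chunks = {1:4, 5:8, 9:12};

% One scalar per chunk: with UniformOutput true the results come
% back already joined in a plain numeric array.
totals = cellfun (@sum, chunks, "UniformOutput", true);

% One vector per chunk: the results have to be collected in a cell
% array (UniformOutput false) and concatenated afterwards.
squares = cellfun (@(v) v.^2, chunks, "UniformOutput", false);
joined = [squares{:}];
```

So the option only spares you the joining step when each call returns a scalar; for vector results the concatenation with [squares{:}] is still needed.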

> 2) If you have a "rest vector" simply call the function with this rest vector 
> as input in a single core. Append the rest vector to your output vector.

"pararrayfun" should also take care of this automatically.
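
If you prefer to stay with "parcellfun" and chunked input, one possible way to avoid a separate pass over the rest vector is to let "mat2cell" build chunks of unequal length. A sketch with made-up data and NOP:

```octave
% Hypothetical data whose length is not a multiple of NOP.
YOURINPUTDATA = 1:14;
NOP = 4;
n = numel (YOURINPUTDATA);
le = floor (n / NOP);

% Start with le elements per chunk, then hand out the remainder
% one element at a time to the first chunks.
sizes = le * ones (1, NOP);
r = rem (n, NOP);
sizes(1:r) = sizes(1:r) + 1;

% The sizes now add up to n, so no element is left over.
chunks = mat2cell (YOURINPUTDATA, 1, sizes);
```

The resulting cell array can then be passed to "parcellfun" in place of INPUTVECT.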

> 3) Instead of passing the data itself to YOURFUNCTION you can define the data 
> as "global" and then pass only the indices of the data you want to process in 
> each core... I believe this should be faster.

I'm not sure whether this trick will produce any improvement at all.
"parcellfun" works by spawning as many copies of the Octave process as required
by its first parameter, so global data will also need to be copied when "fork" 
is invoked.

> 4) Probably you already know this but I had to learn it the hard way: 
> sometimes you make your program to run faster by simply coding more 
> efficiently than by using many cores. Check first if you can reduce for loops 
> to matrix algebra, double for loops to single loops, etc.

+1 on this suggestion. The following link may be useful for this purpose:
  
http://www.gnu.org/software/octave/doc/interpreter/Vectorization-and-Faster-Code-Execution.html#Vectorization-and-Faster-Code-Execution
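
As a small illustration of the kind of rewrite meant here, the sketch below computes the same result with a for-loop and with a single vectorized expression. The formula and the data are made up:

```octave
% Hypothetical input: 1000 sample points.
x = linspace (0, 1, 1000);

% Loop version: one element per iteration.
y_loop = zeros (size (x));
for k = 1:numel (x)
  y_loop(k) = x(k)^2 + 3*x(k);
end

% Vectorized version: element-wise operators act on the whole array.
y_vec = x.^2 + 3*x;

% Largest difference between the two results.
disp (max (abs (y_loop - y_vec)))
```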
  

> I hope this help.


HTH,
c.
