help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: indexing expression performance


From: Francesco Potortì
Subject: Re: indexing expression performance
Date: Tue, 13 Jan 2009 22:46:58 +0100

>I've noticed that indexing expressions on large arrays can be quite slow,

Have a look at this, which I wrote in November but for some reason does
not appear in the list archives.  In an application of mine, I wrote an
ad hoc function to access a 5-dim matrix using the last method below,
getting a significant speedup.  I usually run 3.0.1, and I did not make
any check to compare its speed with the more recent Octave versions.

|Date: 11 Nov 2008 10:44:45 +0100
|From: Francesco Potortì <address@hidden>
|To: Jaroslav Hajek <address@hidden>
|CC: Octave help list <address@hidden>
|Subject: Re: slow access to consecutive matrix elements
|
|>On Sat, Oct 11, 2008 at 10:22 AM, Francesco Potorti`
|><address@hidden> wrote:
|>> This is a real life example that demonstrates how access to matrix
|>> elements is not optimised for the case of complete ranges in the first
|>> dimensions.  I am sending it here and not to the bug list because this
|>> example may be useful to others, at least until this cases is optimised
|>> in the sources.
|>>
|>> # This is a big 5-dim matrix, about 100MB size
|>> octave> kk=rand(156,222,1,44,8);
|>>
|>> # I access a slice of it, which is sequential in memory
|>> octave> t=cputime; for ii=1:44, for jj=1:8
|>>  mm=kk(:,:,:,ii,jj); endfor, endfor, cputime-t
|>> ans =  5.9124
|>>
|>> # Using a linear index is much faster
|>> octave> span=(1:156*222);
|>>        t=cputime; for ii=1:44, for jj=1:8
|>>  mm=kk(sub2ind(size(kk),1,1,1,ii,jj)+span-1); endfor, endfor, cputime-t
|>> ans =  2.6642
|>>
|>> # Removing the call to sub2ind reaches top speed
|>> octave> cp=[1,cumprod(size(kk)(1:end-1))]; span=(1:156*222);
|>>        t=cputime; for ii=1:44, for jj=1:8
|>>  mm=kk(sum(([1,1,1,ii,jj]-1).*cp)+span); endfor, endfor, cputime-t
|>> ans =  1.4001
|>>
|>> Still, I wish access were faster yet.  Is there a reason why copying
|>> consecutive memory is so slow?  I wish I could help with optimising
|>> this, even if I am certainly not the most qualified person to do it.
|>>
|>
|>I guess the indexing routines do deserve some attention w.r.t
|>performance. Reducing code duplication would also be nice. I have this
|>on my TODO list, but I don't think it's a good idea to do it before
|>3.2.x is forked, as such changes are, IMO, likely to introduce bugs.
|
|Follwoing up on this, I realised that there is room for further
|significant speedup:
|
|# Be careful to not convert ranges into matrices
|octave> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222;
|      t=cputime; for ii=1:44, for jj=1:8
|base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len); endfor, endfor, 
cputime-t
|ans =  0.26402
|
|The fact is, I was discounting the fact that a range remains a range
|even after linear transformation, while this does not appear to be the
|case:
|
|octave3.1> typeinfo(1:4)
|ans = range
|octave3.1> typeinfo(4+(1:4))
|ans = matrix
|octave3.1> typeinfo(4*(1:4))
|ans = matrix
|
|>From the depth of my ignorance about Octave's internals, I dare say that
|it should not be too difficult to keep ranges as ranges even after sum
|or product with a scalar.  Maybe even after sum with a range with the
|same number of elements.  Am I wrong?
|
|-- 
|Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
|ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
|via G. Moruzzi 1, I-56124 Pisa         Email: address@hidden
|(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/
|
|


reply via email to

[Prev in Thread] Current Thread [Next in Thread]