[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: implement a SOR like routine efficiently
From: |
Francesco Potorti` |
Subject: |
Re: implement a SOR like routine efficiently |
Date: |
Tue, 16 Nov 2004 07:38:34 -0600 |
>I need to implement the following loop efficiently:
>
>for i = 2:n-1
> for j = 2:n-1
> a(i,j) = a(i-1,j) + a(i+1,j) + a(i,j+1) + a(i,j-1)
> end
>end
>
>The loop is not vectorizable, as it has a carried dependence.
Are you sure you in fact want the carried dependence?
If not, it should be easily vectorizable like this:
a(2:n-1,2:n-1)=a(1:n-2,2:n-1)+a(3:n,2:n-1)+a(2:n-1,1:n-2)+a(2:n-1,3:n)
If yes, you can vectorize the columns by noting that:
a(2,j)=a(2,j-1)+a(2,j+1)+a(1,j)+a(3,j)
a(3,j)=a(3,j-2)+a(3,j+2)+a(1,j)+a(3,j)+a(4,j)
a(4,j)=a(4,j-3)+a(4,j+3)+a(1,j)+a(3,j)+a(4,j)+a(5,j)
a(5,j)=a(5,j-4)+a(5,j+4)+a(1,j)+a(3,j)+a(4,j)+a(5,j)+a(6,j)
a(6,j)=a(6,j-5)+a(6,j+5)+a(1,j)+a(3,j)+a(4,j)+a(5,j)+a(6,j)+a(7,j)
that is:
a(2:n-1,j)=a(2:n-1,j-1)+a(2:n-1,j+1)+cumsum(a(3:n,j))
This way you need a single loop on the columns, which should be a big
improvement. Maybe, by similar reasoning, you could even vectorise it
row-wise and avoid all loops, but I did not try.
Please note that all the above is untested, so I may have made mistakes
both in the reasoning and in the implementation.
--
Francesco Potortì (ricercatore) Voice: +39 050 315 3058 (op.2111)
ISTI - Area della ricerca CNR Fax: +39 050 313 8091
via G. Moruzzi 1, I-56124 Pisa Email: address@hidden
Web: http://fly.isti.cnr.it/ Key: fly.isti.cnr.it/public.key
-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.
Octave's home on the web: http://www.octave.org
How to fund new projects: http://www.octave.org/funding.html
Subscription information: http://www.octave.org/archive.html
-------------------------------------------------------------