help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Performance optimization (allocation inside a for loop)


From: Jaroslav Hajek
Subject: Re: Performance optimization (allocation inside a for loop)
Date: Tue, 7 Apr 2009 07:25:02 +0200

On Sat, Apr 4, 2009 at 8:47 AM, Jaroslav Hajek <address@hidden> wrote:
> On Thu, Apr 2, 2009 at 10:06 PM, John W. Eaton <address@hidden> wrote:
>> On  2-Apr-2009, r wrote:
>>
>> | I think chunks of ~5% of the structure size (perhaps more for smaller
>> | arrays) would be affordable by anyone and would effectively fix these
>> | allocation issues in practical applications. Whether it is worth the
>> | effort is a different question.
>>
>> If you think this is important, how about submitting a patch?
>>
>> jwe
>>
>
> I gave the idea a second thought - maybe it has some merits. Sometimes
> you really want to use an array as a stack in Octave to avoid going
> through a loop twice, so Octave can try to optimize such usage.
>
> For everyone interested, try the attached patch.
>
> Summary: The operations
>
> array(end+1) = something (push)
>
> and
>
> array(end) = []  (pop)
>
> are optimized (the usage of "end" is not necessary), reusing the
> existing mechanism for shallow slices.
> When a "push" operation is detected, existing over-allocated space
> will be reused if possible, otherwise a new array will be
> over-allocated twice or by 1024 elements, whichever is smaller. The
> "pop" operation simply adjusts the array size.
> Since the existing shallow slicing mechanism is reused, no extra data
> is introduced. Also, economization will happen as with orphaned
> shallow slices, so that the array will be automatically stripped off
> any possible over-allocated space when you store it into a variable,
> cell or struct element, or return it from a function.
> Hence the dangling memory will normally only exist locally.
> To facilitate this change, economization no longer happens on every
> assignment (which is probably not important anyway).
>
> Here's a simple benchmark:
>
> tic;
> for i = 1:1e5
>  a(i) = i;
> endfor
> toc
> a = [];
> tic;
> for i = 1:1e5
>  a(i) = i;
>  a(i) = [];
>  a(i) = i+1;
> endfor
> toc
>
> with recent tip:
>
> Elapsed time is 11.0845 seconds.
> Elapsed time is 16.2406 seconds.
>
> with current patch:
>
> Elapsed time is 0.380752 seconds.
> Elapsed time is 1.01877 seconds.
>
> a final question: maybe the maximum over-allocation size should be 
> configurable?
>
> enjoy!
>

Since noone else commented, I pushed the changeset.

regards

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz



reply via email to

[Prev in Thread] Current Thread [Next in Thread]