[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Performance optimization (allocation inside a for loop)
From: |
Jaroslav Hajek |
Subject: |
Re: Performance optimization (allocation inside a for loop) |
Date: |
Tue, 7 Apr 2009 07:25:02 +0200 |
On Sat, Apr 4, 2009 at 8:47 AM, Jaroslav Hajek <address@hidden> wrote:
> On Thu, Apr 2, 2009 at 10:06 PM, John W. Eaton <address@hidden> wrote:
>> On 2-Apr-2009, r wrote:
>>
>> | I think chunks of ~5% of the structure size (perhaps more for smaller
>> | arrays) would be affordable by anyone and would effectively fix these
>> | allocation issues in practical applications. Whether it is worth the
>> | effort is a different question.
>>
>> If you think this is important, how about submitting a patch?
>>
>> jwe
>>
>
> I gave the idea a second thought - maybe it has some merits. Sometimes
> you really want to use an array as a stack in Octave to avoid going
> through a loop twice, so Octave can try to optimize such usage.
>
> For everyone interested, try the attached patch.
>
> Summary: The operations
>
> array(end+1) = something (push)
>
> and
>
> array(end) = [] (pop)
>
> are optimized (the usage of "end" is not necessary), reusing the
> existing mechanism for shallow slices.
> When a "push" operation is detected, existing over-allocated space
> will be reused if possible, otherwise a new array will be
> over-allocated twice or by 1024 elements, whichever is smaller. The
> "pop" operation simply adjusts the array size.
> Since the existing shallow slicing mechanism is reused, no extra data
> is introduced. Also, economization will happen as with orphaned
> shallow slices, so that the array will be automatically stripped off
> any possible over-allocated space when you store it into a variable,
> cell or struct element, or return it from a function.
> Hence the dangling memory will normally only exist locally.
> To facilitate this change, economization no longer happens on every
> assignment (which is probably not important anyway).
>
> Here's a simple benchmark:
>
> tic;
> for i = 1:1e5
> a(i) = i;
> endfor
> toc
> a = [];
> tic;
> for i = 1:1e5
> a(i) = i;
> a(i) = [];
> a(i) = i+1;
> endfor
> toc
>
> with recent tip:
>
> Elapsed time is 11.0845 seconds.
> Elapsed time is 16.2406 seconds.
>
> with current patch:
>
> Elapsed time is 0.380752 seconds.
> Elapsed time is 1.01877 seconds.
>
> a final question: maybe the maximum over-allocation size should be
> configurable?
>
> enjoy!
>
Since noone else commented, I pushed the changeset.
regards
--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
- Re: Performance optimization (allocation inside a for loop), (continued)
- Re: Performance optimization (allocation inside a for loop), Francesco Potorti`, 2009/04/02
- Re: Performance optimization (allocation inside a for loop), r, 2009/04/02
- Re: Performance optimization (allocation inside a for loop), John W. Eaton, 2009/04/02
- Re: Performance optimization (allocation inside a for loop), r, 2009/04/02
- Re: Performance optimization (allocation inside a for loop), Jaroslav Hajek, 2009/04/04
- Re: Performance optimization (allocation inside a for loop), r, 2009/04/04
- Re: Performance optimization (allocation inside a for loop), Jaroslav Hajek, 2009/04/04
- Re: Performance optimization (allocation inside a for loop), Jaroslav Hajek, 2009/04/04
- Re: Performance optimization (allocation inside a for loop),
Jaroslav Hajek <=
- Re: Performance optimization (allocation inside a for loop), Francesco Potorti`, 2009/04/07
- Re: Performance optimization (allocation inside a for loop), Jaroslav Hajek, 2009/04/08
- Re: Performance optimization (allocation inside a for loop), Francesco Potorti`, 2009/04/08
- Re: Performance optimization (allocation inside a for loop), Jaroslav Hajek, 2009/04/03