From: Paul Thomas
Subject: Re: Slowup in 2.1.54
Date: Thu, 19 Feb 2004 07:03:11 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225
David,

Yes, I am very impressed by your efforts too. The FFT, SORT and RAND upgrades that you all have put together are superb.
The point that I was making, in effect, was that the benchmarks should be "optimised" in each of the languages that are being compared; otherwise they are meaningless. For example, some languages might use the Standard Library strategy of allocating more space than is called for when constructing a new container. Octave does this sometimes; e.g.

  j = 1e4; tic; x = []; for i = 1:j; x(i) = i; endfor; toc

changes its spots altogether when j > 1e4 (I haven't checked in the source, but I presume that a reserve of about 8e4 bytes is being allocated?).
Using the [] operator to build up arrays is bad news in Matlab and Octave, for reasons that you know and understand. I have no idea what hidden wrinkles there are in the other languages in the benchmark. That was the first thing that occurred to me when I first saw the benchmarks.
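A minimal sketch of that comparison, assuming a reasonably recent Octave; the sizes in `sizes` are arbitrary choices for illustration, not figures from the benchmark. It fills the same row vector once by concatenation and once by indexed stores into a preallocated vector, so the two timings can be compared as j grows.

```octave
% Illustrative script: the sizes below are assumptions, not benchmark values.
sizes = [1e3, 2e3, 4e3];
for j = sizes
  % Growth by concatenation: every step builds a new, longer vector.
  tic;
  x = [];
  for i = 1:j
    x = [x, i];
  endfor
  t_concat = toc;

  % Preallocation: allocate once, then store by index.
  tic;
  y = zeros(1, j);
  for i = 1:j
    y(i) = i;
  endfor
  t_prealloc = toc;

  assert(isequal(x, y));  % both loops must produce the vector 1:j
  printf("j = %5d  concat = %8.4f s  prealloc = %8.4f s\n", j, t_concat, t_prealloc);
endfor
```

If the concatenation column roughly quadruples each time j doubles while the preallocation column roughly doubles, that would be consistent with square-law versus linear scaling; the absolute numbers will depend on the Octave version and the machine.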
Regards

Paul

David Bateman wrote:
> Paul,
>
> Basically, I know you shouldn't write code in the way I used to test the problem, and so this might not practically have resulted in any problems (except for sylvester_matrix). My main concern recently has been to try and improve the benchmarked performance of Octave, and one benchmark that Octave performed very badly on can be found at http://www.sciviews.org/other/benchmark.htm
>
> As the author of these benchmarks states, some of these tests have been written badly to torture-test the interpreter. This is particularly true of the last two tests, where
>
>   tic; x = []; for i=1:1e3; x = [x, i]; endfor; toc
>
> is a fair summary of one of them. So since this is one of the metrics for their test, it would have been a pity to see a factor of 10 slow-up in Octave.
>
> In any case, this benchmark pretty much explains much of my recent contributions: the stuff I did on sort and randn recently in octave-forge, the eigenvalue patch that doesn't calculate eigenvectors if you don't want them, and the FFT stuff. With these changes, and the fact that 2.1.42 didn't properly use LAPACK/ATLAS, where Octave does now, I estimate the final total in this benchmark for the CVS version of Octave to have come down from 27.76 to about 15 or 16. Whether this reflects the speedup a real user will see is more questionable, however....
>
> Regards
> David
>
> According to Paul Thomas <address@hidden> (on 18/02/2004):
>> David and Dmitri,
>>
>> Whilst it is a bit at right angles to the discussion, because the fault with the [] operator was fixed, should we really worry about constructs like
>>
>>   j = 1e4; tic; x = []; for i=1:j; x = [x, i]; endfor; toc
>>
>> that have square-law scaling with the length of i?
>>
>>   j = 1e4; tic; x = zeros(1, j); for i = 1:j; x(i) = i; endfor; toc
>>
>> is 30 times faster, at j=1e4, and is linear in j.
>>
>> Dmitri A. Sergatskov wrote:
>>> David Bateman wrote:
>>>> The gcd stuff might just be a timing error. The code used for the timing in the sciview tests is pretty rough, using tic/toc. I've replaced this in my version using cputime instead. and get the follow
>>>
>>> cputime does not work with pthreads, so I have to use tic/toc. Here is another example of recursion slowdown:
>>>
>>> 2.1.53:
>>>   tic; for n=1:1000; bm_x=sylvester_matrix(7) ; endfor ; toc
>>>   ans = 4.2025
>>> 2.1.54:
>>>   tic; for n=1:1000; bm_x=sylvester_matrix(7) ; endfor ; toc
>>>   ans = 11.574
>>>
>>> (those are both stock releases, no patches for SMP/pthreads etc...; --enable-shared --disable-static ; -O3 -march=athlon-mp; AthlonMP x2)
>>>
>>>> So the only significant slow up I see is in the last test. Here is another interesting set of tests
>>>>
>>>> 2.1.50
>>>>   tic; x = []; for i=1:1e3; x = [x, i]; endfor; toc
>>>>   ans = 0.079843
>>>>   tic; x=0; for i=1:1e3; x++; endfor; toc
>>>>   ans = 0.018920
>>>
>>> I do not trust tic/toc numbers less than 0.1 sec. So I increased the index:
>>>
>>> 2.1.53:
>>>   tic; x = []; for i=1:1e4; x = [x, i]; endfor; toc
>>>   ans = 3.2951
>>> 2.1.54:
>>>   tic; x = []; for i=1:1e4; x = [x, i]; endfor; toc
>>>   ans = 20.234
>>>
>>> 2.1.53:
>>>   tic; x=0; for i=1:1e4; x++; endfor; toc
>>>   ans = 0.024073
>>>   tic; x=0; for i=1:1e5; x++; endfor; toc
>>>   ans = 0.24130
>>>   tic; x=0; for i=1:1e6; x++; endfor; toc
>>>   ans = 2.3381
>>> 2.1.54:
>>>   tic; x=0; for i=1:1e4; x++; endfor; toc
>>>   ans = 0.028044
>>>   tic; x=0; for i=1:1e5; x++; endfor; toc
>>>   ans = 0.26263
>>>   tic; x=0; for i=1:1e6; x++; endfor; toc
>>>   ans = 2.6506
>>>
>>> (I do not understand it, but my old records show that loops were slower; it seems that something else in my system changed that sped it up.)
>>>
>>>> Regards
>>>> David
>>>
>>> Sincerely,
>>> Dmitri.
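One way to act on the remark about not trusting tic/toc readings below 0.1 sec is to repeat a short snippet until the total elapsed time passes that threshold and then report the mean per run. The sketch below assumes a recent Octave with anonymous function handles (not available in the 2.1.x releases discussed here); `time_per_run` and `grow_by_concat` are hypothetical helper names, not existing Octave functions.

```octave
1;  % make this file a script so the functions below can be defined inline

% Hypothetical helper: call f repeatedly until at least min_time seconds
% have elapsed in total, then return the mean wall-clock time per call.
function t = time_per_run(f, min_time)
  reps = 0;
  tic;
  do
    f();
    reps++;
    elapsed = toc;
  until (elapsed >= min_time)
  t = elapsed / reps;
endfunction

% Hypothetical workload: the concatenation loop from the thread.
function x = grow_by_concat(n)
  x = [];
  for i = 1:n
    x = [x, i];
  endfor
endfunction

% 0.1 s is the threshold mentioned in the thread.
t = time_per_run(@() grow_by_concat(1e3), 0.1);
printf("mean time per run: %g s\n", t);
```

This still measures wall-clock time, so it smooths out timer resolution but not load from other processes; raising the size of the loop, as was done in the thread, remains the simpler check.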