From: Tatsuro MATSUOKA
Subject: Re: Multithreaded Atlas can be used by GCC-4.3.3-sjlj-mingw-TDM (Was Re: Sjlj vs dwarf2 on mingw for octave)
Date: Mon, 30 Mar 2009 18:43:54 +0900 (JST)
Hello
> If I use the dw2 version of gcc-4.3.3, griddata3.m cannot be executed due to
> a pthread mutex error.
> (I do not remember the details.)
I found the relevant comment in an Octave thread in Japan. The error was:
assertion !pthread_mutex_lock ( &(ROOT->mutex) ) failed, line 74 of file /home/atlas/atlas3.8.2-gcc4.3.3-2/../ATLAS3.8.2//src/pthreads/misc/ATL_signal_tree.c
At that time I used ATLAS 3.8.2, but the same error occurs with ATLAS 3.8.3.
This is perhaps because pthreads-win32 is built for gcc with sjlj EH, but I do
not have any evidence for that.
Regards
Tatsuro
--- Tatsuro MATSUOKA wrote:
> Hello
>
> I wrote previously:
> > Sjlj octave-3.0.4RC7
> > octave.exe:2> testOregoB
> > ans = 1.5469
> >
> > Dwarfs octave-3.0.4RC5
> > octave:3> testOregoB
> > ans = 1.1875
>
> The Octave interpreter built with dw2 EH is 25-30% faster than the one built
> with sjlj EH on mingw-GCC-4.3.3-TDM.
>
> However, I found an advantage of sjlj EH from the viewpoint of multithreaded
> ATLAS.
>
> With gcc-4.3.3 mingw (sjlj), I found that multithreaded ATLAS can be used
> without error when Octave is configured with the following:
>
> ../../octave-3.0.4RC7/configure --prefix=/c/Programs/octave-3.0.4RC7s
> --with-blas='-lptf77blas -latlas -lpthread'
>
> d:\usr\Tatsu\mingwhome\octaves\octave-3.0.4RC7\scripts/geometry\griddata3.m
> PASS 2/2
>
> If I use the dw2 version of gcc-4.3.3, griddata3.m cannot be executed due to
> a pthread mutex error.
> (I do not remember the details.)
>
> Speed check (multithreaded ATLAS):
> n=2000; A=randn(n); B=randn(n);tic; C=A*B; t=toc, MFLOPS=2*n^3/t*1e-6
>
> MFLOPS = 4970.9 (Ht-Pentium-4 prescott 3.4GHz)
>
> On the other hand, with single-threaded ATLAS built by GCC-4.3.3-dw2 on the
> same computer:
> n=2000; A=randn(n); B=randn(n);tic; C=A*B; t=toc, MFLOPS=2*n^3/t*1e-6
>
> MFLOPS = 4830.2
>
> The multithreaded ATLAS is slightly faster than the single-threaded ATLAS,
> even though there is only one physical core with hyper-threading.
> (GotoBLAS with SMP enabled is slower than single-threaded on my computer.)
>
> If multithreaded ATLAS is used on a Core2Duo and similar CPUs, sjlj EH may
> have an advantage.
>
> Of course, data should be collected on other computers.
> I will distribute the dependency library kit for mingw on my website in the
> near future.
>
> Similar experiments on Unix would also be useful for users who want to
> perform fast matrix calculations for large n (n x n).
>
> Regards
>
> Tatsuro
>
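The figures quoted above can be cross-checked with a little arithmetic. The
following is a minimal Python sketch, not part of the original measurements:
it inverts the MFLOPS formula from the benchmark line to recover the elapsed
times and compares the two ATLAS builds. The function names
(elapsed_seconds, percent_higher) are made up for illustration, and treating
the testOregoB results as elapsed times is an assumption, since the post does
not state their unit.

```python
def elapsed_seconds(n, mflops):
    """Invert MFLOPS = 2*n^3 / t * 1e-6 to recover the elapsed time t."""
    return 2 * n**3 * 1e-6 / mflops


def percent_higher(a, b):
    """Return how much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


N = 2000                    # matrix size used in the benchmark line
MFLOPS_MT_SJLJ = 4970.9     # multithreaded ATLAS, gcc-4.3.3 sjlj
MFLOPS_ST_DW2 = 4830.2      # single-threaded ATLAS, gcc-4.3.3 dw2

t_mt = elapsed_seconds(N, MFLOPS_MT_SJLJ)
t_st = elapsed_seconds(N, MFLOPS_ST_DW2)
atlas_gain = percent_higher(MFLOPS_MT_SJLJ, MFLOPS_ST_DW2)

# Assumption: the testOregoB answers are elapsed times, so a smaller
# value means a faster interpreter.
T_SJLJ = 1.5469
T_DW2 = 1.1875
interp_gain = percent_higher(T_SJLJ, T_DW2)

print(f"multithreaded A*B:   {t_mt:.2f} s")
print(f"single-threaded A*B: {t_st:.2f} s")
print(f"multithreaded ATLAS throughput is {atlas_gain:.1f}% higher")
print(f"sjlj interpreter run takes {interp_gain:.1f}% longer than dw2")
```

Under these assumptions the matrix product takes a little over 3 seconds in
both cases and the multithreaded build is ahead by only about 3%, which is
small enough that repeated runs would be needed before drawing conclusions.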