help-octave
Re: Octave 3.6.4 VS2010 and the C++ API


From: Michael Goffioul
Subject: Re: Octave 3.6.4 VS2010 and the C++ API
Date: Thu, 26 Sep 2013 12:50:43 -0400

On Thu, Sep 26, 2013 at 11:32 AM, Michael Goffioul <address@hidden> wrote:


---------- Forwarded message ----------
From: Michael Goffioul <address@hidden>
Date: Thu, Sep 26, 2013 at 11:31 AM
Subject: Re: Octave 3.6.4 VS2010 and the C++ API
To: Mike Puglia <address@hidden>





On Thu, Sep 26, 2013 at 8:04 AM, Mike Puglia <address@hidden> wrote:
Okay, SSE3 works for me.  Thanks again for all the help.

Mike


From: Michael Goffioul <address@hidden>
To: Mike Puglia <address@hidden>
Cc: "address@hidden" <address@hidden>
Sent: Thursday, September 26, 2013 8:15 PM

Subject: Re: Octave 3.6.4 VS2010 and the C++ API

On Thu, Sep 26, 2013 at 5:20 AM, Mike Puglia <address@hidden> wrote:
Thanks again for all the help.  I tried all of the libraries, and the problem seems to be limited to OpenBLAS.  I've now tried this on three machines (running XP or Windows Server 2008), so it seems to be universal.  The other three BLAS libraries in the distribution (generic, Intel SSE3 and Intel SSE3 multi-threaded) all calculated correctly, but at a cost in performance: the hit was about 25% with the Intel libraries and about 2x with the generic library on my machine.  Most of my work involves runs that take on the order of 1 or 2 days, so this is pretty costly.
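
For anyone who wants to repeat the "calculated correctly" check on a given BLAS DLL, one option is to compare the library's matrix product against a plain reference multiplication. The following is only a minimal sketch, assuming column-major storage (the layout BLAS and Octave use); `reference_gemm` and `max_abs_diff` are hypothetical helper names and are not part of Octave, OpenBLAS or ATLAS.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Reference C = A * B for column-major matrices.
// A is m x k, B is k x n, the returned C is m x n.
// Hypothetical helper for illustration only.
std::vector<double> reference_gemm(const std::vector<double>& A,
                                   const std::vector<double>& B,
                                   std::size_t m, std::size_t k, std::size_t n)
{
    std::vector<double> C(m * n, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t p = 0; p < k; ++p) {
            const double b = B[p + j * k];
            for (std::size_t i = 0; i < m; ++i)
                C[i + j * m] += A[i + p * m] * b;
        }
    }
    return C;
}

// Largest absolute element-wise difference between two results.
// Returns infinity when the sizes differ, so a shape mismatch never
// passes as "equal".
double max_abs_diff(const std::vector<double>& X, const std::vector<double>& Y)
{
    if (X.size() != Y.size())
        return std::numeric_limits<double>::infinity();
    double worst = 0.0;
    for (std::size_t i = 0; i < X.size(); ++i)
        worst = std::max(worst, std::fabs(X[i] - Y[i]));
    return worst;
}
```

The result obtained through the library under test would be passed as the second argument to `max_abs_diff`. A difference far above rounding level for the matrix sizes involved points to a wrong result rather than ordinary floating-point noise.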

On a somewhat related note, I'm also having problems running OpenBLAS on Amazon EC2.  It seems that the virtual machines there don't support AVX, so OpenBLAS falls back to a kernel that runs without AVX (Nehalem?), which causes a performance hit of about 2x.
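
Whether a given machine or virtual machine actually reports AVX can be checked with a few lines of code. This is a rough sketch, assuming a GCC or Clang toolchain on x86 (it relies on the compiler built-ins `__builtin_cpu_init` and `__builtin_cpu_supports`, which MSVC does not provide); `cpu_reports_avx` and `avx_summary` are hypothetical names, and the check is independent of how OpenBLAS itself selects its kernels.

```cpp
#include <string>

// Returns true when the compiler-provided CPU probe reports AVX.
// Assumption: GCC or Clang targeting x86; on any other toolchain or
// architecture this sketch simply reports false.
bool cpu_reports_avx()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                       // initialise the CPU feature data
    return __builtin_cpu_supports("avx") != 0;  // non-zero when AVX is reported
#else
    return false;
#endif
}

// Human-readable form of the same check, e.g. for a log line.
std::string avx_summary()
{
    return cpu_reports_avx() ? "AVX reported: yes" : "AVX reported: no";
}
```

Running this on both the local machine and the EC2 instance would show whether the two environments differ in what they report.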

Do you see any alternatives to using OpenBLAS here (ATLAS maybe?) or can you suggest a fix?  Any advice you have would be greatly appreciated.

I'll have to dig into the OpenBLAS code and fix the issue, but I'm pretty sure this is a calling convention mismatch between GCC and MSVC. In the meantime, you can use the "Intel SSE3" or "Intel SSE3 multi-threaded" libraries, both of which are ATLAS builds. I don't have any other fix to suggest.

Michael.


For the record, I'm wondering whether the problem is related to this bug fix (fixed in 0.2.4, but my binaries are using 0.2.2 or older):



I can confirm that the above link is the actual problem. It's basically a calling convention problem: the ASM code of OpenBLAS assumes the default convention used in GCC < 4.7, so when the C part of OpenBLAS is compiled with GCC >= 4.7, you end up with stack corruption. You're just lucky it didn't lead to a crash (which is usually what happens in such a scenario).

The good news is that upgrading to OpenBLAS >= 0.2.4 should solve the problem. I'll try to produce a new OpenBLAS DLL in the coming days. I'll keep you posted.

Michael.

