From: Ricardo Wurmus
Subject: Re: OpenBLAS and performance
Date: Wed, 20 Dec 2017 21:00:46 +0100
User-agent: mu4e 1.0-alpha3; emacs 25.3.1

Pjotr Prins <address@hidden> writes:

>> > If I compile for a target it
>> > makes a large difference.
>>
>> The FAQ document[1] says this:
>>
>>    The environment variable which control the kernel selection is
>>    OPENBLAS_CORETYPE (see driver/others/dynamic.c) e.g. export
>>    OPENBLAS_CORETYPE=Haswell. And the function char*
>>    openblas_get_corename() returns the used target.
>>
>> [1]: https://github.com/xianyi/OpenBLAS/wiki/Faq
>>
>> Have you tried this and compared the performance?
>
> About 10x difference on 24+ cores for matrix multiplication (my
> version vs what comes with Guix).
>
> I do think we need to default to a conservative openblas for general
> use.  Question is how we make it fly on dedicated hardware.

Have you tried preloading the special library with LD_PRELOAD?

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
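
For anyone wanting to try the two suggestions above (OPENBLAS_CORETYPE and LD_PRELOAD) side by side, here is a rough Python sketch. It is only an illustration under some assumptions: the helper names `openblas_corename` and `tuned_env` are made up for this example, the library path in the usage note is hypothetical, and OPENBLAS_CORETYPE should only have an effect on an OpenBLAS built with runtime kernel selection (DYNAMIC_ARCH). The only OpenBLAS symbol used is `openblas_get_corename()`, as quoted from the FAQ.

```python
import ctypes
import ctypes.util
import os


def openblas_corename(library_path=None):
    """Return the kernel target reported by OpenBLAS, or None if unavailable.

    library_path is an optional explicit path to libopenblas; when omitted,
    the system's default lookup is used (an assumption about the setup).
    """
    path = library_path or ctypes.util.find_library("openblas")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        get_corename = lib.openblas_get_corename
    except (OSError, AttributeError):
        # Library missing, not loadable, or symbol not exported.
        return None
    get_corename.restype = ctypes.c_char_p  # char* openblas_get_corename()
    get_corename.argtypes = []
    return get_corename().decode()


def tuned_env(preload=None, coretype=None, base=None):
    """Build an environment dict for running a benchmark in a subprocess.

    preload  -- path to a target-specific libopenblas to put in LD_PRELOAD
    coretype -- value for OPENBLAS_CORETYPE, e.g. "Haswell"
    base     -- starting environment (defaults to a copy of os.environ)
    """
    env = dict(os.environ if base is None else base)
    if coretype:
        env["OPENBLAS_CORETYPE"] = coretype
    if preload:
        existing = env.get("LD_PRELOAD")
        # Put our library first so it wins symbol resolution.
        env["LD_PRELOAD"] = preload if not existing else preload + ":" + existing
    return env
```

One possible use is to pass `env=tuned_env(preload="/path/to/tuned/libopenblas.so")` to `subprocess.run` for the same matrix-multiplication benchmark, once with the preload and once without, and compare timings. Printing `openblas_corename()` inside the benchmark would show which kernel was actually picked.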