Questions tagged [blas]

The Basic Linear Algebra Subprograms are a standard set of interfaces for low-level vector and matrix operations commonly used in scientific computing.

194 votes
4 answers
52k views

What is the relation between BLAS, LAPACK and ATLAS

I don't understand how BLAS, LAPACK and ATLAS are related and how I should use them together! I have been looking through all of their manuals and I have a general idea of BLAS and LAPACK and how to ...
makhlaghi's user avatar
  • 3,946
155 votes
8 answers
61k views

How does BLAS get such extreme performance?

Out of curiosity I decided to benchmark my own matrix multiplication function versus the BLAS implementation... I was, to say the least, surprised at the result: Custom Implementation, 10 trials of ...
DeusAduro's user avatar
  • 6,071
141 votes
5 answers
95k views

How to check BLAS/LAPACK linkage in NumPy and SciPy?

I am building my numpy/scipy environment on blas and lapack, more or less following this walkthrough. When I am done, how can I check that my numpy/scipy functions really do use the previously ...
Woltan's user avatar
  • 13.8k
140 votes
3 answers
49k views

Why does multiprocessing use only a single core after I import numpy?

I am not sure whether this counts more as an OS issue, but I thought I would ask here in case anyone has some insight from the Python end of things. I've been trying to parallelise a CPU-heavy for ...
ali_m's user avatar
  • 73.1k
119 votes
5 answers
47k views

Benchmarking (python vs. c++ using BLAS) and (numpy)

I would like to write a program that makes extensive use of BLAS and LAPACK linear algebra functionalities. Since performance is an issue I did some benchmarking and would like to know if the approach I ...
Woltan's user avatar
  • 13.8k
86 votes
2 answers
3k views

Is armadillo solve() thread safe?

In my code I have a loop in which I construct an overdetermined linear system and try to solve it: #pragma omp parallel for for (int i = 0; i < n[0]+1; i++) { for (int j = 0; j < n[1]+1; j++...
maxdebayser's user avatar
  • 1,066
83 votes
10 answers
67k views

MatLab error: cannot open with static TLS

For a couple of days, I have constantly been receiving the same error while using MATLAB, which happens at some point with dlopen. I am pretty new to MATLAB, and that is why I don't know what to do. Google doesn'...
Hans Meyer's user avatar
76 votes
16 answers
64k views

TensorFlow: InternalError: Blas SGEMM launch failed

When I run sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) I get InternalError: Blas SGEMM launch failed. Here is the full error and stack trace: InternalErrorTraceback (most recent call ...
rafaelcosman's user avatar
  • 2,579
68 votes
1 answer
3k views

Distributing Cython based extensions using LAPACK

I am writing a Python module that includes Cython extensions and uses LAPACK (and BLAS). I am open to using either clapack or lapacke, or some kind of f2c or f2py solution if necessary. What is ...
jcrudy's user avatar
  • 4,011
66 votes
20 answers
78k views

TensorFlow: Blas GEMM launch failed

When I'm trying to use TensorFlow with Keras using the gpu, I'm getting this error message: C:\Users\nicol\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\__main__.py:2: UserWarning: Update ...
Nicolas's user avatar
  • 709
57 votes
3 answers
14k views

Purpose of LDA argument in BLAS dgemm?

The Fortran reference implementation documentation states: LDA - INTEGER. On entry, LDA specifies the first dimension of A as declared in the calling (sub) program. When ...
Setjmp's user avatar
  • 27.8k
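
For readers of the entry above, here is a minimal sketch of what a leading dimension means in practice. It assumes the usual BLAS column-major convention, where element (i, j) of a matrix with leading dimension lda sits at flat offset i + j*lda. The helper names `element` and `submatrix` and the sample data are hypothetical, not taken from the question or from any BLAS library.

```python
def element(buf, lda, i, j):
    """Return A[i, j] from a column-major buffer with leading dimension lda."""
    return buf[i + j * lda]


def submatrix(buf, lda, row0, col0, m, n):
    """Read the m x n block whose top-left corner is (row0, col0).

    The block shares the parent's buffer and the parent's lda; only the
    starting offset changes. This is why a routine needs lda separately
    from the number of rows it actually operates on.
    """
    start = row0 + col0 * lda
    return [[buf[start + i + j * lda] for j in range(n)] for i in range(m)]


# A 4 x 3 parent matrix stored column by column; entry (i, j) holds 10*j + i.
rows, cols = 4, 3
parent = [10 * j + i for j in range(cols) for i in range(rows)]
lda = rows  # first dimension of the parent as declared

block = submatrix(parent, lda, 1, 1, 2, 2)
print(block)  # [[11, 21], [12, 22]]
```

The 2 x 2 block has only two rows, yet stepping from one column to the next still takes lda = 4 elements, because the block lives inside the larger array.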
54 votes
3 answers
49k views

Compiling numpy with OpenBLAS integration

I am trying to install numpy with OpenBLAS; however, I am at a loss as to how the site.cfg file needs to be written. When the installation procedure was followed the installation completed without ...
Vijay's user avatar
  • 849
50 votes
4 answers
36k views

multithreaded blas in python/numpy

I am trying to implement a large number of matrix-matrix multiplications in Python. Initially, I assumed that NumPy would automatically use my threaded BLAS libraries since I built it against those ...
Lucas's user avatar
  • 938
46 votes
1 answer
7k views

Keras not using multiple cores

Based on the famous check_blas.py script, I wrote this one to check that theano can in fact use multiple cores: import os os.environ['MKL_NUM_THREADS'] = '8' os.environ['GOTO_NUM_THREADS'] = '8' os....
Herbert's user avatar
  • 5,431
40 votes
3 answers
36k views

Find out if/which BLAS library is used by Numpy

I use numpy and scipy in different environments (MacOS, Ubuntu, RedHat). Usually I install numpy by using the package manager that is available (e.g., mac ports, apt, yum). However, if you don't ...
Apoptose's user avatar
  • 579
39 votes
6 answers
13k views

Linking Intel's Math Kernel Library (MKL) to R on Windows

Using an alternative BLAS for R has several advantages, see e.g. https://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf. Microsoft R Open https://mran.revolutionanalytics.com/documents/rro/...
majom's user avatar
  • 7,911
28 votes
3 answers
14k views

BLAS: gemm vs. gemv

Why does BLAS have a gemm function for matrix-matrix multiplication and a separate gemv function for matrix-vector multiplication? Isn't matrix-vector multiplication just a special case of matrix-...
dsimcha's user avatar
  • 68k
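
The special case mentioned in the entry above can be sketched in a few lines. This is a pure-Python illustration, not a call into any BLAS library. The functions `matmul` and `matvec` are hypothetical stand-ins for gemm-like and gemv-like operations, and the sketch does not try to explain why BLAS keeps the two routines separate.

```python
def matmul(a, b):
    """Naive dense product of two row-major nested lists (gemm-like)."""
    inner = len(b)
    ncols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(ncols)]
        for row in a
    ]


def matvec(a, x):
    """Naive dense matrix-vector product (gemv-like)."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [1.0, -1.0]

as_gemv = matvec(a, x)
# Treat x as a 2 x 1 matrix, multiply, then read back the single column.
as_gemm = [row[0] for row in matmul(a, [[v] for v in x])]

print(as_gemv)  # [-1.0, -1.0, -1.0]
print(as_gemm)  # [-1.0, -1.0, -1.0]
```

Both paths give the same numbers. Any difference between the real routines would come from how an implementation is tuned, which this sketch does not model.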
28 votes
2 answers
9k views

Multi-threaded integer matrix multiplication in NumPy/SciPy

Doing something like import numpy as np a = np.random.rand(10**4, 10**4) b = np.dot(a, a) uses multiple cores, and it runs nicely. The elements in a, though, are 64-bit floats (or 32-bit in 32-bit ...
étale-cohomology's user avatar
27 votes
5 answers
6k views

Fastest way to negate a std::vector

Assume I have a std::vector of double, namely std::vector<double> MyVec(N); Where N is so big that performance matters. Now assume that MyVec is a nontrivial vector (i.e. it is not a vector of ...
enanone's user avatar
  • 949
27 votes
6 answers
7k views

Running Scipy on Heroku

I got Numpy and Matplotlib running on Heroku, and I'm trying to install Scipy as well. However, Scipy requires BLAS[1] to install, which is not present on the Heroku platform. After contacting ...
Joseph Chang's user avatar
27 votes
3 answers
8k views

calling dot products and linear algebra operations in Cython?

I'm trying to use dot products, matrix inversion and other basic linear algebra operations that are available in numpy from Cython. Functions like numpy.linalg.inv (inversion), numpy.dot (dot product),...
26 votes
4 answers
10k views

R detection of Blas version

Is there a way of detecting the version of BLAS that R is using from inside R? I am using Ubuntu, and I have a couple of BLAS versions installed - I just don't know which one is "active" from R's ...
Sean's user avatar
  • 3,885
24 votes
2 answers
21k views

Link ATLAS/MKL to an installed Numpy

TL;DR: how to link ATLAS/MKL to an existing Numpy without rebuilding. I have used Numpy for calculations on a large matrix and I found that it is very slow because Numpy only uses 1 core to do the calculation....
tndoan's user avatar
  • 653
24 votes
5 answers
27k views

How to make sure the numpy BLAS libraries are available as dynamically-loadable libraries?

The theano installation documentation states that theano will by default use the BLAS libraries from numpy, if the "BLAS libraries are available as dynamically-loadable libraries". This seems not ...
Framester's user avatar
  • 34.3k
24 votes
1 answer
2k views

Implementing faster python inner product with BLAS

I found this useful tutorial on using low-level BLAS functions (implemented in Cython) to get big speed improvements over standard numpy linear algebra routines in python. Now, I've successfully ...
moustachio's user avatar
  • 2,954
22 votes
1 answer
2k views

Replicating BLAS matrix multiplication performance: Can I match it?

Background If you have been following my posts, I am attempting to replicate the results found in Kazushige Goto's seminal paper on square matrix multiplication C = AB. My last post regarding this ...
matmul's user avatar
  • 589
21 votes
1 answer
38k views

Installing LAPACK and BLAS Libraries for C on Mac OS

I am looking for instructions or websites from which I can download LAPACK and BLAS libraries for use in my C programs. I would also like to know how to link these with the gcc compiler from the terminal.
204's user avatar
  • 473
19 votes
4 answers
15k views

Element-wise vector-vector multiplication in BLAS?

Is there a means to do element-wise vector-vector multiplication with BLAS, GSL or any other high performance library?
Tarek's user avatar
  • 1,070
18 votes
2 answers
13k views

Numpy, BLAS and CUBLAS

Numpy can be "linked/compiled" against different BLAS implementations (MKL, ACML, ATLAS, GotoBlas, etc). That's not always straightforward to configure but it is possible. Is it also possible to "...
Ümit's user avatar
  • 17.4k
18 votes
2 answers
5k views

Mystified by qr.Q(): what is an orthonormal matrix in "compact" form?

R has a qr() function, which performs QR decomposition using either LINPACK or LAPACK (in my experience, the latter is 5% faster). The main object returned is a matrix "qr" that contains in the upper ...
gappy's user avatar
  • 10.2k
18 votes
4 answers
20k views

What is the BigO of linear regression?

How large a system is it reasonable to attempt to do a linear regression on? Specifically: I have a system with ~300K sample points and ~1200 linear terms. Is this computationally feasible?
BCS's user avatar
  • 76.9k
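
A rough back-of-the-envelope sketch for the sizes quoted in the entry above. It assumes a dense, double-precision least-squares fit solved through the normal equations, and uses the textbook leading-order counts: about n*p^2 multiply-adds to form X^T X and about p^3/3 for a Cholesky factorization. These are estimates, not measurements, and all variable names are illustrative.

```python
n_samples = 300_000   # "~300K sample points" from the question
n_terms = 1_200       # "~1200 linear terms" from the question

# Leading-order multiply-add counts (assumed model: normal equations).
form_gram = n_samples * n_terms ** 2   # building X^T X
factorize = n_terms ** 3 // 3          # Cholesky of the p x p matrix

# Storage for the dense design matrix in 8-byte doubles.
design_bytes = n_samples * n_terms * 8

print(f"form X^T X : {form_gram:.3e} multiply-adds")
print(f"Cholesky   : {factorize:.3e} multiply-adds")
print(f"design mat : {design_bytes / 1e9:.2f} GB")
```

Under these assumptions the cost is dominated by forming X^T X, which grows linearly in the number of samples and quadratically in the number of terms.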
18 votes
1 answer
30k views

how to check if BLAS and ATLAS already installed

I'm trying to install the armadillo library on my Linux system (Ubuntu 12.04). BOOST, BLAS, ATLAS and LAPACK are required first for the installation. Is there a way to check if those libraries are ...
lolibility's user avatar
  • 2,187
17 votes
2 answers
12k views

Set max number of threads at runtime on numpy/openblas

I'd like to know if it's possible to change at (Python) runtime the maximum number of threads used by OpenBLAS behind numpy? I know it's possible to set it before running the interpreter through the ...
Théo T's user avatar
  • 3,320
17 votes
3 answers
6k views

Why does MATLAB/Octave wipe the floor with C++ in Eigenvalue Problems?

I'm hoping that the answer to the question in the title is that I'm doing something stupid! Here is the problem. I want to compute all the eigenvalues and eigenvectors of a real, symmetric matrix. I ...
MGA's user avatar
  • 1,668
16 votes
4 answers
16k views

What is a good free (open source) BLAS/LAPACK library for .net (C#)? [closed]

I have a project written in C# where I need to do various linear algebraic operations on matrices (like LU-factorization). Since the program is mainly a prototype created to confirm a theory, a C# ...
Egil Hansen's user avatar
  • 15.3k
16 votes
2 answers
2k views

No speedup for vector sums with threading

I have a C++ program which basically performs some matrix calculations. For these I use LAPACK/BLAS and usually link to the MKL or ACML depending on the platform. A lot of these matrix calculations ...
Fabian's user avatar
  • 173
16 votes
1 answer
14k views

Fortran 90/95 library for sparse matrices?

I am looking for a library for dealing with sparse matrices in fortran 90/95. I only need very basic operations like matrix-vector multiplication. What do you suggest I use? I have searched around ...
arne's user avatar
  • 707
16 votes
2 answers
7k views

Any good documentation for the cblas interface? [closed]

Can someone recommend a good reference or tutorial for the cblas interface? Nothing comes up on google, all of the man pages I've found are for the fortran blas interface, and the pdf that came with ...
Andrew Wagner's user avatar
15 votes
3 answers
13k views

How to use numpy with OpenBLAS instead of Atlas in Ubuntu?

I have looked for an easy way to install/compile Numpy with OpenBLAS but didn't find an easy answer. All the documentation I have seen takes too much knowledge for granted for someone like me who is ...
pierolefou's user avatar
15 votes
3 answers
10k views

Theano CNN on CPU: AbstractConv2d Theano optimization failed

I'm trying to train a CNN for object detection on images with the CIFAR10 dataset for a seminar at my university, but I get the following error: AssertionError: AbstractConv2d Theano optimization ...
Jonasson's user avatar
  • 293
15 votes
1 answer
2k views

cholesky decomposition ScaLapack error

I'm getting the following error and I'm not sure why. { 1, 1}: On entry to PDPOTRF parameter number 2 had an illegal value { 1, 0}: On entry to PDPOTRF parameter number 2 had an ...
pyCthon's user avatar
  • 12k
14 votes
2 answers
20k views

"Attempting to perform BLAS operation using StreamExecutor without BLAS support" error occurs

My computer has only 1 GPU. Below is the result I get by entering someone's code: [name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality {} incarnation: ...
MCPMH's user avatar
  • 231
14 votes
1 answer
5k views

Why are there no BLAS routines for addition and subtraction

In BLAS there are routines like dscal (scale a vector by a constant), dinit (initialize a vector with given value), daxpy (perform y = a*x + y) and so on. But there are apparently no routines ...
Andreas H.'s user avatar
  • 5,743
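
The daxpy formula quoted in the entry above, y = a*x + y, reduces to plain addition when a = 1 and to subtraction when a = -1. A minimal pure-Python sketch of that observation follows. The function `axpy` here is a hypothetical stand-in that returns a new list, whereas the real routine updates y in place.

```python
def axpy(alpha, x, y):
    """Return alpha*x + y element by element (the daxpy formula, not in place)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]

total = axpy(1.0, x, y)        # y + x
difference = axpy(-1.0, x, y)  # y - x

print(total)       # [11.0, 22.0, 33.0]
print(difference)  # [9.0, 18.0, 27.0]
```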
14 votes
1 answer
535 views

Faster evaluation of matrix multiplication from right to left

I noticed that evaluating matrix operations in quadratic form from right to left is significantly faster than left to right in R, depending on how the parentheses are placed. Obviously they both ...
Taotao Tan's user avatar
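
One way to see why parenthesization can matter for a product like the one in the entry above is to count multiply-adds. The sketch below assumes square n x n matrices A and B, a length-n vector v, and the standard dense algorithm costing m*k*n multiply-adds for an (m x k) by (k x n) product. The shapes and function names are assumptions chosen for illustration, not details from the question.

```python
def product_cost(m, k, n):
    """Multiply-add count for a dense (m x k) times (k x n) product."""
    return m * k * n


def left_to_right(n):
    """Cost of (A @ B) @ v: one matrix-matrix product, then one matrix-vector."""
    return product_cost(n, n, n) + product_cost(n, n, 1)


def right_to_left(n):
    """Cost of A @ (B @ v): two matrix-vector products."""
    return product_cost(n, n, 1) + product_cost(n, n, 1)


n = 1000
left = left_to_right(n)
right = right_to_left(n)

print(left)          # 1001000000
print(right)         # 2000000
print(left / right)  # 500.5
```

Both orders give the same mathematical result. Only the amount of intermediate work differs, and under these assumptions the gap widens linearly with n.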
13 votes
4 answers
23k views

Numpy Pure Functions for performance, caching

I'm writing some moderately performance-critical code in numpy. This code will be in the innermost loop of a computation whose run time is measured in hours. A quick calculation suggests that this ...
Frames Catherine White's user avatar
13 votes
3 answers
4k views

Is it possible to switch between BLAS libraries without recompiling program?

For example can I have Atlas, OpenBlas, MKL installed on my Ubuntu 14.04 at the same time and switch between them without recompiling Caffe?
mrgloom's user avatar
  • 20.8k
12 votes
5 answers
18k views

libgfortran: version `GFORTRAN_1.4' not found

I am getting the following error when I try to run a mex file in MATLAB: ??? Invalid MEX-file 'findimps3.mexa64': /MATLAB/bin/glnxa64/../../sys/os/glnxa64/libgfortran.so.3: version `GFORTRAN_1.4' ...
Mohammad Moghimi's user avatar
12 votes
3 answers
1k views

How can I determine which matrix libraries my R install is using?

I am having a matrix error when using the computer cluster at my university that I cannot reproduce on my local machine. I think it might be due to a difference of matrix libraries (BLAS, LAPACK, ...
rcorty's user avatar
  • 1,160
12 votes
1 answer
3k views

Floating point math in python / numpy not reproducible across machines

When I compare the results of a floating point computation across a couple of different machines, they consistently differ. Here is a stripped down example that reproduces the ...
Urs's user avatar
  • 705
12 votes
1 answer
3k views

performance of NumPy with different BLAS implementations

I'm running an algorithm that is implemented in Python and uses NumPy. The most computationally expensive part of the algorithm involves solving a set of linear systems (i.e. a call to numpy.linalg....
lum's user avatar
  • 1,543
