Questions tagged [pycuda]
PyCUDA is a Python module that provides a comprehensive, Pythonic interface to the NVIDIA CUDA GPU computing environment.
411
questions
21
votes
2
answers
18k
views
What is the difference between PyCUDA and NumbaPro CUDA Python?
I'm new to CUDA and am trying to figure out whether PyCUDA (free) or NumbaPro CUDA Python (not free) would be better for me (assuming the library cost is not an issue).
Both seem to require that you ...
20
votes
5
answers
12k
views
pyCUDA vs C performance differences?
I'm new to CUDA programming and I was wondering how the performance of pyCUDA compares to programs implemented in plain C.
Will the performance be roughly the same? Are there any bottlenecks that ...
18
votes
2
answers
19k
views
Python Multiprocessing with PyCUDA
I've got a problem that I want to split across multiple CUDA devices, but I suspect my current system architecture is holding me back.
What I've set up is a GPU class, with functions that perform ...
14
votes
4
answers
5k
views
pyCUDA with Flask gives pycuda._driver.LogicError: cuModuleLoadDataEx
I want to run pyCUDA code on a Flask server. The file runs correctly when executed directly with python3, but fails when the corresponding function is called through Flask.
Here is the relevant code:
cudaFlask.py:...
14
votes
2
answers
562
views
No output after using PyCUDA
I've installed PyCUDA using pip. I tried this on two computers.
One with a fresh install of Python 3.7.1 and one with Python 3.6.5.
Everything fails after using PyCUDA, with no error message.
The ...
10
votes
5
answers
20k
views
src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory
When I install pycuda with this command:
pip install pycuda
I get this error:
src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory
But I have installed the CUDA toolkit....
10
votes
3
answers
30k
views
trying to install pycuda, getting zip error?
Windows 10, Python 2.7 64-bit
Hello, I'm following a guide, and this step:
pip install pipwin
pipwin install pycuda
gives me these options:
Package `pycuda` found in cache
Choose version to download.
[...
9
votes
3
answers
29k
views
pycuda -- 'CUDA_ROOT not set, and nvcc not in path.'
Although I had installed pycuda and was using it without problems, it stopped working (without my changing anything). So I tried to install it again, but when I run
python configure.py --cuda-root=/usr/local/cuda/...
8
votes
1
answer
7k
views
PyCUDA: Querying Device Status (Memory specifically)
PyCUDA's documentation mentions Driver Interface calls in passing, but I'm a bit thick and can't see how to get information such as 'SHARED_SIZE_BYTES' out of my code.
Can anyone point me to any ...
8
votes
1
answer
3k
views
pycuda vs theano vs pylearn2
I am currently learning GPU programming to improve the performance of machine learning algorithms. Initially I tried to learn CUDA programming in pure C, then I found pycuda, which to me is a wrapper ...
8
votes
3
answers
9k
views
get "LogicError: explicit_context_dependent failed: invalid device context - no currently active context? " when running tensorRT in ROS
I have inference code in TensorRT (with Python). I want to run this code in ROS, but I get the error below when trying to allocate a buffer:
LogicError: explicit_context_dependent failed: invalid ...
8
votes
1
answer
7k
views
pip install pycuda on windows
I'm using VS2008, Win XP, latest CUDA toolkit.
I run pip install pycuda on Windows and get the following log from
C:\Documents and Settings\User\Application Data\pip\pip.log
I get the error
LINK : fatal ...
8
votes
2
answers
7k
views
PyCUDA context error when using Flask
I am using PyCUDA to implement smooth_local_affine as shown here. It works well when I simply run the program on Linux. But when I tried to import it under a Flask context:
from ...
8
votes
1
answer
3k
views
scipy.interpolate.griddata equivalent in CUDA
I'm trying to perform Fitted Value Iteration (FVI) in python (involving approximating a 5 dimensional function using piecewise linear interpolation).
scipy.interpolate.griddata works perfectly for ...
7
votes
2
answers
11k
views
Pycuda Blocks and Grids to work with big datas
I need help choosing the size of my blocks and grids.
I'm building a Python app to perform metric calculations based on scipy, such as Euclidean distance, Manhattan, Pearson, and cosine, among others.
The ...
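A common starting point for sizing a 1D launch is ceiling division of the element count by the block size. The sketch below is plain Python and needs no GPU; the helper name `grid_size_1d` and the default of 256 threads per block are illustrative assumptions, not taken from the question.

```python
def grid_size_1d(n_elements, threads_per_block=256):
    """Return the number of blocks needed to cover n_elements.

    Uses ceiling division so that blocks * threads_per_block >= n_elements.
    """
    if n_elements < 0 or threads_per_block <= 0:
        raise ValueError("n_elements must be >= 0 and threads_per_block > 0")
    return (n_elements + threads_per_block - 1) // threads_per_block


# Example: 1000 elements with 256 threads per block need 4 blocks.
blocks = grid_size_1d(1000, 256)
```

With PyCUDA, values like these would typically be passed as `block=(256, 1, 1)` and `grid=(blocks, 1)` when calling the kernel. Because the last block is usually only partly full, the kernel still needs a bounds check on its global index.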
7
votes
3
answers
20k
views
PyTorch Cuda with anaconda not available
I'm using Anaconda to manage my environment.
For a project I have to use my GPU for network training.
I use PyTorch for my project and I'm trying to get CUDA working.
I installed cudatoolkit, numba,...
7
votes
1
answer
5k
views
Passing a C++/CUDA class to PyCUDA's SourceModule
I have a class written in C++ that also uses some definitions from cuda_runtime.h. It is part of an open-source project named ADOL-C; you can have a look here!
This works when I'm using CUDA-C, ...
7
votes
1
answer
948
views
Could pycuda and tensorflow work together?
Once tensorflow is active, it makes all CUDA code crash, even if I use sess.close()...
The error message is:
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle
The ...
6
votes
1
answer
5k
views
PyCUDA Passing variable by value to kernel
Should be simple enough; I literally want to send an int to a SourceModule kernel declaration, where the C function
__global__......(int value,.....)
with the value being declared and called...
...
6
votes
2
answers
2k
views
How do I diagnose a CUDA launch failure due to being out of resources?
I'm getting an out-of-resources error when trying to launch a CUDA kernel (through PyCUDA), and I'm wondering if it's possible to get the system to tell me which resource it is that I'm short on. ...
6
votes
1
answer
2k
views
Disappointing results in pyCUDA benchmark for distance computing between N points
The following script was set up for benchmark purposes. It computes the distance between N points using a Euclidean L2 norm. Three different routines are implemented:
High-level solution using the ...
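When benchmarking GPU routines like these, a slow but obviously correct CPU reference is useful for validating results. The following is a minimal pure-Python sketch of pairwise Euclidean distances; `pairwise_l2` is a hypothetical helper name and is not one of the three routines from the question.

```python
import math


def pairwise_l2(points):
    """Return an n x n list of Euclidean distances between points.

    points is a sequence of equal-length coordinate tuples.
    Only the upper triangle is computed; the matrix is symmetric.
    """
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            dist[i][j] = d
            dist[j][i] = d
    return dist


# Two 2D points forming a 3-4-5 triangle with the origin.
reference = pairwise_l2([(0.0, 0.0), (3.0, 4.0)])
```

The cost is quadratic in the number of points, so this is only meant for checking small inputs, not for timing comparisons.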
6
votes
1
answer
3k
views
pyopencl - pycuda performance difference
Comparing multiple matrix multiplication calculations with pyopencl and pycuda shows differences in performance.
System:
Ubuntu 14.04 with GeForce 920m
Pyopencl code:
#-*- coding: utf-8 -*-
import ...
6
votes
2
answers
891
views
Using Pycuda with PySpark - nvcc not found
My environment:
I'm using Hortonworks HDP 2.4 with Spark 1.6.1 on a small AWS EC2 cluster of 4 g2.2xlarge instances with Ubuntu 14.04. Each instance has CUDA 7.5, Anaconda Python 3.5, and Pycuda 2016....
6
votes
1
answer
538
views
pycuda seems nondeterministic
I've got a strange problem with cuda,
In the below snippet,
#include <stdio.h>
#define OUTPUT_SIZE 26
typedef $PRECISION REAL;
extern "C"
{
__global__ void test_coeff ( REAL*...
5
votes
2
answers
893
views
Pycuda messing up numpy matrix transpose
Why does the transposed matrix look different when converted to a pycuda.gpuarray?
Can you reproduce this? What could cause this? Am I using the wrong approach?
Example code
from pycuda import ...
5
votes
1
answer
1k
views
Why numba cuda is running slow after recalling it several times?
I am experimenting with using CUDA inside numba. However, I have encountered something different from what I expected. Here is my code:
from numba import cuda
@cuda.jit
def matmul(A, B, C):
"""Perform ...
5
votes
2
answers
786
views
Is there a GPU accelerated numpy.max(X, axis=0) implementation in Theano?
Is there a GPU-accelerated version of numpy.max(X, axis=None) in Theano?
I looked into the documentation and found theano.tensor.max(X, axis=None), but it is 4-5 times slower than the numpy ...
5
votes
1
answer
10k
views
pycuda ImportError in pycuda.driver
I'm trying to compile some sources for working with my GPU. I use pycuda for this. When I compile source code, I receive some errors from Python:
C:\Users\Dmitriy\wcm>python ws_gpu.py test.dcm
...
5
votes
1
answer
1k
views
`Out of resources` error while doing loop unrolling
When I increase the unrolling from 8 to 9 loops in my kernel, it breaks with an out of resources error.
I read in How do I diagnose a CUDA launch failure due to being out of resources? that a ...
5
votes
2
answers
2k
views
How to generate random number inside pyCUDA kernel?
I am using pyCUDA for CUDA programming. I need to use random numbers inside a kernel function. The CURAND library doesn't work inside it (pyCUDA). Since there is a lot of work to be done on the GPU, generating ...
5
votes
1
answer
2k
views
PyCUDA: Pow within device code tries to use std::pow, fails
Question more or less says it all.
calling a host function("std::pow<int, int> ") from a __device__/__global__ function("_calc_psd") is not allowed
from my understanding, this should be using ...
5
votes
1
answer
2k
views
Difference between memcpy_htod and to_gpu in Pycuda?
I am learning PyCUDA, and while going through the documentation on pycuda.gpuarray, I am puzzled by the difference between pycuda.driver.memcpy_htod (also _dtoh) and pycuda.gpuarray.to_gpu (also get) ...
5
votes
3
answers
2k
views
driver.Context.synchronize()- what else to take into consideration -- -a clean-up operation failed
I have this code here (modified due to the answer).
Info
32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 46 registers, 120 bytes cmem[0], 176 bytes
cmem[2]...
5
votes
1
answer
3k
views
pycuda; nvcc fatal : Visual Studio configuration file '(null)' could not be found
I'm trying to run the pycuda introductory tutorial after installing Visual C++ Express 2010 and all kinds of Nvidia drivers, SDK, etc. I get to
mod = SourceModule("""
__global__ void doublify(float *a)
{
...
5
votes
2
answers
4k
views
Installing pycuda-2013.1.1 on windows 7 64 bit
FYI, I have the 64-bit version of Python 2.7 and
I followed the pycuda installation instructions to install pycuda.
I don't have any problem running the following script.
import pycuda.driver as cuda
...
5
votes
1
answer
1k
views
Print messages in PyCUDA
In simple CUDA programs we can print messages from threads by including cuPrintf.h, but how to do this in PyCUDA is not explained anywhere. How can this be done in PyCUDA?
5
votes
2
answers
2k
views
cuda python GPU numbapro 3d loop poor performance
I am trying to set up a 3D loop with the assignment
C(i,j,k) = A(i,j,k) + B(i,j,k)
using Python on my GPU. This is my GPU:
http://www.geforce.com/hardware/desktop-gpus/geforce-gt-520/...
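GPU kernels usually address a 3D array through a single flat index. The sketch below shows the assignment C(i,j,k) = A(i,j,k) + B(i,j,k) on the CPU over flattened lists, assuming row-major (C-order) layout; `flat_index` and `add_3d` are hypothetical helper names, and this is a reference for checking results rather than the asker's code.

```python
def flat_index(i, j, k, ny, nz):
    """Map 3D indices to a flat row-major (C-order) offset."""
    return (i * ny + j) * nz + k


def add_3d(a, b, nx, ny, nz):
    """Element-wise sum of two flattened nx x ny x nz arrays."""
    c = [0.0] * (nx * ny * nz)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                idx = flat_index(i, j, k, ny, nz)
                c[idx] = a[idx] + b[idx]
    return c


# 2 x 2 x 2 example with easily checked values.
result = add_3d([1, 2, 3, 4, 5, 6, 7, 8], [10] * 8, 2, 2, 2)
```

In a kernel, each thread would compute one such index from its block and thread coordinates instead of looping, which is where the choice of memory layout affects coalescing.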
5
votes
1
answer
3k
views
How to handle a python list with PyCUDA?
I guess this is a rather easy question for an expert, yet I can't find any answers on the net. Given a simple case:
The problem:
listToProcess = []
for i in range(0, 10):
listToProcess.append(i)
...
5
votes
1
answer
710
views
PyCUDA: C/C++ includes?
Something that isn't really mentioned anywhere (at least that I can see) is what library functions are exposed to inline CUDA kernels.
Specifically I'm doing small / stupid matrix multiplications ...
5
votes
0
answers
759
views
Why do I get an illegal memory access when I'm calling a kernel in pycuda?
I'm trying to implement a neuron model with Hodgkin and Huxley formalism on my RTX 2080 Ti with PyCuda.
The code is quite large so I won't put all of it here.
The first part of my class is to set the ...
5
votes
0
answers
8k
views
Anaconda install pycuda
I am trying to install pycuda on a computer with Windows 10 64-bit. I installed the GPU Toolkit 9.1 and Anaconda 4.2 with Python 3.5 64-bit. I installed pycuda using the precompiled package:
pycuda‑...
5
votes
0
answers
1k
views
exchange gpu data from python (pycuda gpuarray) to opencv (cv::cuda::GpuMat) and vice versa
I have a pycuda gpuarray that I would like to feed to an opencv cuda function. As I understand there are currently no python bindings for the opencv 3 cv::cuda module. So I tried writing my own python ...
4
votes
4
answers
3k
views
processing an image using CUDA implementation, python (pycuda) or C++?
I am working on a project to process an image using CUDA. The project is simply an addition or subtraction of the image.
May I ask your professional opinion: which is best, and what would be the advantages ...
4
votes
1
answer
3k
views
How to use the `prepare` function from PyCUDA
I have problems passing the right parameters to the prepare function (and to prepared_call) to allocate shared memory in PyCUDA. I understand the error message to mean that one of the ...
4
votes
1
answer
2k
views
100% GPU usage from CUDA code makes screen lag
I have some pyCUDA code that keeps the GPU at 100% usage and seems to hog the GPU to the point that my screen only updates every second or so.
Changing the block and grid sizes doesn't help.
Each ...
4
votes
3
answers
1k
views
Storing Kernel in Separate File - PyOpenCL
I'm trying to store the kernel part of the code, with the 3 """ , in a different file. I tried saving it as a text file and a bin file, and reading it in, but I didn't find success with it. It started ...
4
votes
2
answers
2k
views
Running optimization process with GPU using PYTHON 3.5 and Backtrader
I was trying out the optimization process of the Backtrader library. I see that the code runs pretty well with a multi-core CPU. It took around 22.352761494772228 seconds for the complete ...
4
votes
1
answer
3k
views
Question about pycuda._driver.LogicError: cuMemcpyDtoH failed: invalid argument
I was trying to run code based on the following link
https://documen.tician.de/pycuda/tutorial.html
Running code in this link turned out to be fine.
This is my version with similar ...
4
votes
1
answer
354
views
Genetic cellular automata with PyCuda, how to efficiently pass a lot of data per cell to CUDA kernel?
I'm developing a genetic cellular automaton using PyCuda. Each cell will have a lot of genome data, along with cell parameters. I'm wondering what would be the most efficient way to 1) pass cells data to ...
4
votes
1
answer
1k
views
cudaBindTextureToArray in PyCuda
Is there a way to bind an array that is already on the GPU to a texture using PyCuda?
There is already a cuda.bind_array_to_texref(cuda.make_multichannel_2d_array(...), texref) that binds an array ...