Questions tagged [pycuda]

PyCUDA is a Python module that provides a comprehensive, Pythonic interface to the NVIDIA CUDA GPU computing environment.

21 votes
2 answers
18k views

What is the difference between PyCUDA and NumbaPro CUDA Python?

I'm new to CUDA and am trying to figure out whether PyCUDA (free) or NumbaPro CUDA Python (not free) would be better for me (assuming the library cost is not an issue). Both seem to require that you ...
PDiracDelta
  • 2,428
20 votes
5 answers
12k views

pyCUDA vs C performance differences?

I'm new to CUDA programming and I was wondering how the performance of pyCUDA compares to programs implemented in plain C. Will the performance be roughly the same? Are there any bottlenecks that ...
memyself
  • 12.3k
18 votes
2 answers
19k views

Python Multiprocessing with PyCUDA

I've got a problem that I want to split across multiple CUDA devices, but I suspect my current system architecture is holding me back. What I've set up is a GPU class, with functions that perform ...
Bolster
  • 7,686
14 votes
4 answers
5k views

pyCUDA with Flask gives pycuda._driver.LogicError: cuModuleLoadDataEx

I want to run pyCUDA code on a Flask server. The file runs correctly when run directly with python3 but fails when the corresponding function is called through Flask. Here is the relevant code: cudaFlask.py:...
arpanmangal
  • 1,790
14 votes
2 answers
562 views

No output after using PyCUDA

I've installed PyCUDA using pip. I tried this on two computers: one with a fresh install of Python 3.7.1 and one with Python 3.6.5. Everything fails after using PyCUDA, with no error message. The ...
Panos Kalatzantonakis
10 votes
5 answers
20k views

src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory

When I install pycuda with pip install pycuda, I get this error: src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory. But I have installed the CUDA toolkit....
kaikai_sk
  • 121
10 votes
3 answers
30k views

trying to install pycuda, getting zip error?

Windows 10, Python 2.7 64-bit. Hello, I am following a guide and have reached this step: pip install pipwin pipwin install pycuda, which gives me these options: Package `pycuda` found in cache Choose version to download. [...
hmmmbob
  • 1,177
9 votes
3 answers
29k views

pycuda -- 'CUDA_ROOT not set, and nvcc not in path.'

Although I had installed pycuda and was using it without problems, it stopped working (without my changing anything). So I tried to install it again, but when I run python configure.py --cuda-root=/usr/local/cuda/...
George
  • 5,511
8 votes
1 answer
7k views

PyCUDA: Querying Device Status (Memory specifically)

PyCUDA's documentation mentions Driver Interface calls in passing, but I'm a bit thick and can't see how to get information such as 'SHARED_SIZE_BYTES' out of my code. Can anyone point me to any ...
Bolster
  • 7,686
8 votes
1 answer
3k views

pycuda vs theano vs pylearn2

I am currently learning GPU programming to improve the performance of machine learning algorithms. Initially I tried to learn CUDA programming in pure C, then I found pycuda, which to me is a wrapper ...
user1754197
8 votes
3 answers
9k views

get "LogicError: explicit_context_dependent failed: invalid device context - no currently active context? " when running tensorRT in ROS

I have inference code in TensorRT (with Python). I want to run this code in ROS, but I get the error below when trying to allocate a buffer: LogicError: explicit_context_dependent failed: invalid ...
Mahsa
  • 486
8 votes
1 answer
7k views

pip install pycuda on windows

I'm using VS2008, Win XP, and the latest CUDA toolkit. I run pip install pycuda on Windows and get the following log from C:\Documents and Settings\User\Application Data\pip\pip.log. I get the error LINK : fatal ...
mrgloom
  • 20.8k
8 votes
2 answers
7k views

PyCUDA context error when using Flask

I am using PyCUDA to implement smooth_local_affine as shown here. It works well when I simply run the program on Linux. But when I tried to import it in a Flask context: from ...
cloudwayx
8 votes
1 answer
3k views

scipy.interpolate.griddata equivalent in CUDA

I'm trying to perform Fitted Value Iteration (FVI) in Python (involving approximating a 5-dimensional function using piecewise linear interpolation). scipy.interpolate.griddata works perfectly for ...
user1726633
7 votes
2 answers
11k views

Pycuda Blocks and Grids to work with big datas

I need help determining the size of my blocks and grids. I'm building a Python app to perform metric calculations based on scipy, such as Euclidean distance, Manhattan, Pearson, and Cosine, among others. The ...
Vinnicyus Gracindo
7 votes
3 answers
20k views

PyTorch Cuda with anaconda not available

I'm using Anaconda to manage my environment. For a project I have to use my GPU for network training. I use PyTorch for my project and I'm trying to get CUDA working. I installed cudatoolkit, numba,...
Enforcerke
7 votes
1 answer
5k views

Passing a C++/CUDA class to PyCUDA's SourceModule

I have a class written in C++ that also uses some definitions from cuda_runtime.h. It is part of an open-source project named ADOL-C; you can have a look here! This works when I'm using CUDA-C, ...
Banana
  • 1,316
7 votes
1 answer
948 views

Could pycuda and tensorflow work together?

Once TensorFlow is active, it makes all CUDA code crash, even if I use sess.close()... The error message is: pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle The ...
Chi-Fang Hsieh
6 votes
1 answer
5k views

PyCUDA Passing variable by value to kernel

Should be simple enough; I literally want to send an int to a SourceModule kernel declaration, where the C function __global__......(int value,.....) with the value being declared and called... ...
Bolster
  • 7,686
6 votes
2 answers
2k views

How do I diagnose a CUDA launch failure due to being out of resources?

I'm getting an out-of-resources error when trying to launch a CUDA kernel (through PyCUDA), and I'm wondering if it's possible to get the system to tell me which resource it is that I'm short on. ...
Eli Stevens
  • 1,447
6 votes
1 answer
2k views

Disappointing results in pyCUDA benchmark for distance computing between N points

The following script was set up for benchmark purposes. It computes the distance between N points using a Euclidean L2 norm. Three different routines are implemented: High-level solution using the ...
Rakulan S.
6 votes
1 answer
3k views

pyopencl - pycuda performance difference

Comparing multiple matrix multiplication calculations with pyopencl and pycuda shows differences in performance. System: Ubuntu 14.04 with GeForce 920m Pyopencl code: #-*- coding: utf-8 -*- import ...
Jesse
  • 370
6 votes
2 answers
891 views

Using Pycuda with PySpark - nvcc not found

My environment: I'm using Hortonworks HDP 2.4 with Spark 1.6.1 on a small AWS EC2 cluster of 4 g2.2xlarge instances with Ubuntu 14.04. Each instance has CUDA 7.5, Anaconda Python 3.5, and Pycuda 2016....
zenlc2000
  • 451
6 votes
1 answer
538 views

pycuda seems nondeterministic

I've got a strange problem with CUDA. In the snippet below, #include <stdio.h> #define OUTPUT_SIZE 26 typedef $PRECISION REAL; extern "C" { __global__ void test_coeff ( REAL*...
user1726633
5 votes
2 answers
893 views

Pycuda messing up numpy matrix transpose

Why does the transposed matrix look different when converted to a pycuda.gpuarray? Can you reproduce this? What could cause this? Am I using the wrong approach? Example code from pycuda import ...
Framester
  • 34.3k
5 votes
1 answer
1k views

Why numba cuda is running slow after recalling it several times?

I am experimenting with how to use CUDA inside numba. However, I have encountered something different from what I expected. Here is my code: from numba import cuda @cuda.jit def matmul(A, B, C): """Perform ...
Peter Deng
5 votes
2 answers
786 views

Is there a GPU accelerated numpy.max(X, axis=0) implementation in Theano?

Is there a GPU-accelerated version of numpy.max(X, axis=None) in Theano? I looked into the documentation and found theano.tensor.max(X, axis=None), but it is 4-5 times slower than the numpy ...
hrs
  • 497
5 votes
1 answer
10k views

pycuda ImportError in pycuda.driver

I'm trying to compile some sources for working with my GPU. I use pycuda for this. When I compile source code, I receive some errors from Python: C:\Users\Dmitriy\wcm>python ws_gpu.py test.dcm ...
iDom
  • 115
5 votes
1 answer
1k views

`Out of resources` error while doing loop unrolling

When I increase the unrolling from 8 to 9 loops in my kernel, it breaks with an out of resources error. I read in How do I diagnose a CUDA launch failure due to being out of resources? that a ...
Framester
  • 34.3k
5 votes
2 answers
2k views

How to generate random number inside pyCUDA kernel?

I am using pyCUDA for CUDA programming. I need to use random numbers inside a kernel function. The CURAND library doesn't work inside it (pyCUDA). Since there is a lot of work to be done on the GPU, generating ...
Bhaskar Dhariyal
5 votes
1 answer
2k views

PyCUDA: Pow within device code tries to use std::pow, fails

The question more or less says it all: calling a host function("std::pow<int, int> ") from a __device__/__global__ function("_calc_psd") is not allowed. From my understanding, this should be using ...
Bolster
  • 7,686
5 votes
1 answer
2k views

Difference between memcpy_htod and to_gpu in Pycuda?

I am learning PyCUDA, and while going through the documentation on pycuda.gpuarray, I am puzzled by the difference between pycuda.driver.memcpy_htod (also _dtoh) and pycuda.gpuarray.to_gpu (also get) ...
Pippi
  • 2,511
5 votes
3 answers
2k views

driver.Context.synchronize()- what else to take into consideration -- -a clean-up operation failed

I have this code here (modified due to the answer). Info 32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads ptxas info : Used 46 registers, 120 bytes cmem[0], 176 bytes cmem[2]...
George
  • 5,511
5 votes
1 answer
3k views

pycuda; nvcc fatal : Visual Studio configuration file '(null)' could not be found

I'm trying to run the pycuda introductory tutorial after installing Visual C++ Express 2010 and all kinds of Nvidia drivers, SDK, etc. I get to mod = SourceModule(""" __global__ void doublify(float *a) { ...
Konsta
  • 367
5 votes
2 answers
4k views

Installing pycuda-2013.1.1 on windows 7 64 bit

FYI, I have the 64-bit version of Python 2.7, and I followed the pycuda installation instructions to install pycuda. I don't have any problem running the following script. import pycuda.driver as cuda ...
Tae-Sung Shin
5 votes
1 answer
1k views

Print messages in PyCUDA

In simple CUDA programs we can print messages from threads by including cuPrintf.h, but how to do this in PyCUDA is not explained anywhere. How can this be done in PyCUDA?
username_4567
5 votes
2 answers
2k views

cuda python GPU numbapro 3d loop poor performance

I am trying to set up a 3D loop with the assignment C(i,j,k) = A(i,j,k) + B(i,j,k) using Python on my GPU. This is my GPU: http://www.geforce.com/hardware/desktop-gpus/geforce-gt-520/...
Charles
  • 977
5 votes
1 answer
3k views

How to handle a python list with PyCUDA?

I guess this is a rather easy question for an expert, yet I can't find any answers on the net. Given a simple case: The problem: listToProcess = [] for i in range(0, 10): listToProcess.append(i) ...
user3085931
  • 1,757
5 votes
1 answer
710 views

PyCUDA: C/C++ includes?

Something that isn't really mentioned anywhere (at least that I can see) is what library functions are exposed to inline CUDA kernels. Specifically I'm doing small / stupid matrix multiplications ...
Bolster
  • 7,686
5 votes
0 answers
759 views

Why do I get an illegal memory access when I'm calling a kernel in pycuda?

I'm trying to implement a neuron model with the Hodgkin and Huxley formalism on my RTX 2080 Ti with PyCuda. The code is quite large, so I won't put all of it here. The first part of my class is to set the ...
ymmx
  • 4,925
5 votes
0 answers
8k views

Anaconda install pycuda

I am trying to install pycuda on a computer with 64-bit Windows 10. I installed the GPU Toolkit 9.1 and Anaconda 4.2 with 64-bit Python 3.5. I installed pycuda using the precompiled package: pycuda‑...
Mauricio Ruiz
5 votes
0 answers
1k views

exchange gpu data from python (pycuda gpuarray) to opencv (cv::cuda::GpuMat) and vice versa

I have a pycuda gpuarray that I would like to feed to an opencv cuda function. As I understand it, there are currently no Python bindings for the opencv 3 cv::cuda module. So I tried writing my own Python ...
Wizard
  • 295
4 votes
4 answers
3k views

processing an image using CUDA implementation, python (pycuda) or C++?

I am working on a project to process an image using CUDA. The project is simply an addition or subtraction of the image. May I ask your professional opinion: which is best, and what would be the advantages ...
ardiyu07
  • 1,800
4 votes
1 answer
3k views

How to use the `prepare` function from PyCUDA

I have problems passing the right parameters to the prepare function (and to prepared_call) to allocate shared memory in PyCUDA. I understand the error message to mean that one of the ...
Framester
  • 34.3k
4 votes
1 answer
2k views

100% GPU usage from CUDA code makes screen lag

I have some pyCUDA code that keeps the GPU at 100% usage and seems to hog the GPU to the point that my screen only updates every second or so. Changing the block and grid sizes doesn't help. Each ...
Frobot
  • 1,244
4 votes
3 answers
1k views

Storing Kernel in Separate File - PyOpenCL

I'm trying to store the kernel part of the code, with the 3 """ , in a different file. I tried saving it as a text file and a bin file, and reading it in, but I didn't find success with it. It started ...
RandN88
  • 101
4 votes
2 answers
2k views

Running optimization process with GPU using PYTHON 3.5 and Backtrader

I was trying out the optimization process of the Backtrader library. I see that the code runs pretty well on a multi-core CPU. It took around 22.352761494772228 seconds for the complete ...
Jaffer Wilson
4 votes
1 answer
3k views

Question about pycuda._driver.LogicError: cuMemcpyDtoH failed: invalid argument

I was trying to run code based on the following link: https://documen.tician.de/pycuda/tutorial.html Running the code in this link turned out to be fine. This is my version with similar ...
macman
  • 91
4 votes
1 answer
354 views

Genetic cellular automata with PyCuda, how to efficiently pass a lot of data per cell to CUDA kernel?

I'm developing a genetic cellular automaton using PyCuda. Each cell will have a lot of genome data, along with cell parameters. I'm wondering what could be the most efficient way to 1) pass cell data to ...
a5kin
  • 1,355
4 votes
1 answer
1k views

cudaBindTextureToArray in PyCuda

Is there a way to bind an array that is already on the GPU to a texture using PyCuda? There is already a cuda.bind_array_to_texref(cuda.make_multichannel_2d_array(...), texref) that binds an array ...
nbonneel
  • 3,316
