C++ Programming Glossary: threadIdx.x
Compiling Cuda code in Qt Creator on Windows http://stackoverflow.com/questions/12266264/compiling-cuda-code-in-qt-creator-on-windows a, const float* b, float* c, int n) { int ii = blockDim.x * blockIdx.x + threadIdx.x; if (ii < n) c[ii] = a[ii] + b[ii]; } void vectorAddition(const float* a, const..
Cuda version not working while serial working http://stackoverflow.com/questions/13630817/cuda-version-not-working-while-serial-working a Polygon polygons int N int idx blockIdx.x blockDim.x threadIdx.x if idx N 2 return Polygon pol pol.addPts Point2D 0. 0. pol.addPts..
count3's in cuda is very slow http://stackoverflow.com/questions/15733182/count3s-in-cuda-is-very-slow int a int N int count int id blockIdx.x blockDim.x threadIdx.x __shared__ int s_a 512 one for each thread s_a threadIdx.x a.. threadIdx.x __shared__ int s_a 512 one for each thread s_a threadIdx.x a id if id N if s_a threadIdx.x 3 if a id 3 atomicAdd count.. one for each thread s_a threadIdx.x a id if id N if s_a threadIdx.x 3 if a id 3 atomicAdd count 1 int main void int a_h host memory..
Optimizing a CUDA kernel with irregular memory accesses http://stackoverflow.com/questions/20512257/optimizing-a-cuda-kernel-with-irregular-memory-accesses n int filter_size int ai for int idx blockIdx.x blockDim.x threadIdx.x idx filter_size idx blockDim.x gridDim.x int index idx ai n..
How to separate CUDA code into multiple files http://stackoverflow.com/questions/2090974/how-to-separate-cuda-code-into-multiple-files TestDevice int deviceArray int idx blockIdx.x blockDim.x threadIdx.x deviceArray idx deviceArray idx deviceArray idx #endif Build..
Can I call cuda function calls in C++? http://stackoverflow.com/questions/3811539/can-i-call-cuda-function-calls-in-c The Callee CUDA __global__ void kernel int a int b int tx threadIdx.x switch tx case 0 a a 10 break case 1 b b 3 break default break..
Beginner CUDA - Simple var increment not working http://stackoverflow.com/questions/4408710/beginner-cuda-simple-var-increment-not-working
What are the real C++ language constructs supported by CUDA device code? http://stackoverflow.com/questions/4899425/what-are-the-real-c-language-constructs-supported-by-cuda-device-code __global__ void testKernel uint32_t ddata Foo f ddata threadIdx.x f.bar I'm also able to use widespread libraries such as Thrust..
CUDA how to get grid, block, thread size and parallalize non square matrix calculation http://stackoverflow.com/questions/5643178/cuda-how-to-get-grid-block-thread-size-and-parallalize-non-square-matrix-calcu __global__ void mAdd(float* A, float* B, float* C, int n) { int k = threadIdx.x + blockIdx.x * blockDim.x; if (k < n) C[k] = A[k] + B[k]; } disclaimer: code written..
__syncthreads() Deadlock http://stackoverflow.com/questions/6476613/syncthreads-deadlock a kernel like this __global__ void Kernel int N int a if threadIdx.x N for int i 0 i N i a threadIdx.x Some calculation using a.. Kernel int N int a if threadIdx.x N for int i 0 i N i a threadIdx.x Some calculation using a and i __syncthreads if the number..
For nested loops with CUDA http://stackoverflow.com/questions/9921873/for-nested-loops-with-cuda N 16 index for the GPU int i1 blockDim.x blockIdx.x threadIdx.x int i2 blockDim.y blockIdx.y threadIdx.y int i3 i1 int i4 i2.. the program as follows int i1 blockDim.x blockIdx.x threadIdx.x int i2 blockDim.y blockIdx.y threadIdx.y int i3 int i4 while..