CUDA programming refers to developing programs for the Compute Unified Device Architecture (CUDA), the parallel computing engine in NVIDIA GPUs. CUDA has been used in many applications to accelerate computation and to analyse big data faster. A general introduction to CUDA can be found in our article CUDA, while this article focusses on how a GPU is programmed. CUDA has its own proprietary low-level and high-level application programming interfaces (APIs), and CUDA-capable GPUs also support other computing interfaces such as the Open Computing Language (OpenCL) and DirectCompute from Microsoft. The high-level API is the CUDA Runtime API and the low-level API is the CUDA Driver API. Both are quite powerful. In early CUDA releases a GPU programmer had to use one or the other and could not mix function calls from both APIs in the same application; more recent releases allow the two to interoperate.
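To give a feel for the high-level Runtime API, the following is a minimal sketch that asks the runtime how many CUDA-capable devices are present and prints their names. The helper name count_cuda_devices is hypothetical and chosen only for illustration; cudaGetDeviceCount, cudaGetDeviceProperties, and cudaGetErrorString are Runtime API calls. The sketch assumes a machine with the CUDA toolkit installed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the number of CUDA-capable devices,
// or -1 if the runtime reports an error (for example, no driver installed).
int count_cuda_devices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    return count;
}

int main() {
    int count = count_cuda_devices();
    // List each device with its name and compute capability.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) == cudaSuccess) {
            std::printf("Device %d: %s (compute capability %d.%d)\n",
                        dev, prop.name, prop.major, prop.minor);
        }
    }
    return count > 0 ? 0 : 1;
}
```

The Runtime API initialises the device implicitly on first use, which is one reason it is usually regarded as the more convenient of the two APIs.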
The APIs can be regarded as a software platform that gives access to the parallel computational functionality of the GPU for the execution of so-called compute kernels. CUDA API software is available to parallel computing developers who use programming languages such as C. A program is written in CUDA C, that is, C with NVIDIA extensions and certain restrictions, since the code is intended to be executed on GPUs and not on conventional CPUs. After development, the CUDA C code is compiled with NVIDIA's nvcc compiler driver, whose early releases were based on the PathScale Open64 C compiler. According to NVIDIA, CUDA code works without modification on all GPU cards from the G8x series onwards, including the GeForce, Quadro, and Tesla product lines.
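As an illustration of what a compute kernel looks like in CUDA C, here is a minimal sketch of element-wise vector addition using the Runtime API. The kernel name vector_add, the helper vector_add_matches_reference, and the block size of 256 threads are assumptions made for this example, not something prescribed by CUDA. The __global__ qualifier and the <<<blocks, threads>>> launch syntax are examples of the NVIDIA extensions to C mentioned above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Compute kernel: each GPU thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partial block
        out[i] = a[i] + b[i];
    }
}

// Hypothetical helper: runs the kernel on n elements (n must be positive)
// and returns true if the GPU result equals the CPU result.
bool vector_add_matches_reference(int n) {
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float* h_a = static_cast<float*>(std::malloc(bytes));
    float* h_b = static_cast<float*>(std::malloc(bytes));
    float* h_out = static_cast<float*>(std::malloc(bytes));
    if (!h_a || !h_b || !h_out) {
        std::free(h_a); std::free(h_b); std::free(h_out);
        return false;
    }
    for (int i = 0; i < n; ++i) {
        h_a[i] = 1.0f * i;
        h_b[i] = 2.0f * i;
    }

    // Allocate device memory and copy the inputs to the GPU.
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc((void**)&d_a, bytes) == cudaSuccess
           && cudaMalloc((void**)&d_b, bytes) == cudaSuccess
           && cudaMalloc((void**)&d_out, bytes) == cudaSuccess
           && cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        const int threads = 256;                         // threads per block (assumed)
        const int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
        vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);
        // The device-to-host copy also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // Compare against the same addition performed on the CPU.
    for (int i = 0; ok && i < n; ++i) {
        if (h_out[i] != h_a[i] + h_b[i]) ok = false;
    }

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_out);
    std::free(h_a); std::free(h_b); std::free(h_out);
    return ok;
}

int main() {
    const bool ok = vector_add_matches_reference(1 << 20);
    std::printf("GPU result %s the CPU reference\n", ok ? "matches" : "does not match");
    return ok ? 0 : 1;
}
```

Assuming the file is saved as vector_add.cu (a name chosen here for illustration), it could be compiled with nvcc, for example nvcc vector_add.cu -o vector_add. The pattern of allocating device memory, copying inputs, launching the kernel, and copying results back is typical of small CUDA programs, although real applications usually add more thorough error handling.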
More on CUDA Programming
The following video covers this topic: