CUDA (Compute Unified Device Architecture)
CUDA stands for Compute Unified Device Architecture and is a massively parallel computing architecture invented by NVIDIA. It serves as the computing engine within NVIDIA GPUs and is well suited to tackling large computations in big data processing. A general introduction to GPUs can be found in our article GPU. Two generations of the CUDA architecture are NVIDIA Tesla (2007) and NVIDIA Fermi (2010); both are used for parallel processing in large GPU clusters in high performance computing systems today. The architecture is therefore essentially driven by the requirements of high performance computing, in particular accelerating floating point operations and matrix multiplications. This style of computing coined the term general-purpose computing on GPUs, known as GPGPU today.
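To make the floating point and matrix multiplication workloads mentioned above concrete, the following is a minimal sketch of a naive CUDA kernel for a single-precision matrix multiply C = A * B of square N x N matrices. The names (matMul, N) and the fixed matrix size are illustrative assumptions, not part of any particular library.

```cuda
#include <cuda_runtime.h>

#define N 512  // matrix dimension, chosen arbitrarily for this sketch

// Each GPU thread computes exactly one output element C[row][col].
__global__ void matMul(const float *A, const float *B, float *C) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}
```

The kernel would be launched with a 2D grid covering the whole matrix, e.g. matMul<<<grid, block>>>(dA, dB, dC), so that hundreds of thousands of threads each compute one dot product in parallel; this one-thread-per-element decomposition is what makes the architecture massively parallel.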
CUDA GPUs are used to accelerate a wide variety of applications that are not related to graphics. Examples include computational biology and applications that take advantage of the BOINC distributed computing client. One reason is that GPUs are very useful for algorithms that process large data sets in parallel, as is the case in molecular dynamics, but also in fast sorting algorithms, for example. In the context of large data centres it is difficult to run such applications directly on the virtual machines provided by hardware or infrastructure as a service offerings. However, vCUDA virtualizes the CUDA API library and can be installed on guest operating systems. Each call to the CUDA API from an application running on the guest operating system is intercepted by vCUDA and redirected to the CUDA API on the host operating system.
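The calls that such a virtualization layer intercepts are the ordinary CUDA runtime API calls a host program makes. The following sketch shows a typical sequence (cudaMalloc, cudaMemcpy, a kernel launch, cudaFree); on a guest operating system, a layer like vCUDA would catch each of these and forward it to the host. The kernel name "scale" and the buffer sizes are illustrative assumptions.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Trivial kernel: each thread scales one array element.
__global__ void scale(float *x, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= factor;
}

int main(void) {
    const int n = 1 << 20;                 // one million floats
    size_t bytes = n * sizeof(float);
    float *host = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    float *dev;
    cudaMalloc(&dev, bytes);                                 // API call: device allocation
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);    // API call: copy to GPU
    scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);           // kernel launch
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);    // API call: copy back
    cudaFree(dev);                                           // API call: free device memory

    printf("first element after scaling: %f\n", host[0]);
    free(host);
    return 0;
}
```

Because the application only ever talks to this API surface, redirecting these calls is sufficient for the guest to use a physical GPU it cannot access directly.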
More about CUDA
The following video gives a nice introduction to this topic: