GPU acceleration means that GPUs speed up computing through massive parallelism, running thousands of threads where conventional CPUs use only a few. A basic introduction to GPUs can be found in our article GPU, while this article describes in more detail how GPUs actually work. The first available GPUs can be considered simple coprocessors, each attached to a single CPU. Modern GPUs, such as those from NVIDIA, have for example 128 cores on a single GPU chip, and each core can work with eight threads of instructions. In this example, such a GPU can therefore execute 128 × 8 = 1024 threads concurrently, which enables massive parallelization of big data applications.
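The thread count in the example above is a simple product of cores and threads per core. The following Python sketch reproduces it and, under the simplifying assumption that every thread handles one data element per pass, estimates how many passes a data-parallel job would need. The names `concurrent_threads` and `passes_needed` are hypothetical helpers chosen for illustration and are not part of any GPU API.

```python
import math


def concurrent_threads(cores: int, threads_per_core: int) -> int:
    """Threads a GPU can execute at once: cores times threads per core."""
    return cores * threads_per_core


def passes_needed(elements: int, threads: int) -> int:
    """Passes needed if each thread handles one element per pass (assumed model)."""
    return math.ceil(elements / threads)


# Figures from the example in the text: 128 cores, eight threads per core.
threads = concurrent_threads(cores=128, threads_per_core=8)
print(threads)                            # 1024
print(passes_needed(1_000_000, threads))  # 977
```

Real GPUs schedule work in a more involved way, so the pass count is only a rough intuition for why more concurrent threads shorten data-parallel jobs.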
The GPU architecture is optimized for high throughput, with explicit management of the on-chip GPU memory. In contrast, CPU architectures are optimized for low latency and rely on caches. GPUs are designed to compute large numbers of floating-point operations in parallel. They are thus used to offload data-intensive calculations from the CPU. In other words, while an application runs, the CPU instructs the GPU to execute the floating-point computation kernels that process the data. The key interaction, and thus the major bottleneck, between CPU and GPU is the memory transfer. Data is loaded from the on-board main memory of the CPU into the on-chip GPU memory for computation, and the results are transferred back to the on-board main memory. Hence, it is important that the bandwidth between the two is high enough to keep the GPU supplied with data; only then are the floating-point operations efficient and the GPU acceleration real.
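To illustrate why the memory transfer can become the bottleneck, the following Python sketch models an offloaded computation as "transfer in, compute, transfer back". It is a deliberately simple model, not a measurement: the function names `offload_time` and `speedup` are hypothetical, and all bandwidth and throughput figures are assumed values chosen only to make the effect visible.

```python
def offload_time(bytes_in: float, bytes_out: float, flops: float,
                 bandwidth: float, gpu_rate: float) -> float:
    """Seconds for one offload: copy in, compute on the GPU, copy back.

    bandwidth is in bytes per second, gpu_rate in floating-point
    operations per second. Transfers and compute are assumed not to overlap.
    """
    transfer = (bytes_in + bytes_out) / bandwidth
    compute = flops / gpu_rate
    return transfer + compute


def speedup(bytes_in: float, bytes_out: float, flops: float,
            bandwidth: float, gpu_rate: float, cpu_rate: float) -> float:
    """CPU-only time divided by GPU offload time (values > 1 mean a gain)."""
    cpu_time = flops / cpu_rate
    return cpu_time / offload_time(bytes_in, bytes_out, flops,
                                   bandwidth, gpu_rate)


# Assumed, illustrative hardware figures (not taken from any real device).
BANDWIDTH = 16e9   # bytes/s between main memory and GPU memory
GPU_RATE = 1e13    # floating-point operations per second on the GPU
CPU_RATE = 1e11    # floating-point operations per second on the CPU

# Compute-heavy job: 4 GB in, 4 GB out, 1e12 operations.
heavy = speedup(4e9, 4e9, 1e12, BANDWIDTH, GPU_RATE, CPU_RATE)
# Transfer-heavy job: same data volume, but only 1e10 operations.
light = speedup(4e9, 4e9, 1e10, BANDWIDTH, GPU_RATE, CPU_RATE)

print(f"compute-heavy speedup:  {heavy:.1f}")   # 16.7
print(f"transfer-heavy speedup: {light:.1f}")   # 0.2
```

With these assumed numbers, the compute-heavy job gains from the GPU, while the transfer-heavy job runs slower than on the CPU alone because moving the data costs more time than the computation saves.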
More about GPU acceleration
We also recommend watching the following video: