Compute APIs: CUDA, DirectCompute, OpenCL, etc.
Heterogeneous computing: systems that use more than one kind of processor or core, such as:
- Digital Signal Processors (DSPs)
- Field Programmable Gate Arrays (FPGAs)
- other specialized processing capabilities
The main difference between OpenGL and OpenCL is their purpose: OpenGL is an API for graphics rendering, while OpenCL is a framework for general-purpose parallel computation on heterogeneous systems (CPUs, GPUs, DSPs, FPGAs).
CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform and application programming interface (API). It allows software to use CUDA-capable graphics processing units (GPUs) for general-purpose processing, an approach called general-purpose computing on GPUs (GPGPU). CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels.
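To make the idea of a compute kernel concrete, here is a minimal sketch in plain C++ that emulates, sequentially on the CPU, how a kernel is applied across a grid of threads. It is not CUDA code and uses no GPU: `ThreadContext`, `vector_add_kernel`, and `vector_add` are hypothetical names chosen for illustration. The one detail taken from the CUDA model is the index calculation, where each thread derives its global element index from its block index, the block size, and its index within the block.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-thread index values a GPU thread would see.
struct ThreadContext {
    std::size_t block_idx;   // which block this thread belongs to
    std::size_t block_dim;   // number of threads per block
    std::size_t thread_idx;  // position of this thread inside its block
};

// The "kernel": a function written for ONE element; the launch runs it many times.
void vector_add_kernel(const ThreadContext& ctx,
                       const std::vector<float>& a,
                       const std::vector<float>& b,
                       std::vector<float>& out) {
    // Global index, computed the same way as in the CUDA model.
    std::size_t i = ctx.block_idx * ctx.block_dim + ctx.thread_idx;
    // The grid is rounded up to whole blocks, so some threads have no element.
    if (i < out.size()) {
        out[i] = a[i] + b[i];
    }
}

// Sequential emulation of a kernel launch. On a GPU the iterations of these
// two loops would run in parallel; here they run one after another.
// Assumes a.size() == b.size() and threads_per_block > 0.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t threads_per_block) {
    std::vector<float> out(a.size());
    // Round up so every element is covered by some thread.
    std::size_t num_blocks =
        (a.size() + threads_per_block - 1) / threads_per_block;
    for (std::size_t block = 0; block < num_blocks; ++block) {
        for (std::size_t thread = 0; thread < threads_per_block; ++thread) {
            vector_add_kernel({block, threads_per_block, thread}, a, b, out);
        }
    }
    return out;
}
```

Calling `vector_add(a, b, 4)` on two 5-element vectors uses 2 blocks of 4 threads, so 8 kernel invocations run and the bounds check discards the last 3. In real CUDA code the two loops disappear: the kernel is launched once and the hardware supplies the block and thread indices.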
NUMA (non-uniform memory access) architecture: access time depends on where the memory is located relative to the processor, so memories are not created equal; some are local and some are remote. Each processor has its own local memory with low latency and high bandwidth, while remote memory (attached to another processor) is slower to access.
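The cost of remote accesses can be estimated with a simple weighted average. The sketch below is a back-of-the-envelope model only: `average_latency_ns` is a hypothetical helper, and any latency values passed to it are illustrative assumptions, not measurements of real hardware.

```cpp
#include <stdexcept>

// Average memory latency when a given fraction of accesses hit local memory
// and the rest go to remote memory. All inputs are caller-supplied assumptions.
double average_latency_ns(double local_fraction,
                          double local_ns,
                          double remote_ns) {
    if (local_fraction < 0.0 || local_fraction > 1.0) {
        throw std::invalid_argument("local_fraction must be in [0, 1]");
    }
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns;
}
```

With assumed latencies of 100 ns local and 200 ns remote, `average_latency_ns(0.5, 100.0, 200.0)` gives 150 ns, while keeping every access local gives 100 ns. This is why NUMA-aware programs try to place data in the memory local to the processor that uses it.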