Notes about computer architecture and hardware.
Computer Architecture = microarchitecture + instruction set architecture
- CISC: Complex Instruction Set Computer
- RISC: Reduced Instruction Set Computer, simplifies the processor by efficiently implementing only the instructions that are frequently used in programs
- x86: CISC, originated by Intel, popular in data centers and servers.
- Power: RISC, created by IBM (see the OpenPOWER Foundation). Google's data centers run on both x86 and Power.
- ARM: RISC, power efficient (compared to x86), used in iPhone chips. AWS uses ARM for some EC2 instance types.
From this article
Google found that the performance of its web search algorithm, the heart and soul of the company, scaled well with both the number of cores and the number of threads available to it. IBM's POWER9 processor is a many-core, many-thread beast. Variants of the chip range from 12 to 24 cores, with eight threads per core for the 12-core version and four threads per core for the 24-core version. Intel's chips support only two threads per core via hyperthreading.
They're not well suited for workloads that don't benefit from more threads, which is why the market-share ceiling for POWER isn't all that high.
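The core and thread counts quoted above can be checked with simple arithmetic. The sketch below is only an illustration: the function name `hardware_threads` is made up for this note, and the inputs are the figures from the quote.

```python
# Hardware threads per chip = cores * threads per core (SMT width).
# `hardware_threads` is a hypothetical helper name, not a real API.
def hardware_threads(cores: int, threads_per_core: int) -> int:
    return cores * threads_per_core


# POWER9 variants as described in the quoted article.
power9_12_core = hardware_threads(12, 8)  # 12 cores, 8 threads per core
power9_24_core = hardware_threads(24, 4)  # 24 cores, 4 threads per core

print(power9_12_core, power9_24_core)  # both variants expose 96 threads
```

Both variants expose the same number of hardware threads; they differ in how those threads are spread across cores.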
ASIC: Application-Specific Integrated Circuit. Customized for a particular use. Notable examples:
- TPU: Tensor processing unit, developed by Google to accelerate computing in Neural Networks.
- AWS Nitro System: ASICs designed by Annapurna Labs offload network, storage, and management work from the main CPU.
- High-efficiency Bitcoin miners.
FPGA: Field-Programmable Gate Array.
- Programmable logic blocks and programmable interconnects allow the same FPGA to be used in many different applications.
- For prototypes, smaller designs or lower production volumes, FPGAs may be more cost effective than an ASIC design, even in production.
NVM Express (NVMe): connects SSDs over the PCIe bus instead of SATA. It reduces I/O overhead and improves performance relative to previous logical-device interfaces, with multiple long command queues and reduced latency.
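PUE: Power Usage Effectiveness, the ratio of total facility power to the power used by IT equipment:
- PUE=1.0: all power is used by servers
- PUE=2.0: half of the power is used by the building (cooling, lighting, power distribution), half by servers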
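PUE is total facility power divided by the power drawn by the IT equipment, which is how the two cases above come out. A minimal sketch, with hypothetical function names and made-up kilowatt figures:

```python
# PUE = total facility power / IT equipment power.
# Function names and the kW figures below are illustrative assumptions.
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_fraction(pue_value: float) -> float:
    """Share of total power that does not reach the IT equipment."""
    return 1 - 1 / pue_value


print(pue(1000, 1000))         # every kW goes to servers
print(pue(2000, 1000))         # as much power for the building as for servers
print(overhead_fraction(2.0))  # share of power used by the building
```

The closer PUE is to 1.0, the smaller the share of power spent on cooling, lighting, and power distribution.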
25 to 40 W
Unlike an SSD, RAM is volatile, so it needs power at all times to retain its data.
- DDR1 (2.5 Volts): 4 to 5.5 W
- DDR2 (1.8 Volts): 3 to 4.5 W
- DDR3 (1.5 Volts): 2 to 3 W
- DDR4 (1.2 Volts)
- DDR5 (1.1 Volts)
- SSD: 0.6 to 2.8 W
- HDD (3.5"): 6.5 to 9 W
- 100 to 250 W
NUMA (Non-Uniform Memory Access) architecture: access time depends on which part of memory is being accessed, i.e. not all memory is equal: some is local and some is remote. Each processor has its own local memory with low latency and high bandwidth, while remote memory is slower to access.
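A rough way to see why locality matters is to compute the average access latency as a weighted mix of local and remote accesses. This is a simplified model, not a measurement: the function name and the latency figures (80 ns local, 140 ns remote) are assumptions chosen only for illustration.

```python
# Simplified NUMA model: average latency is the weighted mean of
# local and remote access latencies. All numbers are assumed values.
def average_latency_ns(local_fraction: float,
                       local_ns: float,
                       remote_ns: float) -> float:
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


LOCAL_NS = 80.0    # assumed local-node latency
REMOTE_NS = 140.0  # assumed remote-node latency

for fraction in (1.0, 0.5, 0.0):
    print(fraction, average_latency_ns(fraction, LOCAL_NS, REMOTE_NS))
```

Under this model, keeping a thread and its data on the same node pulls the average toward the local latency, which is the motivation for NUMA-aware scheduling and allocation.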