Programming Languages - Compilers and Runtimes
- Runtime: some languages have large runtimes or full virtual machines, like Java (JVM); others have small runtimes, just a library that provides critical functions, like `libc` for C and the `runtime` package for Go.
- GC here refers to garbage collection performed at run time. Swift uses Automatic Reference Counting (ARC) instead, where the compiler inserts the retain/release calls at compile time.
- In Go, GC is provided by the `runtime` package, while in Java / JavaScript, GC is provided by the virtual machine (JVM / V8).
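To make "GC provided by the runtime" concrete, here is a minimal sketch in Python. It assumes the standard CPython interpreter, which is not covered in the notes above: CPython frees most objects by reference counting and ships a separate cycle collector in its runtime, exposed through the `gc` module. The `Node` class and `make_cycle` helper are hypothetical names used only for this illustration.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two nodes that reference each other; return a weak reference to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)  # a weak reference does not keep the object alive


gc.disable()                          # keep the automatic collector out of the demo
probe = make_cycle()                  # the cycle is now unreachable from the program
alive_before = probe() is not None    # reference counting alone cannot free a cycle
gc.collect()                          # ask the runtime's cycle collector to run
alive_after = probe() is not None     # the weak reference is cleared once collected
gc.enable()

print(alive_before, alive_after)
```

The program never frees the nodes itself; the collector that reclaims them lives in the language runtime, which is the same division of labor the Go and JVM bullets describe.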
VMs
- JVM (Java Virtual Machine): Java, Scala, Groovy, Kotlin
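- One argument for why Groovy is more popular than Scala: Groovy is almost fully source-compatible with Java. Take most Java classes, change the extension from `.java` to `.groovy`, and they will compile (note: the Groovy library must be installed). Groovy's own documentation lists a handful of differences from Java, so this holds for most code rather than literally all of it.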
- CLR (Common Language Runtime): C#
- HHVM (HipHop Virtual Machine): PHP/Hack. HHVM uses JIT compilation: PHP or Hack code is first transformed into intermediate HipHop bytecode (HHBC), which is then dynamically translated into x86-64 machine code, optimized, and natively executed. This contrasts with PHP's usual interpreted execution, in which the Zend Engine transforms PHP source code into opcodes (a form of bytecode) and executes them directly on the Zend Engine's virtual CPU.
LLVM
- LLVM is the backend AND the umbrella project name.
- LLVM is NOT a traditional virtual machine.
- It is NOT an acronym (though the name originally stood for Low Level Virtual Machine).
- It contains modularized compiler components and tool chains.
- Clang: frontend for C/C++/Objective-C/Objective-C++.
- Backend converts the LLVM Intermediate Representation (IR) to code for a specified machine or other languages (hardware OR software target)
- https://llvm.org
- Emscripten: an LLVM/Clang-based compiler that compiles C and C++ source code to WebAssembly.
- `rustc`, the Rust compiler, translates Rust code into low-level LLVM IR.
- Performance-critical code is often built with LLVM-based toolchains.
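The frontend / IR / backend split described above can be sketched with a toy pipeline. This is not LLVM IR and does not use any LLVM API: the three-address tuples, the `%tN` naming, and the functions `frontend`, `backend_asm`, and `backend_python` are all hypothetical, chosen only to show one frontend feeding two different backends (an imaginary register machine and another language).

```python
import ast


def frontend(source):
    """Toy frontend: lower an integer arithmetic expression to three-address IR."""
    ir = []
    ops = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

    def lower(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            lhs, rhs = lower(node.left), lower(node.right)
            dest = f"%t{len(ir)}"  # fresh temporary for this result
            ir.append((dest, ops[type(node.op)], lhs, rhs))
            return dest
        raise ValueError("unsupported expression")

    result = lower(ast.parse(source, mode="eval").body)
    return ir, result


def backend_asm(ir):
    """Toy 'hardware' backend: pseudo-assembly for an imaginary register machine."""
    return [
        f"{op} {dest.lstrip('%')}, {lhs.lstrip('%')}, {rhs.lstrip('%')}"
        for dest, op, lhs, rhs in ir
    ]


def backend_python(ir, result):
    """Toy 'software' backend: emit Python source computing the same value."""
    symbols = {"add": "+", "sub": "-", "mul": "*"}
    lines = [
        f"{dest.lstrip('%')} = {lhs.lstrip('%')} {symbols[op]} {rhs.lstrip('%')}"
        for dest, op, lhs, rhs in ir
    ]
    lines.append(f"result = {result.lstrip('%')}")
    return "\n".join(lines)


ir, result = frontend("1 + 2 * 3")
asm = backend_asm(ir)                    # same IR, machine-like target
py_source = backend_python(ir, result)   # same IR, source-language target

namespace = {}
exec(py_source, namespace)               # run the generated program to check it
print(asm)
print(namespace["result"])
```

Both backends consume the same IR and know nothing about the input syntax, which is why a new frontend (such as `rustc` or Clang in the real project) can reuse every existing backend.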
Clang vs GCC
The two primary C/C++ compilers are Clang and GCC.
| | Clang | GCC |
|---|---|---|
| License | Apache 2.0 with LLVM Exceptions | GPL / LGPL |
| C++ Standard Library | libc++ | libstdc++ |
| C Standard Library | | glibc |
| Parent Project | LLVM | GNU |
| Website | https://clang.llvm.org/ | https://gcc.gnu.org/ |
Notes:
- Clang originated at Apple but is now widely used (Google, Facebook, Microsoft, Intel, Qualcomm, and Huawei all appear on the LLVM Foundation sponsor list: https://foundation.llvm.org/docs/sponsors/).
- Newer languages such as Swift and Rust are built on the LLVM framework.
- GCC was deprecated in Android in 2019 and removed in 2020
- GCC is the official compiler of the GNU project and the standard compiler on most Linux systems.
- Clang supports a wide variety of C standard library implementations.
Conclusion: use Clang/LLVM if you do not have to use GCC.
Install Clang
On Debian / Ubuntu:
$ sudo apt install clang libc++-dev libc++abi-dev
Verify:
$ clang --version
clang version xx.x.x
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
swc
A TypeScript / JavaScript compiler written in Rust (https://swc.rs/).
Used by tools like Next.js, Parcel, and Deno.
AOT vs JIT
- Static compiler, or ahead-of-time (AOT) compiler: runs at compile time, e.g. from `.java` source to `.class` bytecode.
- Dynamic compiler, or just-in-time (JIT) compiler: runs at run time, e.g. in Java, compiles bytecode to native instructions.
Bytecode is portable, but native code is not.
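The source-to-bytecode step can be observed directly in Python, used here as a stand-in for the `.java` to `.class` step. The sketch assumes the standard CPython interpreter, which compiles source to its own bytecode before executing it; the `add` function and the `<example>` filename are made up for the illustration, and the exact instructions differ between CPython versions.

```python
import dis

source = "def add(a, b):\n    return a + b\n"

# Ahead-of-execution step: source text -> code object holding bytecode.
module_code = compile(source, "<example>", "exec")

namespace = {}
exec(module_code, namespace)          # the interpreter executes the bytecode
add = namespace["add"]

raw_bytecode = add.__code__.co_code   # bytes: the function's compiled instructions
opnames = [ins.opname for ins in dis.get_instructions(add)]

print(type(raw_bytecode).__name__, len(raw_bytecode))
print(opnames)
print(add(2, 3))
```

The bytecode is a sequence of instructions for a virtual machine rather than for a particular CPU, which is what makes it portable. Turning it into native instructions is the separate job a JIT compiler performs at run time.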
The Java JIT comes in 2 flavors:
- Client-side compiler (with the `-client` option): uses fewer resources, suited to applications sensitive to startup time.
- Server-side compiler (with the `-server` option): suited to long-running applications, applies more advanced optimizations.
Besides Java, PyPy provides a JIT for Python, and V8 JIT-compiles JavaScript to native machine code.
Compiler vs Interpreter
A script is program code that doesn’t need pre-processing (e.g. compiling) before being run.
- Compiled to standalone executables: C/C++, COBOL.
- Run in an interpreter: Perl, Tcl.
- Java needs both a bytecode compiler and a runtime interpreter.
- Python and R have no compile-time type safety; type errors only surface at run time.
- Compiled-to-JavaScript languages: Dart, TypeScript, Flow.
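The point about Python lacking compile-time type safety can be checked with a short sketch. The `total_length` function and the `<example>` filename are hypothetical; the code assumes a plain Python interpreter with no external type checker in the toolchain.

```python
def total_length(items):
    """Sum the lengths of the given items."""
    return sum(len(item) for item in items)


# The call below passes an int where a sized object is expected.
source = 'total_length(["ab", 3])\n'

# Compilation succeeds: the compiler does not check argument types.
code = compile(source, "<example>", "exec")

try:
    exec(code, {"total_length": total_length})
    outcome = "no error"
except TypeError as exc:
    # The mistake only shows up when the offending line actually runs.
    outcome = f"TypeError at run time: {exc}"

print(outcome)
```

In a statically typed, compiled language the equivalent mistake would normally be rejected before the program ever ran.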