LLVM vs JVM

Updated: 2021-11-19

Compile time:

  • JVM: a language compiler (e.g. javac / scalac / kotlinc) compiles source code (e.g. Java / Scala / Kotlin) to bytecode;
  • LLVM:
    • front-end (e.g. Clang / Flang) compiles source code to IR (intermediate representation), which was designed from the beginning to be a portable assembly;
    • back-end turns IR into a native binary executable.

Runtime:

  • JVM: provides garbage collection, access to resources, etc. The JVM interprets Java bytecode (and JIT-compiles frequently executed code), so it has to be running during program execution (bigger size and higher overhead).
  • LLVM: not needed at runtime, since it generates architecture-specific executables in advance.

Just-in-time and Ahead-of-time

As discussed above, LLVM is primarily ahead-of-time, whereas the JVM is primarily just-in-time.

LLVM also supports just-in-time compilation (based on the generated IR), since in some cases code needs to be generated on the fly, e.g. when using the REPL in Julia. The lli tool takes a program in LLVM bitcode format and executes it directly, using either a just-in-time compiler or an interpreter.

GraalVM can compile Java applications ahead-of-time to native binaries (via Native Image).
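
To make the ahead-of-time / just-in-time distinction in this section a little more concrete, here is a minimal sketch. It is a toy model only: Python's built-in compile() and eval() stand in for a real code generator, and the names SOURCES, compile_ahead_of_time and LazyCompiler are made up for illustration. It does not reflect how LLVM or the JVM are actually implemented.

```python
# Toy sketch: Python's built-in compile()/eval() stand in for a real code generator.

# Hypothetical "program": three functions of x, given as expression source text.
SOURCES = {"square": "x * x", "double": "x + x", "negate": "-x"}


def compile_ahead_of_time(sources):
    """AOT style: translate every function before the program runs."""
    return {name: compile(src, f"<aot:{name}>", "eval") for name, src in sources.items()}


class LazyCompiler:
    """JIT style: translate a function only when it is first called."""

    def __init__(self, sources):
        self.sources = sources
        self.cache = {}  # name -> compiled code object

    def call(self, name, x):
        if name not in self.cache:  # the first call pays the compilation cost
            self.cache[name] = compile(self.sources[name], f"<jit:{name}>", "eval")
        return eval(self.cache[name], {"x": x})


aot_code = compile_ahead_of_time(SOURCES)
print(len(aot_code))          # 3: everything was compiled up front

jit = LazyCompiler(SOURCES)
print(jit.call("square", 7))  # 49
print(jit.call("square", 8))  # 64, reusing the cached code object
print(len(jit.cache))         # 1: only the function actually used was compiled
```

The trade-off the sketch hints at: the ahead-of-time path does all translation work before execution and needs no compiler at runtime, while the lazy path keeps the compiler around during execution but only translates what is actually called.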

Register-based vs Stack-based

  • LLVM: low-level, register-based virtual machine (LLVM IR instructions name their operands and results as virtual registers, of which there is an unlimited supply). It is designed to abstract the underlying hardware and draw a clean line between a compiler back-end (machine code generation) and front-end (parsing, etc.).
  • JVM: higher-level, stack-based virtual machine (rather than loading values into registers, JVM bytecode pushes values onto an operand stack and computes results from there).
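
The difference between the two models can be sketched with two tiny evaluators for the expression (a + b) * c. This is a toy illustration under simplifying assumptions: the instruction formats, run_stack and run_register are invented for this example and are neither real JVM bytecode nor real LLVM IR.

```python
# Toy illustration only: neither evaluator models real JVM bytecode or LLVM IR.

def run_stack(program, env):
    """Stack machine: operands are pushed, operators pop them and push the result."""
    stack = []
    for op, *args in program:
        if op == "load":                 # push a variable's value
            stack.append(env[args[0]])
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()


def run_register(program, env):
    """Register machine: every instruction names its destination and operands."""
    regs = dict(env)                     # variables and temporaries share one register file
    result = None
    for dest, op, lhs, rhs in program:
        if op == "add":
            regs[dest] = regs[lhs] + regs[rhs]
        elif op == "mul":
            regs[dest] = regs[lhs] * regs[rhs]
        else:
            raise ValueError(f"unknown op: {op}")
        result = regs[dest]
    return result                        # value written by the last instruction


# (a + b) * c in both encodings
STACK_PROGRAM = [("load", "a"), ("load", "b"), ("add",), ("load", "c"), ("mul",)]
REGISTER_PROGRAM = [("t0", "add", "a", "b"), ("t1", "mul", "t0", "c")]

env = {"a": 2, "b": 3, "c": 4}
print(run_stack(STACK_PROGRAM, env))        # 20
print(run_register(REGISTER_PROGRAM, env))  # 20
```

Note that the stack program needs five short instructions with implicit operands, while the register program needs two longer instructions with explicit operands.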

Cross Language

Because language compilers only need to target IR or bytecode, both LLVM and JVM can support multiple languages.

  • LLVM: C, C++, Rust, Swift, Fortran, Kotlin/Native, etc.
  • JVM: Java, Kotlin/JVM, Scala, Groovy, Clojure, etc.
    • GraalVM: Java, JavaScript, Python, Ruby, R, WASM, etc.

LLVM provides primitives for common programming language features, e.g. functions, global variables, coroutines and C foreign-function interfaces. LLVM has many of these as standard elements in its IR. The split between front-end and back-end frees high-level language compilers from having to target every platform (they only need to emit LLVM intermediate representation).
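
The benefit of the front-end / back-end split can be sketched as follows. This is a toy model under loud assumptions: the tuple "IR", the two input syntaxes, the two pseudo-assembly targets and all function names are hypothetical and have nothing to do with real LLVM IR or real instruction sets. The point is only that each front-end and each back-end is written once and meets the others at the shared representation.

```python
# Toy sketch: the "IR" is a made-up (op, lhs, rhs) tuple, not LLVM IR.

def frontend_infix(source):
    """Hypothetical front-end for input such as '2 + 3'."""
    lhs, op, rhs = source.split()
    return (op, int(lhs), int(rhs))


def frontend_prefix(source):
    """Hypothetical front-end for input such as '(+ 2 3)'."""
    op, lhs, rhs = source.strip("()").split()
    return (op, int(lhs), int(rhs))


MNEMONICS = {"+": "add", "*": "mul"}


def backend_target_a(ir):
    """Hypothetical back-end emitting three-operand pseudo-assembly."""
    op, lhs, rhs = ir
    return f"{MNEMONICS[op]} r0, #{lhs}, #{rhs}"


def backend_target_b(ir):
    """Hypothetical back-end emitting two-operand pseudo-assembly."""
    op, lhs, rhs = ir
    return f"mov r0, {lhs}\n{MNEMONICS[op]} r0, {rhs}"


# Two different source syntaxes lower to the same IR ...
ir_from_infix = frontend_infix("2 + 3")
ir_from_prefix = frontend_prefix("(+ 2 3)")
assert ir_from_infix == ir_from_prefix == ("+", 2, 3)

# ... so every back-end works for every front-end without extra code.
for backend in (backend_target_a, backend_target_b):
    print(backend(ir_from_infix))
```

With N languages and M targets, this arrangement needs N front-ends plus M back-ends rather than N × M separate compilers.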

LLVM front-ends:

  • Clang: for the C family of languages, e.g. C, C++, Objective-C
  • Flang: for Fortran, first shipped with LLVM 11

Implementation

  • LLVM itself is written in C++; it provides C and C++ APIs.
  • OpenJDK's HotSpot JVM is written in C++.
  • GraalVM is written in Java.