Memory Management

Updated: 2022-02-05

Stack vs Heap:

  • data on the stack must have a known, fixed size at compile time
  • data whose size is unknown at compile time, or whose size might change, must be stored on the heap
  • heap operations (allocation, access) are more expensive (slower) than stack operations
  • the heap also requires bookkeeping (tracking which code uses which data, minimizing duplicate data, cleaning up unused data), which ownership addresses
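
The distinction above can be sketched in Rust. This is a minimal illustration, not a benchmark; the function names `stack_values` and `heap_values` are made up for this example.

```rust
use std::mem::size_of_val;

/// Returns a fixed-size array. Its size is known at compile time,
/// so the whole value can live on the stack.
fn stack_values() -> [i32; 3] {
    [1, 2, 3]
}

/// Builds a growable vector. Its length is only known at runtime,
/// so the elements are stored in a heap buffer.
fn heap_values(n: usize) -> Vec<i32> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i as i32); // may reallocate the heap buffer as it grows
    }
    v
}

fn main() {
    let on_stack = stack_values();
    let on_heap = heap_values(5);

    // The array occupies its three i32 elements in place.
    println!("array bytes: {}", size_of_val(&on_stack));
    // The Vec handle has a fixed size; the elements live elsewhere, on the heap.
    println!("vec handle bytes: {}", size_of_val(&on_heap));
    println!("vec len: {}", on_heap.len());
}
```

The `Vec` handle itself is a fixed-size value, which is why it can sit on the stack while the data it manages grows on the heap.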

Manual Memory Management (Explicit Deallocation)

The programmer needs to explicitly free the memory. The problems:

  • If the programmer deallocates memory prematurely, the result is a dangling pointer: a pointer that no longer points to a valid object in memory.
  • If the programmer forgets to free an object, the program leaks memory as it fills up with more and more unused objects. This can slow the program down or crash it once it runs out of memory.
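
For comparison with the Rust examples later on this page, here is a sketch of explicit allocation and deallocation written in unsafe Rust with the standard `std::alloc` functions. The helper name `manual_roundtrip` is hypothetical; the comments mark where each of the two problems above would occur.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates one i32 on the heap by hand, writes to it, reads it back,
/// and frees it again.
fn manual_roundtrip(value: i32) -> i32 {
    let layout = Layout::new::<i32>();
    unsafe {
        let ptr = alloc(layout) as *mut i32;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        ptr.write(value);
        let read_back = ptr.read();

        // Forgetting this call would leak the allocation.
        dealloc(ptr as *mut u8, layout);

        // `ptr` is now dangling: reading or writing through it after this
        // point would be undefined behaviour, so it is not touched again.
        read_back
    }
}

fn main() {
    println!("{}", manual_roundtrip(42));
}
```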

C / C++

C and C++ provide ways to manually allocate and deallocate heap memory.

  • C: Use functions malloc() / calloc() and free().
  • C++: Use keywords new and delete.

Read more: C / C++ - Memory Management

Automated Memory Management

Reference counting

A reference-counting collector tracks the number of strong references that other objects hold to an object. As soon as the counter drops to 0, the object is reclaimed immediately. However, if objects reference each other in a cycle, their counters never reach zero and they are never reclaimed.
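
A small sketch of the counter in action, using Rust's standard `Rc` type as one concrete reference-counted pointer. The function name `observe_counts` is made up for this example.

```rust
use std::rc::Rc;

/// Returns the strong count after creating a value, after cloning a
/// second handle to it, and after dropping that second handle.
fn observe_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let after_create = Rc::strong_count(&first);

    let second = Rc::clone(&first); // increments the counter
    let after_clone = Rc::strong_count(&first);

    drop(second); // decrements the counter
    let after_drop = Rc::strong_count(&first);

    (after_create, after_clone, after_drop)
    // `first` goes out of scope here; the count reaches 0 and the String is freed.
}

fn main() {
    println!("{:?}", observe_counts());
}
```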

Weak references: To break a strong reference cycle, make one of the references weak (or, in Swift, unowned). In Swift you add the keyword before the variable declaration; when you then assign an object to that variable, the object's reference counter is not incremented.
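
The same idea can be sketched in Rust, where the weak handle is the library type `Weak` rather than a keyword. The `Node` type and `build_family` function are assumptions made for this illustration: a parent owns its children strongly, and each child points back weakly.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    parent: RefCell<Weak<Node>>,      // weak back-reference, does not keep the parent alive
    children: RefCell<Vec<Rc<Node>>>, // strong references, keep the children alive
}

/// Returns the parent's strong count, the child's strong count, and whether
/// the child's weak link is dead once the parent has been dropped.
fn build_family() -> (usize, usize, bool) {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));

    let parent_strong = Rc::strong_count(&parent); // the weak link is not counted
    let child_strong = Rc::strong_count(&child);   // local handle + parent's list

    drop(parent); // no cycle of strong references, so the parent is freed here
    let parent_gone = child.parent.borrow().upgrade().is_none();

    (parent_strong, child_strong, parent_gone)
}

fn main() {
    println!("{:?}", build_family());
}
```

Had the back-reference been a strong `Rc`, dropping the local `parent` handle would have left both nodes alive with a count of 1 each.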

Generational

The generational hypothesis assumes that most objects die young: short-lived objects, such as temporary variables, make up most of what is reclaimed. A generational garbage collector therefore scans recently allocated objects more often than older ones.

Memory Safety Without Garbage Collectors

C++: Smart Pointers

Read more: C++ - Smart Pointers

Rust: Ownership

Ownership is how Rust manages memory. It is a set of rules that the compiler checks at compile time, so enforcing them does not slow down the program while it runs.

Ownership Rules:

  • Each value in Rust has a variable that’s called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.
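
The third rule can be observed by implementing `Drop` on a small type. In this sketch the `Tracked` type and the `count_drops` function are hypothetical helpers that count how many times a value is dropped.

```rust
use std::cell::Cell;

/// A value that increments a shared counter when it is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns the drop count after an inner scope ends, after a move,
/// and after the new owner is dropped explicitly.
fn count_drops() -> (u32, u32, u32) {
    let drops = Cell::new(0);

    {
        let _owner = Tracked { drops: &drops };
    } // `_owner` goes out of scope here, so the value is dropped
    let after_scope = drops.get();

    let first = Tracked { drops: &drops };
    let second = first; // ownership moves to `second`; nothing is dropped
    let after_move = drops.get();

    drop(second); // the single owner gives the value up
    let after_drop = drops.get();

    (after_scope, after_move, after_drop)
}

fn main() {
    println!("{:?}", count_drops());
}
```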

Example:

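fn main() {
  let a = vec![1, 2, 3];
  let b = a; // ownership of the vector moves from `a` to `b`
  println!("a: {:?}, b: {:?}", a, b); // error[E0382]: borrow of moved value: `a`
}

The compiler rejects this program because ownership of the vector moved from a to b on the second line of main, so a can no longer be used in the println! call on the third line.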

In comparison, a language with a garbage collector would accept the equivalent code: a and b would refer to the same object, and the collector would reclaim it only some time after its last use. That is convenient for the developer but less efficient in terms of memory.

Another example:

let s1 = String::from("hello");
let s2 = s1;
// s1 is not valid anymore

// Alternatively
let s1 = String::from("hello");
let s2 = s1.clone();
// both, `s1` and `s2` are valid

Swift

Swift compiles to native machine code by default and manages memory with ARC (Automatic Reference Counting). Garbage collection on Android is performed by the runtime while the app runs, whereas with ARC the compiler inserts the retain and release calls at compile time; the counters themselves are still updated while the app runs.

Using the weak keyword is normal on iOS and can even be considered good practice when you make wide use of the Delegation pattern. On Android it is not a common practice.

Because of retain cycles, iOS developers sometimes need to write more complex code for simple things than Android developers do.

Another problem is the Lapsed Listener problem. In short, when you register a listener and forget to unregister it, the event source keeps the listener alive and you end up with a memory leak in your app. https://en.wikipedia.org/wiki/Lapsed_listener_problem
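
One common mitigation is for the event source to hold its listeners weakly. The sketch below shows that idea in Rust rather than Swift; `Listener`, `EventSource`, and `lapsed_listener_demo` are hypothetical names, and this is only one possible design.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Listener {
    received: RefCell<Vec<String>>,
}

/// An event source that holds its listeners weakly, so a listener that is
/// never unregistered does not stay alive because of the registration.
struct EventSource {
    listeners: Vec<Weak<Listener>>,
}

impl EventSource {
    fn register(&mut self, listener: &Rc<Listener>) {
        self.listeners.push(Rc::downgrade(listener));
    }

    /// Delivers the event to every live listener, prunes dead entries,
    /// and returns how many listeners were notified.
    fn emit(&mut self, event: &str) -> usize {
        let mut delivered = 0;
        self.listeners.retain(|weak| match weak.upgrade() {
            Some(listener) => {
                listener.received.borrow_mut().push(event.to_string());
                delivered += 1;
                true
            }
            None => false, // the listener was dropped without unregistering
        });
        delivered
    }
}

fn lapsed_listener_demo() -> (usize, usize, usize, usize) {
    let mut source = EventSource { listeners: Vec::new() };
    let a = Rc::new(Listener { received: RefCell::new(Vec::new()) });
    let b = Rc::new(Listener { received: RefCell::new(Vec::new()) });
    source.register(&a);
    source.register(&b);

    let first = source.emit("start");
    drop(b); // `b` is never unregistered
    let second = source.emit("stop");

    let seen_by_a = a.received.borrow().len();
    (first, second, source.listeners.len(), seen_by_a)
}

fn main() {
    println!("{:?}", lapsed_listener_demo());
}
```

With strong references in `listeners`, `b` would have stayed alive for as long as the source did.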

Garbage Collectors

Garbage collectors operate on the heap, not the stack.

  • mark and sweep garbage collector: two phases, unsurprisingly named mark and sweep. In the mark phase the collector starts from the roots, traverses the object graph, and marks every object it can reach. The follow-up sweep phase walks the heap and reclaims the objects that were left unmarked, since they are no longer needed.
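
The two phases can be sketched over a toy heap in Rust. This is an illustration under simplifying assumptions, not how any production collector is implemented: objects are slots in a vector, references are slot indices, and the names `Heap`, `Object`, and `demo_collect` are made up.

```rust
struct Object {
    refs: Vec<usize>, // indices of the objects this object points to
    marked: bool,
}

struct Heap {
    objects: Vec<Option<Object>>, // None = free slot
    roots: Vec<usize>,
}

impl Heap {
    fn alloc(&mut self, refs: Vec<usize>) -> usize {
        self.objects.push(Some(Object { refs, marked: false }));
        self.objects.len() - 1
    }

    fn add_ref(&mut self, from: usize, to: usize) {
        if let Some(obj) = self.objects[from].as_mut() {
            obj.refs.push(to);
        }
    }

    /// Mark phase: flag everything reachable from the roots.
    fn mark(&mut self) {
        let mut stack = self.roots.clone();
        while let Some(id) = stack.pop() {
            if let Some(obj) = self.objects[id].as_mut() {
                if !obj.marked {
                    obj.marked = true;
                    stack.extend(obj.refs.iter().copied());
                }
            }
        }
    }

    /// Sweep phase: free unmarked objects and clear marks for the next cycle.
    fn sweep(&mut self) -> usize {
        let mut freed = 0;
        for slot in self.objects.iter_mut() {
            let live = slot.as_ref().map_or(false, |obj| obj.marked);
            if live {
                if let Some(obj) = slot.as_mut() {
                    obj.marked = false;
                }
            } else if slot.is_some() {
                *slot = None;
                freed += 1;
            }
        }
        freed
    }

    fn collect(&mut self) -> usize {
        self.mark();
        self.sweep()
    }
}

/// Builds root -> a -> b plus an unreachable cycle c <-> d, then collects.
fn demo_collect() -> (usize, usize) {
    let mut heap = Heap { objects: Vec::new(), roots: Vec::new() };
    let b = heap.alloc(vec![]);
    let a = heap.alloc(vec![b]);
    let c = heap.alloc(vec![]);
    let d = heap.alloc(vec![c]);
    heap.add_ref(c, d); // c and d reference each other but nothing reaches them
    heap.roots.push(a);

    let freed = heap.collect();
    let live = heap.objects.iter().filter(|slot| slot.is_some()).count();
    (freed, live)
}

fn main() {
    println!("{:?}", demo_collect());
}
```

Because reachability is computed from the roots, the unreachable cycle is reclaimed, which plain reference counting cannot do.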

Java and JVM Languages

Multiple garbage collectors provided, primarily:

  • G1 Garbage Collector: default GC since Java 9
  • Z garbage collector
  • Shenandoah Garbage Collector

Read more: Java Garbage Collection

Python

CPython combines reference counting with a generational collector.

  • Reference counting for non-cyclic cases: when the reference count of an object reaches 0, the object is cleaned up immediately.
  • Generational garbage collection for cycles: a type of tracing garbage collection. It can break reference cycles and delete unused objects even if they refer to themselves. Newly created objects are put in the Generation 0 list. During a collection, a list is created for the objects to discard and reference cycles are detected; an object with no outside references is discarded. The objects that survive this process are moved to the Generation 1 list. The same steps are applied to the Generation 1 list, and its survivors are moved to the Generation 2 list. Objects in the Generation 2 list stay there until they are collected or the program ends.
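
The promotion scheme described above can be modelled with a toy in Rust. This is not CPython's implementation: the `reachable` flag stands in for the result of cycle detection, each call collects a single generation only, and the names `Generations` and `Obj` are assumptions made for the sketch.

```rust
struct Obj {
    id: u32,
    reachable: bool, // stand-in for "still has outside references"
}

struct Generations {
    gens: [Vec<Obj>; 3], // generation 0 (youngest) to generation 2 (oldest)
}

impl Generations {
    fn new() -> Self {
        Generations { gens: [Vec::new(), Vec::new(), Vec::new()] }
    }

    /// New objects always start in generation 0.
    fn allocate(&mut self, id: u32, reachable: bool) {
        self.gens[0].push(Obj { id, reachable });
    }

    /// Collects generation `g`: discards unreachable objects and promotes the
    /// survivors to the next generation (the oldest generation keeps its own
    /// survivors). Returns the number of discarded objects.
    fn collect(&mut self, g: usize) -> usize {
        let candidates = std::mem::take(&mut self.gens[g]);
        let before = candidates.len();
        let survivors: Vec<Obj> = candidates.into_iter().filter(|o| o.reachable).collect();
        let discarded = before - survivors.len();
        let target = (g + 1).min(2);
        self.gens[target].extend(survivors);
        discarded
    }

    fn ids(&self, g: usize) -> Vec<u32> {
        self.gens[g].iter().map(|o| o.id).collect()
    }
}

fn main() {
    let mut heap = Generations::new();
    heap.allocate(1, true);
    heap.allocate(2, false);
    heap.allocate(3, true);
    println!("discarded from gen 0: {}", heap.collect(0));
    println!("gen 1 now holds: {:?}", heap.ids(1));
}
```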

Go

Go prefers to allocate memory on the stack, so most memory allocations end up there. Each goroutine has its own stack, and whenever possible Go allocates variables on it.

Go's garbage collector is a non-generational, concurrent, tri-color mark and sweep garbage collector.

A generational garbage collector is not necessary in Go: escape analysis in the Go compiler lets it allocate objects with a known lifetime on the stack. This means fewer objects end up on the heap, so fewer objects need to be garbage collected.

Concurrent means that the collector runs at the same time as mutator threads.
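
The tri-color part can be sketched as follows. The sketch is written in Rust to match the other examples on this page, it is single-threaded, and it leaves out everything that makes Go's collector concurrent (such as write barriers); `Color` and `tri_color_mark` are hypothetical names. White objects have not been seen, gray objects have been seen but not yet scanned, and black objects have been fully scanned.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Color {
    White, // not yet reached: garbage if still white when marking ends
    Gray,  // reached, but its outgoing references are not scanned yet
    Black, // reached and fully scanned
}

/// `edges[i]` lists the objects that object `i` references.
/// Returns the final color of every object.
fn tri_color_mark(edges: &[Vec<usize>], roots: &[usize]) -> Vec<Color> {
    let mut colors = vec![Color::White; edges.len()];
    let mut gray: VecDeque<usize> = VecDeque::new();

    // Roots are the first objects to turn gray.
    for &root in roots {
        if colors[root] == Color::White {
            colors[root] = Color::Gray;
            gray.push_back(root);
        }
    }

    // Scan gray objects until none are left.
    while let Some(id) = gray.pop_front() {
        for &child in &edges[id] {
            if colors[child] == Color::White {
                colors[child] = Color::Gray;
                gray.push_back(child);
            }
        }
        colors[id] = Color::Black;
    }

    colors
}

fn main() {
    // 0 -> 1 -> 2 is reachable from the root; 3 <-> 4 is an unreachable cycle.
    let edges = vec![vec![1], vec![2], vec![], vec![4], vec![3]];
    println!("{:?}", tri_color_mark(&edges, &[0]));
}
```

Once no gray objects remain, every white object is unreachable and can be swept.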

C++23 Removes Garbage Collection Support

C++ itself does not have a garbage collector, but from C++11 to C++20 the standard provided a small API intended to support garbage-collected implementations. Examples of virtual machines written in C++ with support for garbage collection:

  • WebKit's JavaScriptCore uses a garbage collector called Riptide.
  • Chromium's Blink uses a garbage collector called Oilpan.
  • The V8 JavaScript engine used by Chromium also has its own garbage collector called Orinoco.
  • Firefox’s SpiderMonkey JavaScript engine
  • Lua and LuaJIT

C++23 removes garbage collection support because each garbage collector has its own set of design criteria, which influence how the language itself is implemented. These virtual machines use similar ideas, but the design is different in each case, and so are the constraints placed on C++ code.

Garbage collection in C++ is clearly useful for particular applications. However, garbage collection as specified by the Standard is not useful for those applications.