logo

Programming Languages - Memory Management

Last Updated: 2024-01-21

Stack vs Heap:

  • data on the stack must have known, fixed size at compile time
  • data with unknown size at compile time or size that might change must be stored on the heap
  • operational-wise (allocation, access) heap is more expensive (slower)
  • heap also requires bookkeeping (what part of code uses what data, deduplicating data, cleaning up unused data), which ownership addresses

Manual Memory Management (Explicit Deallocation)

The programmer need to explictly free the memory. The problems:

  • deallocate memory prematurely which creates a dangling pointer. Dangling pointers are pointers that no longer point to valid objects in memory.
  • If the programmer forgets to free an object they may face a memory leak as memory fills up with more and more objects. This can lead to the program slowing down or crashing if it runs out of memory.

C / C++

C and C++ provides ways to manually allocate and deallocate heap memory.

  • C: Use functions malloc() / calloc() and free().
  • C++: Use keywords new and delete.

Read more: C / C++ - Memory Management

Automated Memory Management

Reference counting

Tracking the strong references count to an object held by other objects. As soon as the references counter goes to 0, the object will be reclaimed immediately. however if you have a cycle, reference count doesn’t reach zero.

Weak refernces: To resolve the strong reference cycle, you should use a weak or unowned reference. You just add a special keyword before a variable, and then when you assign an object to that variable, the object’s references counter is not bumped up.

Generational

The generational hypothesis assumes that short lived objects, like temporary variables, are reclaimed most often. Thus, a generational garbage collector focuses on recently allocated objects.

Memory Safety Without Garbage Collectors

C++: Smart Pointers

Read more: C++ - Smart Pointers

Rust: Ownership

Ownership is how Rust manages memory. It's a set of rules that the compiler checks at compile time which don't slow down the program while running.

Ownership Rules:

  • Each value in Rust has a variable that’s called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.

Example:

fn main() {
  let a = vec![1,2,3];
  let b = a;
  println!("a: {:?}", b, a);
}

The compiler throws an error because a has already been dropped in the third line.

In comparison, languages with garbage collectors would run through in the second case. The garbage collector would drop A only after the last time that it is called, which is nice for the developer but not so nice in terms of memory space.

Another example:

let s1 = String::from("hello");
let s2 = s1;
// s1 is not valid anymore

// Alternatively
let s1 = String::from("hello");
let s2 = s1.clone();
// both, `s1` and `s2` are valid

Swift

Swift compiles to (native) machine code by default. Swift has ARC (Automatic Reference Counting). Garbage collection process on Android works in the runtime of your app, whereas ARC is provided at compile time.

Using the weak keyword in iOS is something normal and can even be considered good practice when you widely use the Delegation pattern. When it comes to Android, it’s not a common practice.

Due to the retain cycles, iOS developers sometimes need to write more complex code for simple things than Android developers.

Another problem is the Lapsed Listener problem. In short, when you register a listener and forget to unregister that, as a consequence you end up with a memory leak in your app. https://en.wikipedia.org/wiki/Lapsed_listener_problem

Garbage Collectors

Garbage collectors operates on the heap, not the stack.

  • Moving GC: move the objects to a new part of memory and leave behind a forwarding pointer that is later used to update all the application's pointers.
  • Mark-sweep GC: two phases - mark and sweep.
    • mark phase: the collector traverses the heap and marks objects that are no longer needed.
    • sweep phase: removes these objects.

Java and JVM Languages

Multiple garbage collectors provided, primarily:

  • G1 Garbage Collector: default GC since Java 9
  • Z garbage collector
  • Shenandoah Garbage Collector

Read more: Java Garbage Collection

Python

Both reference counting and generational.

  • Reference counting for non-cycle cases: when the reference count of an object reaches 0, reference counting garbage collection algorithm cleans up the object immediately.
  • Generational garbage collection for cycles: a type of trace-based garbage collection. It can break cyclic references and delete the unused objects even if they are referred by themselves. Newly created objects are put in the Generation 0 list. A list is created for objects to discard. Reference cycles are detected. If an object has no outside references it is discarded. The objects who survived after this process are put in the Generation 1 list. The same steps are applied to the Generation 1 list. Survivals from the Generation 1 list are put in the Generation 2 list. The objects in the Generation 2 list stay there until the end of the program execution.

Go

There's new but there's no delete. Memory is garbage collected after it's no longer accessible.

Go prefers to allocate memory on the stack, so most memory allocations will end up there. This means that Go has a stack per goroutine and when possible Go will allocate variables to this stack.

Go’s garbage collector is a non-generational concurrent, tri-color mark and sweep garbage collector.

A generational garbage collector is not necessary in Go: Compiler optimisations allow the Go compiler to allocate objects with a known lifetime to the stack. This means fewer objects will be on the heap, so fewer objects will be garbage collected.

Concurrent means that the collector runs at the same time as mutator threads.

Using new doesn't imply using the heap:

// stack.go
func get() int {
    n := new(int)
    return *n
}

n is not on heap:

$ go tool compile -m stack.go

stack.go:3: can inline get
stack.go:4: get new(int) does not escape

And not all values in the heap are created with new:

// heap.go
func get() *int {
    n := 4
    return &n
}

n is moved to heap:

$ go tool compile -m heap.go

heap.go:3: can inline get
heap.go:4: moved to heap: n
heap.go:5: &n escapes to heap

C++23 remove garbage collection support

C++ itself does not have garbage collectors, but it provides APIs to build garbage collectors. Examples of virtual machines written in C++ with support for garbage collection:

  • WebKit’s JavaScriptCore use a garbage collector called Riptide.
  • Chromium’s Blink GC called Oilpan.
  • The V8 JavaScript engine used by Chromium also has its own garbage collector called Orinoco.
  • Firefox’s SpiderMonkey JavaScript engine
  • Lua and LuaJIT

C++23 decides to remove garbage collection support. Because each garbage collector has its own set of design criteria which influence how the language itself is implemented. These languages use similar ideas, but the design is different in each case, and the constraints on C++ code are different.

Garbage Collection in C++ is clearly useful for particular applications. However, Garbage Collection as specified by the Standard is not useful for those applications. "

C++

NO garbage collector.

// use stack
char a = 'a';
// no need to free a

// use heap
char *b = new char('b');
delete b;

Smart pointers:

// memory freed when ptr_a goes out of scope
unique_ptr<char> ptr_a(new char('a'));

// reference counted pointer
shared_ptr<char> ptr_b(new char('b'));

// pointer that does not increase reference counts
weak_ptr<char> ptr_c = ptr_b;

// memory freed when ptr_d goes out of scope
// differs from unique_ptr in copy semantics
auto_ptr<char> ptr_d(new char('d'));

Go

Go has two allocation primitives: new and make. Both are functions. Or use Composite Literals to create instances.

Composite Literals

To create a new instance:

// by field order.
f := File{fd, name, nil, 0}

// by field names; missing fiels will have zero values.
f := File{fd: fd, name: name}

new()

Unlike its namesakes in some other languages it does not initialize the memory, it only zeros it. new(T) returns a pointer (*T) to a newly allocated zero value of type T. (As Go doesn’t have constructors, the value will be initialised to T's zero value.)

The expressions new(Foo) and &Foo{} are equivalent.

make()

make() is used for slices, maps, or channels. make() allocates memory on the heap and initializes and puts zero or empty strings into values.

These three types represent references to data structures that must be initialized before use, e.g. for slice: a pointer to the underlying array, length, and capacity. ("initialized" means having meaning values, which may not be zero values.)

Unlike new(), make() returns the same type as its argument.

// make a map
a := make(map[int]string)

// make a slice
b := make([]string, 2, 10)

// make a channel
c := make(chan int, 1)
c <- 10
fmt.Println(<-c)

make() vs new()

make() is only used to create slices and maps and channels. The return of make() is initialized and ready to use. new() returns a pointer:

p := new(chan int)   // p has type: *chan int
c := make(chan int)  // c has type: chan int

For example:

  • v := make([]int, 10, 100): the slice v refers to a new array, with length 10 and capacity 100.
  • p := new([]int) returns a pointer to a newly allocated zeroed slice structure, *p == nil.

Rust

NO garbage collector.

// use stack
let a: char = 'a';

// use heap
let b = Box::new('b');

// b will be freed when it is out of scope
// or force free by mem::drop
mem::drop(b);

// reference counted pointer
let rc = Rc::new('c');

// atomically reference counted pointer
// can be safely used across threads
let arc = Arc::new('d');

Java

With garbage collection.

// use stack
char a = 'a';

// use heap
Character b = new Character('b');

// no way to free a or b, will be garbage collected

// reference that does not impact garbage collection
WeakReference<Character> weakRef = new WeakReference<>(ref);

// soft reference
SoftReference<Character> softRef = new SoftReference<>(ref);

All soft references to softly-reachable objects are guaranteed to be cleared before a JVM throws an OutOfMemoryError.

Weak Reference

Weak reference is a reference that does not protect the referenced object from collection by a garbage collector, the object will be garbage collected once they are not reachable elsewhere, even though this weak reference is still pointing to it.

Some garbage-collected languages support weak references, like Java (WeakReference class), C#, Python (weakref module), etc.