Concurrent

Updated: 2018-12-03

Concurrent vs Parallel

  • Parallel: two tasks literally run at the same time, e.g. on two cores
  • Concurrent: two tasks may run simultaneously, or alternately (interleaved on a single core)

2 Ways to Improve Concurrency

  • Async, non-blocking: Node.js/V8, Nginx
  • Multi-thread

We usually do not use locks or spawn new threads in languages like JavaScript or PHP: JavaScript executes in a single-threaded event loop, and PHP typically handles each request in its own single-threaded process (e.g. PHP-FPM behind Nginx), so the code we write is not multi-threaded.

On the other hand, we need to understand how multi-threading works in languages like Java or C++.

Actor Model

  • Actors have their own mailbox
  • you have to have a reference (Akka) or PID (Erlang) to the other actor in order to send it a message
  • fault tolerance
  • an actor can have mutable state inside it, with a guarantee of no multithreaded access to that state
  • sending is async (fire-and-forget); only the receiver can block
  • good for distributed systems

Examples:

  • Scala: Akka
  • Erlang
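
A rough sketch of the idea in Java (not Akka or Erlang; the CounterActor class and its tell method are invented for illustration): the actor owns a private mailbox, a single thread drains it, so the mutable state inside is never touched by two threads at once, and senders never block.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Minimal "actor": a private mailbox drained by one thread, so the mutable
    // state (count) is never accessed by two threads at the same time.
    class CounterActor {
        private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        private int count = 0;                     // mutable state, owned by the actor thread only

        CounterActor() {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String msg = mailbox.take();   // the receiver blocks when the mailbox is empty
                        if (msg.equals("inc")) count++;
                        if (msg.equals("print")) System.out.println("count = " + count);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        // "tell": sending is asynchronous, the sender never blocks here
        void tell(String msg) { mailbox.offer(msg); }
    }

    public class ActorDemo {
        public static void main(String[] args) throws InterruptedException {
            CounterActor actor = new CounterActor();
            actor.tell("inc");
            actor.tell("inc");
            actor.tell("print");               // eventually prints "count = 2"
            Thread.sleep(100);                 // give the actor thread time to drain the mailbox
        }
    }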

Communicating Sequential Processes (CSP)

  • The idea is that there can be two processes or threads that act independently of one another but share a "channel," which one process/thread puts data into and the other process/thread consumes.
  • the channel is shared, and can be shared by multiple producers and consumers
  • limited to the current runtime and cannot be distributed, even between two runtimes on the same physical box.
  • only buffered channels are async; unbuffered send/receive is synchronous and may block either side

Examples:

  • Go: Goroutines
  • Clojure: core.async
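
A hedged approximation in Java (the JDK has no real CSP channels; ChannelDemo is an invented name): a SynchronousQueue behaves like an unbuffered channel where sender and receiver block until they meet, while an ArrayBlockingQueue would approximate a buffered channel that is async only up to its capacity.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.SynchronousQueue;

    // CSP-style rendezvous: put() blocks until another thread is ready to take(),
    // and take() blocks until another thread puts, just like an unbuffered channel.
    public class ChannelDemo {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Integer> channel = new SynchronousQueue<>();

            Thread producer = new Thread(() -> {
                try {
                    for (int i = 0; i < 3; i++) {
                        channel.put(i);                      // blocks until a consumer is ready
                        System.out.println("sent " + i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            Thread consumer = new Thread(() -> {
                try {
                    for (int i = 0; i < 3; i++) {
                        System.out.println("received " + channel.take());   // blocks until a value is sent
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            producer.start();
            consumer.start();
            producer.join();
            consumer.join();
        }
    }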

coroutines vs goroutines

  • goroutines imply parallelism; coroutines in general do not
  • goroutines communicate via channels; coroutines communicate via yield and resume operations

Work-sharing vs Work-stealing

Source: https://rakyll.org/scheduler/

In multi-threaded computation, two paradigms have emerged in scheduling: work sharing and work stealing.

  • Work-sharing: When a processor generates new threads, it attempts to migrate some of them to other processors, in the hope that idle or underutilized processors will pick them up.
  • Work-stealing: An underutilized processor actively looks for other processors' threads and “steals” some.
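
Java's ForkJoinPool is one concrete work-stealing scheduler: each worker thread keeps its own deque of tasks, and idle workers steal from other workers' deques. A small sketch (the SumTask class and the 1,000-element threshold are just illustrative choices):

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Summing an array by splitting it into subtasks; forked subtasks go onto the
    // current worker's deque, from which idle workers can steal them.
    class SumTask extends RecursiveTask<Long> {
        private final long[] a;
        private final int lo, hi;

        SumTask(long[] a, int lo, int hi) { this.a = a; this.lo = lo; this.hi = hi; }

        @Override
        protected Long compute() {
            if (hi - lo <= 1_000) {                // small enough: just compute directly
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += a[i];
                return sum;
            }
            int mid = (lo + hi) / 2;
            SumTask left = new SumTask(a, lo, mid);
            left.fork();                           // pushed onto this worker's deque (stealable)
            SumTask right = new SumTask(a, mid, hi);
            return right.compute() + left.join();
        }
    }

    public class WorkStealingDemo {
        public static void main(String[] args) {
            long[] a = new long[1_000_000];
            for (int i = 0; i < a.length; i++) a[i] = i;
            long sum = ForkJoinPool.commonPool().invoke(new SumTask(a, 0, a.length));
            System.out.println(sum);               // 499999500000
        }
    }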

IPC

IPC: Inter-Process Communication. Most operating systems support IPC resources such as pipes and sockets. IPC is used not just for communication between processes on the same system, but also between processes on different systems.

Processes and Threads

  • Processes: self-contained execution environment; Inter Process Communication (IPC) by pipes and sockets

  • Threads: share the process's resources, including memory and open files; problems to watch out for: thread interference and memory consistency errors

  • Concurrency is about dealing with lots of things at once.

    • time sharing, maybe only one thread, not parallel
  • Parallelism is about doing lots of things at once.

fork() creates a new process with its own address space.

IPC:

  • Local
  • RPC: remote procedure call

Common problems: thread interference, memory consistency errors, race conditions, deadlock, livelock, and starvation.

Monitor

Monitors provide a mechanism for threads to temporarily give up exclusive access in order to wait for some condition to be met, before regaining exclusive access and resuming their task.

  • Monitor = a mutex (lock) + condition variables

    • Condition Variable: a container of threads that are waiting on a certain condition.

In multi-threaded programs, it is often useful for a thread to wait for some condition to become true before proceeding.

Solution:

  • spinning until the condition becomes true (inefficient)
  • by condition variable: a condition variable is an explicit queue that threads can put themselves on when waiting for a condition (see the sketch below)
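
In Java, for example, every object carries an intrinsic monitor: synchronized supplies the mutex and wait()/notifyAll() supply the condition variable. A minimal sketch (the OneSlotBuffer class is invented for illustration); note that wait() releases the lock while the thread sleeps and re-acquires it before returning, which is why the condition is re-checked in a while loop.

    // A one-slot producer/consumer buffer built on an intrinsic monitor.
    class OneSlotBuffer {
        private Integer slot = null;       // shared state protected by the monitor

        public synchronized void put(int value) throws InterruptedException {
            while (slot != null) {
                wait();                    // give up the lock and sleep until notified
            }
            slot = value;
            notifyAll();                   // wake threads waiting for the slot to fill
        }

        public synchronized int take() throws InterruptedException {
            while (slot == null) {
                wait();                    // give up the lock and sleep until notified
            }
            int value = slot;
            slot = null;
            notifyAll();                   // wake threads waiting for the slot to empty
            return value;
        }
    }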

critical section

if one thread is in the critical section, it excludes the others from entering until it has completed the section.

Lock

implementations (need hardware support):

  • disable interrupts (does not work on multiprocessors)
  • test-and-set instruction (a.k.a. atomic exchange)
  • compare-and-swap (compare-and-exchange) instruction
  • fetch-and-add instruction

instead of spinning (see the sketch after this list):

  • yield
  • sleep
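
As a toy sketch only (real code should use java.util.concurrent.locks), a test-and-set style spinlock can be built in Java on AtomicBoolean.compareAndSet, which maps down to a hardware compare-and-swap; here it yields instead of spinning hot, and the SpinLock class name is made up.

    import java.util.concurrent.atomic.AtomicBoolean;

    // Spinlock: atomically flip false -> true to acquire, set back to false to release.
    class SpinLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void lock() {
            // test-and-set loop: keep trying until the CAS succeeds
            while (!held.compareAndSet(false, true)) {
                Thread.yield();            // give up the CPU instead of burning it
            }
        }

        void unlock() {
            held.set(false);
        }
    }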

semaphores

Can be used as both locks and condition variables.

binary semaphore = lock
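
In Java, for instance, a java.util.concurrent.Semaphore with a single permit acts as a mutual-exclusion lock; a small sketch (the demo class name is invented):

    import java.util.concurrent.Semaphore;

    // Binary semaphore (1 permit) used as a lock: acquire() = lock(), release() = unlock().
    public class BinarySemaphoreDemo {
        private static final Semaphore mutex = new Semaphore(1);
        private static int counter = 0;

        public static void main(String[] args) throws InterruptedException {
            Runnable work = () -> {
                for (int i = 0; i < 10_000; i++) {
                    try {
                        mutex.acquire();           // blocks while another thread holds the permit
                        try {
                            counter++;             // critical section
                        } finally {
                            mutex.release();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            };
            Thread t1 = new Thread(work), t2 = new Thread(work);
            t1.start(); t2.start();
            t1.join();  t2.join();
            System.out.println(counter);           // 20000, no lost updates
        }
    }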

Lock vs synchronized

The biggest advantage of Lock objects over implicit locks is their ability to back out of an attempt to acquire a lock.

A thread that’s blocked on an intrinsic lock is not interruptible: there is no way to interrupt a thread that’s blocked as a result of trying to acquire an intrinsic lock.

There is no way to time out while trying to acquire an intrinsic lock.

There’s exactly one way to acquire an intrinsic lock: a synchronized block.

A deadlock involving intrinsic locks can only be resolved by killing the JVM.
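
A sketch of the "back out" ability using java.util.concurrent.locks.ReentrantLock (the BackOutDemo class name and the timings are just for illustration): tryLock with a timeout gives up instead of blocking forever, and lockInterruptibly (not shown) would make the wait interruptible; neither is possible with a synchronized block.

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.ReentrantLock;

    public class BackOutDemo {
        private static final ReentrantLock lock = new ReentrantLock();

        public static void main(String[] args) throws InterruptedException {
            Thread holder = new Thread(() -> {
                lock.lock();                       // holds the lock for two seconds
                try {
                    Thread.sleep(2000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.unlock();
                }
            });
            holder.start();
            Thread.sleep(100);                     // let the holder grab the lock first

            // Back out: wait at most one second instead of blocking forever.
            if (lock.tryLock(1, TimeUnit.SECONDS)) {
                try {
                    System.out.println("got the lock");
                } finally {
                    lock.unlock();
                }
            } else {
                System.out.println("gave up after 1s");   // this branch runs
            }
            holder.join();
        }
    }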

Reentrant Lock

A reentrant mutex (recursive mutex, recursive lock) is a particular type of mutual exclusion (mutex) device that may be locked multiple times by the same process/thread without causing a deadlock.

Instead of lock(), one can use tryLock(), which times out if it fails to acquire the lock (as in the back-out sketch above).

  • Deadlock: tryLock doesn’t avoid deadlock; it simply provides a way to recover when it happens.
  • Livelock: if all the threads time out at the same time, it’s quite possible for them to immediately deadlock again.
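
A small sketch of reentrancy with Java's ReentrantLock (the ReentrancyDemo class and outer/inner method names are made up): the same thread acquires the lock a second time without deadlocking, and the lock is only fully released once every lock() has a matching unlock().

    import java.util.concurrent.locks.ReentrantLock;

    public class ReentrancyDemo {
        private static final ReentrantLock lock = new ReentrantLock();

        static void outer() {
            lock.lock();
            try {
                inner();                           // re-acquires the same lock: no deadlock
            } finally {
                lock.unlock();
            }
        }

        static void inner() {
            lock.lock();                           // hold count goes from 1 to 2
            try {
                System.out.println("hold count = " + lock.getHoldCount());   // prints 2
            } finally {
                lock.unlock();
            }
        }

        public static void main(String[] args) {
            outer();
        }
    }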

Deadlock vs Starvation vs Livelock

  • deadlock: the dining philosophers; two friends bowing, each waiting for the other to finish bowing first
  • starvation: a thread is perpetually denied access to a shared resource, e.g. because greedy threads keep re-acquiring the lock
  • livelock: two people attempting to pass each other in a corridor

Imagine a little dog snuck under Big Dog's cone of shame and covered the food with its own cone of shame, and it won't leave. That's deadlock. Imagine a stream of little dogs sneaking under Big Dog's cone so Big Dog never gets a bite. That's livelock.
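
A minimal two-philosopher deadlock in Java (the forkA/forkB names are illustrative): each thread grabs one lock and then waits forever for the one the other thread holds, so the program never finishes. A standard fix is to always acquire the locks in the same global order.

    // Two threads acquire the same two locks in opposite order and deadlock.
    public class DeadlockDemo {
        private static final Object forkA = new Object();
        private static final Object forkB = new Object();

        public static void main(String[] args) {
            new Thread(() -> {
                synchronized (forkA) {
                    sleep(100);                    // make the bad interleaving likely
                    synchronized (forkB) {         // waits for the other thread, which waits for forkA
                        System.out.println("philosopher 1 ate");
                    }
                }
            }).start();

            new Thread(() -> {
                synchronized (forkB) {
                    sleep(100);
                    synchronized (forkA) {
                        System.out.println("philosopher 2 ate");
                    }
                }
            }).start();
        }

        private static void sleep(long ms) {
            try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
    }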

Green Threads

green threads are threads that are scheduled by a runtime library or virtual machine (VM) instead of natively by the underlying operating system. Green threads emulate multithreaded environments without relying on any native OS capabilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support.

GIL

GIL: within a single process, only one thread can execute interpreter bytecode at a time.

The GIL ensures, among other things, that memory access to every object used by CPython is kept “thread-safe,” meaning that a Python object can be used by only one thread at a time. It also means that CPython is effectively single-threaded for CPU-bound Python code. Multithreaded, CPU-bound operations either need to be handled by C extensions (which can release the GIL) or by multiple instances of CPython.

Applications running on implementations with a GIL can be designed to use separate processes to achieve full parallelism, since each process has its own interpreter and its own GIL. The GIL also simplifies the interpreter itself: there is no need to acquire or release locks on every data structure separately.

Some language implementations that implement a global interpreter lock are CPython, the most widely-used implementation of Python, and Ruby MRI, the reference implementation of Ruby (where it is called Global VM Lock). JVM-based equivalents of these languages (Jython and JRuby) do not use global interpreter locks. IronPython and IronRuby are implemented on top of Microsoft's Dynamic Language Runtime and also avoid using a GIL.