Think of a "container" as just another packaging format, like .iso files for disk images, .rpm for Linux packages, or .tgz for binary or arbitrary files.
The ecosystem is more than just a format; it includes:
There are 7 namespaces in Linux:
- UTS: UNIX Time-sharing System, named after the data structure used to store the information returned by the uname system call. Isolates the hostname and NIS domain name.
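The UTS data mentioned above can be inspected from user space. The following is a minimal sketch, assuming a Linux host with a mounted /proc filesystem; the helper name `uts_namespace_id` is made up for illustration.

```python
import os

# os.uname() wraps the uname system call; its fields mirror struct utsname
# (sysname, nodename, release, version, machine).
info = os.uname()


def uts_namespace_id(pid="self"):
    """Return the UTS namespace link of a process, or None if unavailable.

    Assumption: Linux exposes namespaces as symlinks under /proc/<pid>/ns/.
    On other systems the link does not exist and None is returned.
    """
    try:
        # The link target has the form "uts:[<inode number>]".
        return os.readlink(f"/proc/{pid}/ns/uts")
    except OSError:
        return None


print("nodename (hostname):", info.nodename)
print("UTS namespace:", uts_namespace_id())
```

Two processes that print the same namespace link share a hostname; a process inside a container with its own UTS namespace would normally print a different one.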
VM: runs on top of a hypervisor, and each VM has its own guest OS
         VM1        VM2
    |----------|----------|
    |   App    |   App    |
    |==========|==========| => System Calls
    |  Guest   |  Guest   |
    |  Kernel  |  Kernel  |
    |----------|----------|
    | Virtual  | Virtual  |
    | Hardware | Hardware |
    |----------|----------|
    |   Hypervisor(VMM)   |
    |=====================| => System Calls
    |     Host Kernel     |
    |---------------------|
    |    Host Hardware    |
    |---------------------|
(Traditional) Container: Operating system level virtualization. The kernel imposes limits on resources, implemented through use of
    |----------|----------|
    |   App    |   App    |
    |----------|----------|
    |   Container Layer   |
    |---------------------|
    |     Host Kernel     |
    |---------------------|
    |    Host Hardware    |
    |---------------------|
Sandboxed Container (e.g. gVisor): provides a user-space kernel

    |----------|----------|
    |   App    |   App    |
    |==========|==========| => System Calls
    |       gVisor        |
    |=====================| => Limited System Calls
    |     Host Kernel     |
    |---------------------|
    |    Host Hardware    |
    |---------------------|
Defines 2 important specs, so that images can be packed and unpacked by different tools and run by different runtimes:
- Docker: an open source Linux containerization technology. Package, distribute and runtime solution.
- cgroup: limits and isolates resources (CPU, memory, disk I/O, network, etc.)
- containerd: Container Runtime
- rkt: Container Runtime
- gVisor: a user-space kernel for containers. It limits the host kernel surface accessible to the application while still giving the application access to all the features it expects. It leverages existing host kernel functionality and runs as a normal user-space process. For running untrusted workloads. Lower memory and startup overhead compared to a full VM.
Docker's default runtime: runC
$ docker run --runtime=runc ...
gVisor can be integrated with Docker by switching the runtime to runsc ("run sandboxed container"):
$ docker run --runtime=runsc ...
gVisor runs slower than the default Docker runtime because of the sandboxing: https://github.com/google/gvisor/issues/102
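Before `--runtime=runsc` is accepted, the Docker daemon has to know where the runsc binary lives, which is usually done with a `runtimes` entry in the daemon configuration file. The sketch below only builds that configuration as a dictionary; the path `/usr/local/bin/runsc` and the helper name `with_runsc_runtime` are assumptions for illustration, so adjust them to the actual install location.

```python
import json


def with_runsc_runtime(daemon_config, runsc_path="/usr/local/bin/runsc"):
    """Return a copy of a daemon.json-style dict with a runsc runtime added.

    Assumption: runsc_path points at the installed gVisor binary.
    The input dictionary is not modified.
    """
    config = dict(daemon_config)
    runtimes = dict(config.get("runtimes", {}))
    runtimes["runsc"] = {"path": runsc_path}
    config["runtimes"] = runtimes
    return config


# Print the resulting configuration so it can be compared with daemon.json.
print(json.dumps(with_runsc_runtime({}), indent=4))
```

The daemon typically needs a restart after the configuration file changes.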
- Linux Containers (LXC): built on top of cgroups. An operating system–level virtualization technology for running multiple isolated Linux systems (containers) on a single control host (CoreOS instance).
- cgroups: provide the ability to limit, account for, and isolate the resource usage (CPU, memory, disk I/O, etc.) of process groups. Namespace isolation is a separate kernel feature (namespaces) that LXC uses alongside cgroups.
- LXD: similar to LXC, but a REST API on top of
- Docker: application container; LXC/LXD: system container
- Docker initially used liblxc but later changed to
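To make the cgroups entry in the list above more concrete, the sketch below reads the cgroup membership of a process. It assumes a Linux host where /proc/<pid>/cgroup exists, with each line in the form `hierarchy-ID:controller-list:path`; the function names are made up for illustration.

```python
from pathlib import Path


def parse_cgroup_lines(text):
    """Parse /proc/<pid>/cgroup content into (hierarchy_id, controllers, path)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only; the path may itself contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        controller_list = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), controller_list, path))
    return entries


def current_cgroups(pid="self"):
    """Return parsed cgroup entries for a process, or [] if /proc is missing."""
    proc_file = Path(f"/proc/{pid}/cgroup")
    if not proc_file.exists():
        return []
    return parse_cgroup_lines(proc_file.read_text())


for entry in current_cgroups():
    print(entry)
```

On a cgroup v2 host there is usually a single entry with hierarchy ID 0 and an empty controller list; inside a container the path generally differs from the one seen on the host.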
Well, it is gaining momentum and popularity, and many companies are adopting it.
Two notable exceptions are: Google and Facebook
Google has its own packaging format: MPM. MPM on Borg is similar to a container on Kubernetes, and Kubernetes is an open-source system inspired by Borg.
Facebook uses Tupperware. Why not Docker or CoreOS? They didn't exist then.
- tupperware vs docker/kubernetes: https://www.theregister.co.uk/2017/10/23/facebookhatesdockerandkubernetes/
- 2 most important APIs: Images and Container APIs
- Manager quorum: Raft: exchange information with strong consistency
- Worker: Gossip: share information in bulk, converge fast
- Between manager and worker: gRPC (on top of HTTP/2, versioned)
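Because the managers rely on Raft, a majority of them must stay reachable for the cluster state to change. The arithmetic can be sketched as follows; the function names are made up for illustration, and the formula is the standard Raft majority rule rather than anything specific to this document.

```python
def raft_quorum(managers):
    """Smallest number of managers that forms a majority."""
    if managers < 1:
        raise ValueError("need at least one manager")
    return managers // 2 + 1


def tolerated_failures(managers):
    """Number of managers that can be lost while a majority remains."""
    return managers - raft_quorum(managers)


# Print quorum size and failure tolerance for common manager counts.
for n in (1, 3, 5, 7):
    print(f"managers={n} quorum={raft_quorum(n)} tolerated={tolerated_failures(n)}")
```

An even manager count adds no fault tolerance over the odd count just below it (4 managers tolerate one failure, the same as 3), which is why odd numbers of managers are the usual recommendation.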
The Docker for Mac application does not use docker-machine to provision its VM, but rather creates and manages it directly.
Compose is a tool for defining and running multi-container Docker applications.
Docker Machine is a tool for provisioning and managing your Dockerized hosts (hosts with Docker Engine on them). Typically, you install Docker Machine on your local system. Docker Machine has its own command-line client, docker-machine, and the Docker Engine client, docker.
Docker Engine is the client-server application made up of the Docker daemon, a REST API that specifies interfaces for interacting with the daemon, and a command-line interface (CLI) client that talks to the daemon (through the REST API wrapper). Docker Engine accepts docker commands from the CLI, such as docker run <image>, docker ps to list running containers, docker images to list images, and so on.
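Since the CLI is only a wrapper around the REST API, the same information can be requested from the daemon directly. The sketch below is a minimal client using only the standard library; it assumes the daemon listens on the Unix socket `/var/run/docker.sock` (the usual Linux default) and that the caller has permission to read it. The class and function names are made up for illustration.

```python
import http.client
import json
import socket

DOCKER_SOCKET = "/var/run/docker.sock"  # assumption: default socket location

# Engine API endpoints behind two of the CLI commands mentioned above.
CLI_TO_ENDPOINT = {
    "docker ps": "/containers/json",
    "docker images": "/images/json",
}


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def engine_get(path, socket_path=DOCKER_SOCKET):
    """Send a GET request to the Docker daemon and decode the JSON reply."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

With a running daemon, `engine_get(CLI_TO_ENDPOINT["docker ps"])` should return a list of running containers, roughly the data that `docker ps` formats as a table.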
Unlike traditional virtualization, containerization takes place at the kernel level. Most modern operating system kernels now support the primitives necessary for containerization, including Linux with OpenVZ, VServer and, more recently, LXC; Solaris with Zones; and FreeBSD with jails.
Because Docker operates at the OS level, it can still be run inside a VM!
Both the CMD and ENTRYPOINT instructions define what command gets executed when running a container. There are a few rules that describe how they cooperate.
A Dockerfile should specify at least one of the CMD or ENTRYPOINT instructions. ENTRYPOINT should be defined when using the container as an executable. CMD should be used as a way of defining default arguments for an ENTRYPOINT command or for executing an ad-hoc command in a container. CMD will be overridden when running the container with alternative arguments.
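The rules above can be modelled with a small function. This is only a sketch of the exec (JSON array) form of both instructions; the shell form behaves differently and is not covered here. The function name `effective_command` and the example values are made up for illustration.

```python
def effective_command(entrypoint=None, cmd=None, run_args=None):
    """Return the argv a container would start with (exec form only).

    entrypoint: list from the ENTRYPOINT instruction, or None
    cmd:        list from the CMD instruction, or None
    run_args:   arguments given after the image name in `docker run`, or None
    """
    base = list(entrypoint or [])
    # Arguments passed to `docker run` replace CMD; otherwise CMD is the default.
    args = list(run_args) if run_args else list(cmd or [])
    command = base + args
    if not command:
        raise ValueError("image defines neither ENTRYPOINT nor CMD")
    return command


# ENTRYPOINT ["ping"] with CMD ["localhost"]: CMD supplies default arguments.
print(effective_command(["ping"], ["localhost"]))
# The same image run with an extra argument: CMD is overridden.
print(effective_command(["ping"], ["localhost"], ["example.com"]))
```

Modelled this way, ENTRYPOINT fixes the executable while CMD only provides defaults, which matches the "container as an executable" use described above.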