Kubernetes - Networking
- "IP-per-pod" model: each
Podhas a unique IP; a
Podis equivalent to a VM or a host (which have unique IPs). Pods can be load-balanced.
- Containers within a Pod share a network namespace and communicate via loopback: container to container, use `localhost:<port>`.
- Pods can communicate with all other pods on any other node without NAT.
- Agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node.
- Isolation (restricting what each pod can communicate with) is defined using NetworkPolicy. Network policies are implemented by the network (CNI) plugin.
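A minimal sketch of such a policy (all names and labels here are placeholders): only pods labeled `app=frontend` may reach pods labeled `app=backend` on TCP 8080; all other ingress to the backend pods is denied.

```yaml
# Hypothetical example: deny all ingress to app=backend pods
# except TCP 8080 from app=frontend pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Note that this only takes effect if the installed CNI plugin enforces NetworkPolicy; with a plugin that does not, the object is accepted but ignored.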
- Upper networking: Kubernetes-defined networking (Services, Ingress) layered on top of pod networking.
- Pod-to-internet (egress) traffic needs to go through a NAT gateway.
There are two components to networking.
- Kubernetes cluster networking, i.e., pod-to-pod connectivity, or "data plane". Can be provided by bundling Cilium (an eBPF-based programmable dataplane); run Cilium in overlay mode if you do not have L2 connectivity between all nodes or an SDN.
- L4 load balancing: can be provided by bundling MetalLB.
Kubernetes Engine clusters are provisioned with an IP range that is automatically determined during cluster creation, in an existing project subnet.
Since Kubernetes 1.24, management of the CNI is no longer in scope for kubelet; CNI plugins are managed by the container runtime (e.g. containerd).
CNI is used by container runtimes: a container/pod initially has no network interface; the CNI plugin sets one up so pods can accept traffic directly, which keeps network latency as low as possible.
- When the container runtime needs to perform network operations on a container, it calls the CNI plugin with the desired command.
- The container runtime also provides related network configuration and container-specific data to the plugin.
- The CNI plugin performs the required operations and reports the result.
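For illustration, a minimal CNI network configuration the runtime might read (names, subnet, and plugin choice here are placeholders; `bridge` and `host-local` are the CNI reference plugins):

```json
{
  "cniVersion": "1.0.0",
  "name": "mynet",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.0.0/24"
  }
}
```

The runtime typically loads such files from `/etc/cni/net.d/` and invokes the named plugin binary with commands like ADD and DEL, passing container-specific data via environment variables and stdin.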
- Flannel and Weave Net: easy setup and configuration.
- Calico: better for performance since it uses an underlay network through BGP.
- Cilium: uses a completely different application-layer filtering model through eBPF and is more geared towards enterprise security. Can replace kube-proxy (which is iptables-based); eBPF performs better than iptables.
GKE Dataplane V2 migrated from Calico (Calico CNI and Calico network policies rely heavily on iptables in the Linux kernel; iptables provides a flexible but not programmable datapath for Kubernetes networking functions) to a programmable datapath based on eBPF/Cilium.
anetd is the networking controller, which replaces kube-proxy and the Calico components (calico-node).
anetd/cilium holds metadata (network policies, configured Kubernetes Services and their endpoints) and tracks metrics (conntrack entries, dropped/forwarded traffic) related to all networking on the nodes.
Every Service and Pod defined in the cluster (including the DNS server itself) is assigned a DNS name. You can contact Services with consistent DNS names instead of IP addresses.
As of kubeadm v1.24, the only supported cluster DNS application is CoreDNS. (Support for kube-dns was removed.)
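The record formats, assuming the default `cluster.local` cluster domain (the names `my-svc` and `my-ns` are placeholders):

```
my-svc.my-ns.svc.cluster.local        # Service A/AAAA record
10-244-1-5.my-ns.pod.cluster.local    # Pod record (IP with dashes)
```

Within the same namespace, just `my-svc` resolves; across namespaces, `my-svc.my-ns` is enough.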
- L4 Load Balancer: should support TCP/UDP load balancing and high availability. It should work in a world where nodes are in different L2 subnets. E.g.
- MetalLB for workloads.
- keepalived + haproxy for control plane nodes.
- L7 Load Balancer: e.g. via Istio ingress gateway
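For the L4 case above, a minimal MetalLB layer-2 configuration might look like this (pool name, namespace, and address range are placeholders):

```yaml
# Hypothetical MetalLB L2 setup: hand out LoadBalancer IPs from this range
# and answer ARP/NDP for them from one node at a time.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```

With this in place, creating a Service of `type: LoadBalancer` gets it an external IP from the pool.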
Get metrics from the network switches: use snmp-exporter to expose the SNMP data to Prometheus. Note that SNMP is a known resource hog.
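A sketch of the Prometheus scrape job for this setup, following the snmp_exporter pattern (the switch address and exporter hostname are placeholders):

```yaml
# Hypothetical scrape config: Prometheus asks the snmp_exporter
# (listening on :9116) to walk each switch listed as a target.
scrape_configs:
  - job_name: snmp
    static_configs:
      - targets:
          - 192.0.2.10        # switch to poll (placeholder address)
    metrics_path: /snmp
    params:
      module: [if_mib]        # interface metrics module
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target   # pass the switch as ?target=
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: snmp-exporter:9116  # actually scrape the exporter
```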
The Gateway API (informally, "Ingress API version 2") is a SIG-Network project being built to improve and standardize service networking in Kubernetes.
```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
```
Gateway describes a load balancer operating at the edge of the mesh receiving incoming or outgoing HTTP/TCP connections.
Defines exposed ports and protocols
Gateway is backed by a Service of type LoadBalancer (or NodePort).
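A minimal sketch of such a Gateway (hostname and name are placeholders; `istio: ingressgateway` is the label of the default Istio ingress gateway deployment):

```yaml
# Hypothetical Gateway: accept plain HTTP on port 80 for example.com
# at the default Istio ingress gateway.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: web-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "example.com"
```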
Configures the L7 load balancing / reverse proxy and splits traffic, e.g. if the URI prefix is `/api/`, go to service 1; if `/`, go to service 2.
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
```
VirtualService can be bound to a gateway to control the forwarding of traffic arriving at a particular host or gateway port.
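A sketch of the URI-prefix split described above, bound to a hypothetical Gateway named `web-gateway` (hostnames, gateway name, and service names are placeholders):

```yaml
# Hypothetical VirtualService: /api/ goes to service-1,
# everything else falls through to service-2.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web-routes
spec:
  hosts:
    - "example.com"
  gateways:
    - web-gateway
  http:
    - match:
        - uri:
            prefix: /api/
      route:
        - destination:
            host: service-1
    - route:                  # no match clause: catch-all
        - destination:
            host: service-2
```

Rules are evaluated in order, so the catch-all route must come last.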