Network Programming Models
Imagine that you want to write a network server. You have a thread which sits in accept() waiting for incoming connections. A very simple single-threaded server might wake up with a connection, read() the incoming request, write() a response to that request, then close() the socket and call accept() again.
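The loop described above can be sketched in Python, whose socket module wraps the same system calls (recv() and sendall() stand in for read() and write()). This is a minimal sketch, not a production server: the function name, the echo-style reply, and the max_requests parameter (added only so the loop terminates) are assumptions made for illustration.

```python
import socket


def serve_one_at_a_time(listener, max_requests):
    """Handle connections strictly one after another using blocking calls.

    `max_requests` is a hypothetical parameter so the sketch can terminate.
    """
    for _ in range(max_requests):
        conn, _addr = listener.accept()    # blocks until a client connects
        request = conn.recv(4096)          # blocks until the client sends data
        conn.sendall(b"echo: " + request)  # blocks until the reply is buffered
        conn.close()                       # then go back to accept()
```

While the server is inside recv() for one client, every other client simply waits in the listen queue.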
This approach has a major problem if you wish to service multiple requests concurrently. In particular, read() and write() are (by default) blocking functions: they do not return until data is available to read or socket buffer space is available to write into. close() can then block until the data is actually sent, depending on the socket's linger settings. This means that a single slow (or malicious!) client can block the server from answering requests by other clients.
One way to address this problem is to use a thread-per-connection or thread-per-request model. You have a single thread sitting in accept(), and then when a request comes in you start a new thread which handles the request. You can even have a pool of already started threads so you do not need to start and stop a new thread on every request.
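A thread-per-connection server might look roughly like the following. It is a sketch under assumptions: the names handle and serve_thread_per_connection, the echo reply, and the max_requests cutoff are all invented for the example, and it starts a fresh thread per connection rather than using a pool.

```python
import socket
import threading


def handle(conn):
    """Runs in its own thread, so blocking here stalls only this client."""
    with conn:
        request = conn.recv(4096)
        conn.sendall(b"echo: " + request)


def serve_thread_per_connection(listener, max_requests):
    """Accept in one thread; hand every connection to a new worker thread.

    `max_requests` is a hypothetical parameter so the sketch can terminate.
    """
    workers = []
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        worker = threading.Thread(target=handle, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()
```

A client that connects and then sends nothing now ties up only its own worker thread; the accepting thread goes straight back to accept().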
This model works fairly well and is entirely legitimate, especially when used with a threading library which scales to many threads. It may have problems scaling to tens of thousands of simultaneous connections, depending on per-thread overhead. In this case it can help to limit the number of total threads -- it probably does not help to have thousands of them.
It turns out these threads are not necessary -- it is possible to write such a server with a single thread handling an arbitrary number of concurrent connections (limited by CPU). In particular, we can tell the operating system to make reads and writes on a socket non-blocking, meaning that calls to functions like read() and write() return immediately. If, for example, you try to issue a read() and there is no data on the socket, you get an error instead of waiting for data to arrive.
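As an illustration of non-blocking behaviour, the sketch below uses Python's socket module. A read that would block raises BlockingIOError, which is how Python reports the EAGAIN / EWOULDBLOCK errno values. The helper name try_read and the use of socketpair() as a stand-in for a real client connection are assumptions made for the example.

```python
import socket


def try_read(sock, max_bytes=4096):
    """Return whatever bytes are available, or None if the read would block."""
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:  # errno EAGAIN / EWOULDBLOCK
        return None


a, b = socket.socketpair()  # two connected sockets, for demonstration only
a.setblocking(False)        # reads and writes on `a` now return immediately

first = try_read(a)         # nothing has been sent yet
b.sendall(b"ping")
second = try_read(a)        # the data is now waiting in the socket buffer
```

Here first is None because no data had arrived, while second holds the bytes sent by the peer.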
With this you can handle several sockets in a single thread, by “polling” them with read() and write() calls until they are ready. This is wasteful and does not scale very well, so there is a function, poll(), which lets you block (“select”) on any of a number of sockets becoming readable or writable. This can let a single thread handle thousands of connections at once; it can call poll(), deal with events, then call back into poll() to wait for more. This pattern is called an event loop. This is particularly useful for servers where the overhead of a thread per connection is significant, such as ones with very lightweight requests or with long-lived, relatively idle connections.
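The event loop pattern can be sketched with Python's select.poll, which wraps poll() on Unix-like systems. The function name, the echo reply, and the max_requests cutoff are assumptions for illustration. The sketch also assumes each request arrives in one piece and each reply fits in the socket buffer; a real server would track partial reads and writes per connection.

```python
import select
import socket


def poll_echo_server(listener, max_requests):
    """Single-threaded event loop: one poll() call waits on every socket.

    `max_requests` is a hypothetical parameter so the sketch can terminate.
    """
    listener.setblocking(False)
    poller = select.poll()
    poller.register(listener.fileno(), select.POLLIN)
    conns = {}  # file descriptor -> socket object
    served = 0
    while served < max_requests:
        for fd, _event in poller.poll():  # blocks until some socket is ready
            if fd == listener.fileno():
                conn, _addr = listener.accept()
                conn.setblocking(False)
                conns[conn.fileno()] = conn
                poller.register(conn.fileno(), select.POLLIN)
            else:
                conn = conns.pop(fd)
                request = conn.recv(4096)
                if request:
                    conn.sendall(b"echo: " + request)  # assumes a small reply
                poller.unregister(fd)
                conn.close()
                served += 1
    for conn in conns.values():  # drop connections that never sent anything
        conn.close()
```

An idle client costs this server one entry in the conns dictionary rather than a whole thread.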
One problem with poll(), a major issue when dealing with tens of thousands of mostly-idle connections, is that it takes the entire list of file descriptors on every call to poll(). If only a few sockets have events the cost of the huge poll() calls may be significant. Linux has an alternative API, epoll, which is similar to poll() but maintains the list of file descriptors in the kernel.
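The difference can be seen in a small sketch using Python's select.epoll, which is available only on Linux. The 100 socket pairs stand in for mostly-idle connections and are an assumption of the example, as is the choice of which peer sends data.

```python
import select
import socket

# Many mostly-idle connections, simulated here with connected socket pairs.
pairs = [socket.socketpair() for _ in range(100)]

ep = select.epoll()
for ours, _theirs in pairs:
    # Each descriptor is handed to the kernel once, not on every wait.
    ep.register(ours.fileno(), select.EPOLLIN)

# Only one of the hundred peers sends anything.
pairs[42][1].sendall(b"ping")

# The wait call passes no descriptor list; it reports only the ready sockets.
ready = ep.poll(1.0)
ready_fds = [fd for fd, _event in ready]
```

Although 100 sockets are registered, ready_fds contains only the descriptor of the one socket that has data waiting.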