Network Programming Models
Basic Approach
Imagine that you want to write a network server. You have a thread which sits in accept() listening for incoming connections. A very simple single-threaded server might wake up with a connection, read() the incoming request, write() a response to that request, then close() the socket and call accept() again.
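As a sketch, that loop might look like the following C fragment, assuming listen_fd is a TCP socket that has already been bound and marked listening; error handling is mostly omitted for brevity.

    /* Single-threaded blocking server loop: one client at a time. */
    #include <unistd.h>
    #include <sys/socket.h>

    void serve(int listen_fd) {
        char buf[4096];
        for (;;) {
            int conn_fd = accept(listen_fd, NULL, NULL); /* block for a client */
            if (conn_fd < 0)
                continue;
            ssize_t n = read(conn_fd, buf, sizeof buf);  /* block for the request */
            if (n > 0)
                write(conn_fd, buf, (size_t)n);          /* block writing a response */
            close(conn_fd);                              /* then back to accept() */
        }
    }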
This approach has a major problem if you wish to service multiple requests concurrently. In particular, read() and write() are (by default) blocking functions: read() does not return until data is available, and write() does not return until there is room in the socket buffer. close() can then block until buffered data is actually sent. This means that a single slow (or malicious!) client can block the server from answering requests from other clients.
Thread-per-connection
One way to address this problem is to use a thread-per-connection or thread-per-request model. You have a single thread sitting in accept(), and then when a request comes in you start a new thread which handles the request. You can even have a pool of already-started threads so you do not need to start and stop a new thread on every request.
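A minimal sketch of this model in C using POSIX threads; handle_request is a hypothetical function containing the read()/write() logic from the basic approach, and each thread is detached so it cleans up after itself.

    /* Thread-per-connection: accept in one thread, handle in another. */
    #include <pthread.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <sys/socket.h>

    void handle_request(int conn_fd);  /* hypothetical: read(), write(), as before */

    static void *worker(void *arg) {
        int conn_fd = (int)(intptr_t)arg;
        handle_request(conn_fd);
        close(conn_fd);
        return NULL;
    }

    void serve(int listen_fd) {
        for (;;) {
            int conn_fd = accept(listen_fd, NULL, NULL);
            if (conn_fd < 0)
                continue;
            pthread_t tid;
            pthread_create(&tid, NULL, worker, (void *)(intptr_t)conn_fd);
            pthread_detach(tid);  /* reclaim the thread when it exits */
        }
    }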
This model works fairly well and is entirely legitimate, especially when used with a threading library which scales to many threads. It may have problems scaling to tens of thousands of simultaneous connections, depending on per-thread overhead. In this case it can help to limit the number of total threads -- it probably does not help to have thousands of them.
Event model
It turns out these threads are not necessary -- it is possible to write such a server with a single thread handling an arbitrary number of concurrent connections (limited by CPU). In particular, we can tell the operating system to make reads and writes on a socket non-blocking, meaning that calls to functions like read() and write() return immediately. If, for example, you try to issue a read() and there is no data on the socket, the call fails with the error EAGAIN.
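For example, a socket can be switched to non-blocking mode with fcntl() roughly as follows; a read() that would otherwise block then fails immediately with errno set to EAGAIN (or the equivalent EWOULDBLOCK).

    /* Put a file descriptor into non-blocking mode. */
    #include <fcntl.h>
    #include <errno.h>
    #include <unistd.h>

    int set_nonblocking(int fd) {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0)
            return -1;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    }

    /* Usage:
     *   ssize_t n = read(conn_fd, buf, sizeof buf);
     *   if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
     *       ... no data yet; try again later ...               */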
With this you can handle several sockets in a single thread, by “polling” them with read() or write() calls until they are ready. This is wasteful and does not scale very well, so there is a function, poll(), which lets you block (“select”) on any of a number of sockets becoming readable or writable. This can let a single thread handle thousands of connections at once; it can call poll(), deal with events, then call back into poll() to wait for more. This pattern is called an event loop. This is particularly useful for servers where the overhead of a thread per connection is significant, such as ones with very lightweight requests or with long-lived, relatively idle connections.
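A bare-bones sketch of such an event loop; handle_readable is a hypothetical callback, and bookkeeping such as removing closed connections or checking that the fds array has room is elided.

    /* Event loop: one thread blocks in poll() for all connections. */
    #include <poll.h>
    #include <sys/socket.h>

    void handle_readable(int fd);  /* hypothetical: non-blocking read()/write() */

    void event_loop(int listen_fd, struct pollfd *fds, nfds_t *nfds) {
        fds[0].fd = listen_fd;
        fds[0].events = POLLIN;
        *nfds = 1;
        for (;;) {
            if (poll(fds, *nfds, -1) < 0)  /* block until something is ready */
                continue;
            nfds_t n = *nfds;              /* snapshot; we may append below */
            for (nfds_t i = 0; i < n; i++) {
                if (!(fds[i].revents & POLLIN))
                    continue;
                if (fds[i].fd == listen_fd) {
                    int conn = accept(listen_fd, NULL, NULL);
                    if (conn >= 0) {       /* start watching the new connection */
                        fds[*nfds].fd = conn;
                        fds[*nfds].events = POLLIN;
                        (*nfds)++;
                    }
                } else {
                    handle_readable(fds[i].fd);
                }
            }
        }
    }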
Aside: epoll
One problem with poll(), a major issue when dealing with tens of thousands of mostly-idle connections, is that it takes the entire list of file descriptors on every call. If only a few sockets have events, the cost of the huge poll() calls may be significant. Linux has an alternative API, epoll, which is similar to poll() but maintains the list of file descriptors in the kernel.
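The same loop sketched with epoll: the interest list is registered once with epoll_ctl(), and each epoll_wait() call returns only the descriptors that actually have events, so the per-call cost no longer grows with the number of idle connections. As before, handle_readable is a hypothetical callback and error handling is elided.

    /* Event loop with epoll: the kernel keeps the watched-fd list. */
    #include <sys/epoll.h>
    #include <sys/socket.h>

    void handle_readable(int fd);  /* hypothetical, as in the poll() example */

    void event_loop(int listen_fd) {
        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);  /* register once */
        for (;;) {
            struct epoll_event events[64];
            int n = epoll_wait(ep, events, 64, -1);    /* only ready fds return */
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {
                    int conn = accept(listen_fd, NULL, NULL);
                    if (conn >= 0) {
                        struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                        epoll_ctl(ep, EPOLL_CTL_ADD, conn, &cev);
                    }
                } else {
                    handle_readable(fd);
                }
            }
        }
    }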