Distributed System Design Interview Questions
These are the common types of design questions, though a single question may cover more than one of them.
Design the objects and their interactions. The most common examples are "design an elevator" and "design a garage".
This is about how data is stored and retrieved. You can choose a SQL or NoSQL database, then design the schema, decide how to set up indexes, and show how to query the data (join, filter, aggregate, etc.). Expect the requirements to change midway; this tests the flexibility of your schema.
An example is designing a book-sharing system in which users can lend books to others or borrow from them.
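As a rough illustration of the book-sharing example, here is a minimal relational sketch using Python's built-in `sqlite3` module. The table names, column names, and the `open_loans` helper are all assumptions made for this example; a real interview answer would depend on the stated requirements.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE books (
    id       INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES users(id),
    title    TEXT NOT NULL
);
CREATE TABLE loans (
    id          INTEGER PRIMARY KEY,
    book_id     INTEGER NOT NULL REFERENCES books(id),
    borrower_id INTEGER NOT NULL REFERENCES users(id),
    lent_at     TEXT NOT NULL,
    returned_at TEXT              -- NULL while the book is still out
);
-- Index supporting "which books does this user currently hold?"
CREATE INDEX idx_loans_borrower ON loans(borrower_id, returned_at);
"""


def open_loans(conn, borrower_id):
    """Return titles the given user has borrowed and not yet returned."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM loans AS l
        JOIN books AS b ON b.id = l.book_id
        WHERE l.borrower_id = ? AND l.returned_at IS NULL
        ORDER BY b.title
        """,
        (borrower_id,),
    ).fetchall()
    return [title for (title,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO books (id, owner_id, title) VALUES (?, ?, ?)",
                 [(1, 1, "Dune"), (2, 1, "Emma")])
conn.executemany(
    "INSERT INTO loans (book_id, borrower_id, lent_at, returned_at) "
    "VALUES (?, ?, ?, ?)",
    [(1, 2, "2024-01-01", None), (2, 2, "2024-01-02", "2024-01-10")],
)
print(open_loans(conn, 2))
```

Keeping loans in their own table, rather than a `borrower_id` column on `books`, is one way to stay flexible if the requirements later add loan history or waiting lists.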
Understand the HTTP verbs and design the endpoints.
GraphQL questions are similar, and may come up if you list it on your resume.
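For the HTTP endpoint design mentioned above, one way to present an answer is a table mapping each verb and path to an operation. The sketch below assumes a hypothetical `books` resource; the paths, handler names, and the `resolve` helper are illustrative only and do not come from any real API.

```python
# Hypothetical endpoint table for a "books" resource.
ROUTES = {
    ("GET", "/books"): "list_books",            # safe and idempotent
    ("POST", "/books"): "create_book",          # not idempotent: creates a new resource
    ("GET", "/books/{id}"): "get_book",         # safe and idempotent
    ("PUT", "/books/{id}"): "replace_book",     # idempotent full replacement
    ("PATCH", "/books/{id}"): "update_book",    # partial update, not necessarily idempotent
    ("DELETE", "/books/{id}"): "delete_book",   # idempotent
}


def resolve(method, path):
    """Map a request to a handler name, or None if no route matches."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "books":
        template = "/books/{id}"        # second segment is treated as the ID
    else:
        template = "/" + "/".join(parts)
    return ROUTES.get((method.upper(), template))


print(resolve("GET", "/books"))
print(resolve("delete", "/books/42"))
```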
How to scale up a system: replicas, sharding, load balancers, caches, etc.
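Sharding is often discussed together with consistent hashing, which limits how many keys move when a node is added. The following is a minimal sketch of that technique, not something the notes above prescribe; the `HashRing` class, the node names, and the choice of 100 virtual nodes are assumptions.

```python
import bisect
import hashlib


def _hash(key):
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node is placed on the ring at `vnodes` points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("user:42"))
```

When a fourth node joins, each key either stays where it was or moves to the new node; keys never shuffle between the existing nodes.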
- Concurrency. Do you understand threads, deadlock, and starvation? Do you know how to parallelize algorithms? Do you understand consistency and coherence?
- Networking. Do you roughly understand IPC and TCP/IP? Do you know the difference between throughput and latency, and when each is the relevant factor?
- Abstraction. You should understand the systems you’re building upon. Do you know roughly how an OS, file system, and database work? Do you know about the various levels of caching in a modern OS?
- Real-World Performance. You should be familiar with the speed of everything your computer can do, including the relative performance of RAM, disk, SSD and your network.
- Estimation. Estimation, especially in the form of a back-of-the-envelope calculation, is important because it helps you narrow down the list of possible solutions to only the ones that are feasible. Then you have only a few prototypes or micro-benchmarks to write.
- Availability and Reliability. Are you thinking about how things can fail, especially in a distributed environment? Do you know how to design a system to cope with network failures? Do you understand durability?
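To make the estimation point concrete, here is a small back-of-the-envelope calculation for a URL shortener. Every input (100 million new URLs per month, a 100:1 read/write ratio, 500 bytes per record, 5 years of retention) is an assumed figure chosen for illustration, not a number from these notes.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # rough month length, good enough for estimates


def estimate(new_urls_per_month, read_write_ratio, bytes_per_record, years):
    """Return rough traffic and storage figures for the assumed workload."""
    writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH
    reads_per_sec = writes_per_sec * read_write_ratio
    total_records = new_urls_per_month * 12 * years
    storage_tb = total_records * bytes_per_record / 1e12
    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "total_records": total_records,
        "storage_tb": storage_tb,
    }


est = estimate(100_000_000, 100, 500, 5)
for name, value in est.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the workload is roughly 40 writes and 4,000 reads per second with about 3 TB of data, which already rules out some designs and suggests that reads, not writes, deserve the caching effort.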
- Design a distributed cache/hash (the most fundamental question; it can be a building block for other questions)
- top k: top k requests or videos or music
- Design a tinyurl service
- Design typeahead in search: this could be Google search or Facebook friend search, each with different optimizations
- Design a search engine (status search)
- Design load balancer
- Design live comments/Twitter feed/Facebook feed
- Design an elevator control system
- Design a distributed unique ID service
- System Design for Big Data-tinyurl
- What are best practices for building something like a News Feed?
- What are the scaling issues to keep in mind while developing a social network feed?
- Activity Feeds Architecture
- Efficient Computation of Frequent and Top-k Elements in Data Streams
- An Optimal Strategy for Monitoring Top-k Queries in Streaming Windows
- How to Create an Asynchronous Multiplayer Game
- How to Create an Asynchronous Multiplayer Game Part 2: Saving the Game State to Online Database
- How to Create an Asynchronous Multiplayer Game Part 3: Loading Games from the Database
- How to Create an Asynchronous Multiplayer Game Part 4: Matchmaking
- Real Time Multiplayer in HTML5
- Building out the infrastructure for Graph Search
- Indexing and ranking in Graph Search
- The natural language interface of Graph Search and Erlang at Facebook
- Implementing Real-Time Trending Topics With a Distributed Rolling Count Algorithm in Storm
- Early detection of Twitter trends explained
- How would you design the feature in LinkedIn where it computes how many hops there are between you and another person?
- If you were to design a web platform for online chess games, how would you do that?
- Design an ID allocator which can allocate and de-allocate from a range of 1-1,000,000
- Design and implement a web crawler (single- and multi-threaded)
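As one possible starting point for the ID allocator question in the list above, the sketch below hands out fresh IDs in increasing order and reuses released ones. The class name, method names, and error handling are assumptions; it is single-threaded and keeps allocated IDs in a set, which an interviewer might ask you to replace with a bitmap to save memory.

```python
class IdAllocator:
    """Allocate and release integer IDs from an inclusive range (sketch)."""

    def __init__(self, low=1, high=1_000_000):
        self._high = high
        self._next = low        # smallest ID that has never been handed out
        self._released = []     # stack of IDs that were returned for reuse
        self._in_use = set()    # IDs currently allocated

    def allocate(self):
        if self._released:
            new_id = self._released.pop()
        elif self._next <= self._high:
            new_id = self._next
            self._next += 1
        else:
            raise RuntimeError("no free IDs")
        self._in_use.add(new_id)
        return new_id

    def release(self, id_):
        if id_ not in self._in_use:
            raise ValueError(f"ID {id_} is not allocated")
        self._in_use.remove(id_)
        self._released.append(id_)


allocator = IdAllocator()
first = allocator.allocate()
second = allocator.allocate()
allocator.release(first)
print(first, second, allocator.allocate())
```

Both operations run in O(1) time. Rejecting a release of an unallocated ID guards against the same ID being handed to two callers.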
The most straightforward way
Long value; value++
- Thread-safe? No.
- Thread-safe? Yes.
- Safe under network failure? No.
- Double increments: the receiver gets the increment request and applies it, but the sender sees a timeout or network failure and resends the request, so the increment is applied twice.
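One common way to guard against the double-increment problem described above is to make the operation idempotent: the sender attaches a unique request ID and the receiver ignores IDs it has already applied. The sketch below assumes that approach; the `IdempotentCounter` class and its `request_id` parameter are hypothetical, and the set of seen IDs grows without bound here, whereas a real service would expire old entries.

```python
import threading


class IdempotentCounter:
    """Counter whose increments are safe to retry (illustrative sketch)."""

    def __init__(self):
        self._value = 0
        self._seen = set()              # request IDs already applied
        self._lock = threading.Lock()   # makes check-and-increment atomic

    def increment(self, request_id):
        """Apply the increment once per request_id; return the current value."""
        with self._lock:
            if request_id not in self._seen:
                self._seen.add(request_id)
                self._value += 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


counter = IdempotentCounter()
counter.increment("req-1")
counter.increment("req-1")   # retry after a timeout: ignored
counter.increment("req-2")
print(counter.value)
```

The lock covers thread safety on a single machine, while the request ID covers retries across the network; the two problems are separate and need separate fixes.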
Use a streaming solution such as Spark Streaming: store the counts in an RDD and increment them in a reduce step.
- Grokking the System Design Interview: https://www.educative.io/collection/5668639101419520/5649050225344512
- https://www.interviewbit.com/problems/search-typeahead/
- Typeahead: https://www.facebook.com/notes/facebook-engineering/the-life-of-a-typeahead-query/389105248919/