How to crack a System Design Interview Series— Design a Unique ID Generator in Distributed Systems
Suppose you are asked to design a unique ID generator in distributed system. Your first thought might be to use a primary key with the auto_increment attribute in a relational database. However, auto_increment does not work in a distributed environment because a single database server is not large enough and generating unique IDs across multiple databases with minimal delay is challenging.
Few examples of unique IDs:
Step 1 — Understand the problem and establish the design scope
Asking clarification questions is the first step to tackle any system design interview question. Here is an example of few clarification questions:
Candidate: What are the characteristics of unique IDs?
Interviewer: IDs must be unique and sortable.
Candidate: For each new record, does ID increment by 1?
Interviewer: The ID increments by time but not necessarily only increments by 1. IDs created in the evening are larger than those created in the morning on the same day.
Candidate: Do IDs only contain numerical values?
Interviewer: Yes, that is correct.
Candidate: What is the ID length requirement?
Interviewer: IDs should fit into 64-bit.
Candidate: What is the scale of the system?
Interviewer: The system should be able to generate 10k IDs per second.
Above are some of the sample questions that you can ask your
interviewer. It is important to understand the requirements and clarify
ambiguities. For this interview question, the requirements are listed as
follows:
- IDs must be unique.
- IDs are numerical values only.
- IDs fit into 64-bit.
- IDs are ordered by date.
- Ability to generate over 10,000 unique [Ds per second.
Step 2 — Propose high-level design and get buy-in
Multiple options can be used to generate unique IDs in distributed systems. The options we considered are:
- Multi-master replication
- Universally unique identifier (UUID)
- Ticket server
- Twitter Snowflake approach
Press the follow button and be a master software developer by getting reading a tech article daily.
Step 3 — Design Deep Dive
Each approach is explained in detail individually.
Step 4 — Smooth Wrap up
We discussed different approaches to design a unique ID generator: multi-master replication, UUID, Ticket Server and Twitter SnowFlake-like unique ID generator. We settled on SnowFlake as it supports all our use cases and is scalable in a distributed environment.
If there is extra time at the end of the interview, here are few additional talking points:
- Clock Synchronization — In our design, we assume ID generation servers have the same clock. This assumption might not be true when a server is running on multiple cores. The same challenge exists in multi-machine scenarios. You don’t need to discuss the solutions to this problem, it’ll increase the complexity of the interview and might set a different path of discussion altogether. Network Time Protocol is the most popular solution to the this problem. For interested readers, refer to the reference materials below.
- Section Length Tuning — For example, fewer sequence numbers but more timestamp bits are effective for low concurrency and long term applications.
- High Availability — Since an ID generator is a very critical system and many different systems depend on it, it must be highly available.
If you have reached this far. Sit back and relax. Congratulations!!! You have cracked the interview. Great job!!!
References —
Please, follow #tech-granth