System Design Interview: URL Shortener
Complete system design interview question for building a URL shortener like bit.ly, including requirements gathering, architecture, and trade-offs.
Design a URL shortening service like bit.ly or TinyURL. This is a classic system design interview question that tests understanding of scalability, distributed systems, and trade-offs.
Interview Format (45 minutes)
Time Allocation:
- Requirements gathering: 5-8 minutes
- High-level design: 10-15 minutes
- Deep dive: 15-20 minutes
- Scale and edge cases: 5-10 minutes
Step 1: Requirements Gathering (5-8 min)
A strong candidate will ask clarifying questions before jumping to solutions.
Functional Requirements
Good questions to ask:
- What's the primary use case? (shortening URLs)
- Do shortened URLs expire? (yes, configurable TTL)
- Can users customize short URLs? (yes, but most are auto-generated)
- Do we need analytics? (yes, click tracking)
- Do we need user accounts? (optional, focus on core first)
Agreed requirements:
- Given a long URL, return a short URL
- Given a short URL, redirect to original long URL
- URLs expire after configurable time (default: 1 year)
- Track basic analytics (click count)
- Custom short codes (optional)
Non-Functional Requirements
Good questions to ask:
- Expected scale? (100M URLs, 10B redirects/month)
- Read/write ratio? (read-heavy, 100:1)
- Latency requirements? (redirects <100ms)
- Availability requirements? (99.9% uptime)
- Geographic distribution? (global)
Agreed requirements:
- High availability (redirects critical)
- Low latency (<100ms for redirects)
- Scalable (10B redirects/month = ~3,800 req/sec average, 20K peak)
- Durable (URLs shouldn't be lost)
Calculations
Storage:
100M URLs/year
500 bytes per URL record (original URL + metadata)
500 bytes × 100M = 50GB/year
Over 5 years = 250GB (very manageable)
Bandwidth:
10B redirects/month
Assume 200 byte response
10B × 200 bytes = 2TB/month = 66GB/day
Peak: 20K req/sec × 200 bytes = 4MB/sec
QPS:
Average: 10B/month = ~3,800 req/sec
Peak (assume ~5x average): ~20,000 req/sec
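These back-of-envelope numbers are easy to sanity-check in a few lines (a sketch; the 500-byte record and 200-byte response sizes are the assumptions agreed above):

```python
# Back-of-envelope capacity estimation for the agreed numbers.
URLS_PER_YEAR = 100_000_000
BYTES_PER_RECORD = 500               # original URL + metadata (assumption above)
REDIRECTS_PER_MONTH = 10_000_000_000
BYTES_PER_RESPONSE = 200             # assumption above
SECONDS_PER_MONTH = 30 * 24 * 3600

storage_per_year_gb = URLS_PER_YEAR * BYTES_PER_RECORD / 1e9      # 50 GB
storage_5y_gb = storage_per_year_gb * 5                           # 250 GB
avg_qps = REDIRECTS_PER_MONTH / SECONDS_PER_MONTH                 # ~3,858
bandwidth_gb_per_day = REDIRECTS_PER_MONTH * BYTES_PER_RESPONSE / 30 / 1e9

print(f"Storage/year: {storage_per_year_gb:.0f} GB")
print(f"Storage/5y:   {storage_5y_gb:.0f} GB")
print(f"Average QPS:  {avg_qps:,.0f}")
print(f"Bandwidth:    {bandwidth_gb_per_day:.0f} GB/day")
```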
Red flags if candidate:
- Doesn't ask clarifying questions
- Jumps straight to implementation details
- Doesn't think about scale
- Ignores read/write ratio
Step 2: High-Level Design (10-15 min)
API Design
POST /api/shorten
Request:
{
  "url": "https://example.com/very/long/url",
  "customCode": "my-link",   // optional
  "expiresAt": "2026-01-01"  // optional
}
Response:
{
  "shortUrl": "https://short.ly/abc123",
  "code": "abc123",
  "expiresAt": "2026-01-01T00:00:00Z"
}
GET /{code}
Response: 302 Redirect to original URL
Good candidate discusses:
- RESTful API design
- HTTP 302 (temporary) vs 301 (permanent) redirect: 302 keeps every click hitting our servers so analytics work; 301 lets browsers cache the mapping, cutting load but losing click counts
- Error handling (expired, not found)
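The two endpoints can be sketched without a framework to make the status codes concrete (the in-memory `store` and the hard-coded `"abc123"` are illustrative stand-ins for the database and the code generator discussed later):

```python
# Framework-free sketch of the two handlers; return values model
# (status, headers_or_body). Names here are illustrative placeholders.
store = {}  # in-memory stand-in for the urls table

def shorten(url, custom_code=None):
    code = custom_code or "abc123"  # real codes come from the ID generator
    if code in store:
        return 409, {"error": "code already exists"}
    store[code] = url
    return 201, {"shortUrl": f"https://short.ly/{code}", "code": code}

def follow(code):
    long_url = store.get(code)
    if long_url is None:
        return 404, {}  # unknown or expired code
    # 302 (not 301) so browsers don't cache the hop and clicks stay countable
    return 302, {"Location": long_url}
```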
Core Components
┌─────────────┐
│ Client │
└──────┬──────┘
│
┌───▼────────┐
│ Load │
│ Balancer │
└───┬────────┘
│
┌───▼──────────────┐
│ API Servers │
│ (Stateless) │
└───┬──────────────┘
│
┌───▼───────────────┐
│ Database │
│ (URL mappings) │
└───────────────────┘
Tables:
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count INTEGER DEFAULT 0,
    user_id INTEGER  -- optional
);
-- The UNIQUE constraint on short_code already creates the lookup index;
-- only the expiry sweep needs an extra (partial) index.
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;
Step 3: Deep Dive (15-20 min)
Short Code Generation
Approach 1: Random + Collision Check
import random
import string

def generate_short_code(length=6):
    characters = string.ascii_letters + string.digits  # 62 characters
    while True:
        code = ''.join(random.choices(characters, k=length))
        # Retry until the code is unused
        if not db.exists(code):
            return code
Pros:
- Simple to implement
- Evenly distributed
Cons:
- Requires database check (latency)
- Collision rate increases over time
- Not suitable for high-write scenarios
Math:
62^6 = ~57 billion possible codes
After 100M URLs (< 0.2% full), collision probability is low
But still requires DB check
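Both claims can be checked directly: the chance that any single new code collides is n/N, while the birthday bound shows that across 100M inserts some collision is near-certain, which is why the existence check cannot be skipped (a sketch):

```python
import math

N = 62 ** 6          # ~5.68e10 possible 6-character codes
n = 100_000_000      # URLs stored

# Chance the *next* random code is already taken: n/N (~0.18%)
per_insert = n / N

# Birthday bound: P(at least one collision across all n inserts)
# ~= 1 - exp(-n^2 / 2N), which is effectively 1 here
p_collision = 1 - math.exp(-n * n / (2 * N))

print(f"per-insert: {per_insert:.3%}, any collision: {p_collision:.3f}")
```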
Approach 2: Auto-incrementing ID + Base62 Encoding
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(num):
    if num == 0:
        return BASE62[0]
    result = []
    while num > 0:
        result.append(BASE62[num % 62])
        num //= 62
    return ''.join(reversed(result))

def decode_base62(code):
    num = 0
    for char in code:
        num = num * 62 + BASE62.index(char)
    return num

# Example
id = 12345
code = encode_base62(id)  # "3d7"
Pros:
- No collisions
- No database lookup needed
- Deterministic
Cons:
- Sequential (predictable)
- Exposes number of URLs
- Doesn't work with custom codes
Approach 3: Hash Function (MD5/SHA256)
import hashlib

def generate_short_code(url):
    # Generate hash
    hash_digest = hashlib.md5(url.encode()).hexdigest()
    # Take first 6 characters
    code = hash_digest[:6]
    # Handle collisions by appending an incrementing suffix
    counter = 0
    while db.exists(code):
        counter += 1
        code = hash_digest[:6] + str(counter)
    return code
Pros:
- Same URL always gets same code
- Can deduplicate URLs
Cons:
- Collisions possible
- Still need database check
- Longer codes to reduce collisions
Recommended Approach: Combination
def generate_short_code(url, custom_code=None):
    if custom_code:
        # Validate and use custom code
        if db.exists(custom_code):
            raise ValueError("Custom code already exists")
        return custom_code
    # Use auto-increment + base62 for generated codes
    next_id = get_next_id()  # from ID generator service
    return encode_base62(next_id)
Strong candidate discusses:
- Trade-offs of each approach
- Collision handling
- Scalability implications
- Custom code support
ID Generation at Scale
Problem: Single database auto-increment doesn't scale
Solution: Distributed ID Generation
Option 1: UUID
import uuid

id = uuid.uuid4()  # 128-bit

Pros: No coordination needed
Cons: 128 bits encode to ~22 base62 characters, far too long for a short URL
Option 2: Twitter Snowflake
64-bit ID structure:
[1 bit unused][41 bits timestamp][10 bits machine ID][12 bits sequence]
Allows:
- 2^41 milliseconds (~69 years)
- 2^10 machines (1024)
- 2^12 IDs per millisecond per machine (4096)
class SnowflakeIDGenerator:
    def __init__(self, machine_id):
        self.machine_id = machine_id
        self.sequence = 0
        self.last_timestamp = -1

    def generate_id(self):
        timestamp = current_timestamp_ms()
        if timestamp == self.last_timestamp:
            self.sequence = (self.sequence + 1) & 0xFFF
            if self.sequence == 0:
                # Sequence exhausted: wait for next millisecond
                timestamp = wait_next_millis(timestamp)
        else:
            self.sequence = 0
        self.last_timestamp = timestamp
        return ((timestamp << 22) |
                (self.machine_id << 12) |
                self.sequence)
Option 3: Pre-generated Range
# Coordination service assigns ranges
# Server 1: 1-100,000
# Server 2: 100,001-200,000

class RangeIDGenerator:
    RANGE_SIZE = 100_000

    def __init__(self):
        self.current = get_range_from_coordinator()  # e.g. 1
        self.end = self.current + self.RANGE_SIZE

    def generate_id(self):
        if self.current >= self.end:
            # Range exhausted: get a new one
            self.current = get_range_from_coordinator()
            self.end = self.current + self.RANGE_SIZE
        next_id = self.current  # return before incrementing, or the first ID is skipped
        self.current += 1
        return next_id
Best for URL shortener: Snowflake or Range-based
Caching Strategy
Read-heavy workload (100:1 ratio) → Perfect for caching
Cache hit ratio target: >99%
Implementation:
def redirect(short_code):
    # Check cache first
    long_url = cache.get(f"url:{short_code}")
    if long_url is None:
        # Cache miss: fetch from database
        url_record = db.get_by_code(short_code)
        if url_record is None or url_record.expired():
            return 404
        long_url = url_record.long_url
        # Cache for 24 hours
        cache.set(f"url:{short_code}", long_url, ttl=86400)
    # Async analytics update
    queue.publish("click", {
        "code": short_code,
        "timestamp": now()
    })
    return redirect_to(long_url)
Cache invalidation:
# When URL expires or is deleted
def delete_url(short_code):
    db.delete(short_code)
    cache.delete(f"url:{short_code}")
Strong candidate discusses:
- Cache-aside pattern
- TTL strategy
- Invalidation approach
- Cache warming for hot URLs
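Cache warming from the points above can be sketched in a few lines (a dict stands in for Redis here, and `top_urls`, the "top N by click_count" query result, is a hypothetical placeholder):

```python
# Cache-warming sketch: preload the hottest codes before taking traffic.
cache = {}  # dict stand-in for Redis

def warm_cache(top_urls, limit=3):
    """Preload the most-clicked (code, long_url) pairs into the cache."""
    for code, long_url in top_urls[:limit]:
        cache[f"url:{code}"] = long_url
    return len(cache)

# Stand-in for "SELECT short_code, long_url ORDER BY click_count DESC"
top_urls = [("abc123", "https://example.com/a"),
            ("xyz789", "https://example.com/b"),
            ("def456", "https://example.com/c"),
            ("cold01", "https://example.com/d")]
warmed = warm_cache(top_urls)  # only the top 3 are preloaded
```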
Analytics
Challenge: Updating click_count on every request kills performance
Solution 1: Async Updates
from collections import Counter

# Write to queue instead of DB
def track_click(short_code):
    queue.publish("analytics", {
        "code": short_code,
        "timestamp": now(),
        "user_agent": request.user_agent,
        "ip": request.ip
    })

# Consumer batches updates
def analytics_consumer():
    while True:
        messages = queue.consume(batch_size=1000)
        # Batch update: one DB write per 1,000 clicks
        updates = Counter(msg["code"] for msg in messages)
        db.batch_update_click_counts(updates)
Solution 2: Write to Time-Series DB
# Use InfluxDB or TimescaleDB
def track_click(short_code):
    influxdb.write_point(
        measurement="clicks",
        tags={"code": short_code},
        fields={"count": 1},
        timestamp=now()
    )

# Query for analytics (use bind parameters in production, not f-strings)
def get_clicks(short_code, start_date, end_date):
    return influxdb.query(f"""
        SELECT SUM(count)
        FROM clicks
        WHERE code = '{short_code}'
          AND time >= '{start_date}'
          AND time <= '{end_date}'
    """)
Step 4: Scale and Edge Cases (5-10 min)
Scaling to Billions
Updated Architecture:
┌────────────────────────────────────────────────┐
│ CDN / Cloudflare │
└────────────────────┬───────────────────────────┘
│
┌────────────────────▼───────────────────────────┐
│ Load Balancer (Global) │
└────────────────────┬───────────────────────────┘
│
┌────────────┼────────────┐
│ │ │
┌────▼────┐ ┌───▼─────┐ ┌──▼──────┐
│ API │ │ API │ │ API │
│Server 1 │ │Server 2 │ │Server 3 │
└────┬────┘ └───┬─────┘ └──┬──────┘
│ │ │
└────────┬──┴───────────┘
│
┌────────▼──────────┐
│ Redis Cluster │
│ (Cache Layer) │
└────────┬──────────┘
│
┌────────▼──────────┐
│ PostgreSQL │
│ (Sharded) │
│ - Shard 1: a-m │
│ - Shard 2: n-z │
└───────────────────┘
Database Sharding:
import zlib

def get_shard(short_code):
    # Hash-based sharding; use a stable hash such as CRC32, since
    # Python's built-in hash() varies between processes
    shard_count = 16
    shard_id = zlib.crc32(short_code.encode()) % shard_count
    return f"shard_{shard_id}"

def get_url(short_code):
    shard = get_shard(short_code)
    return db_connections[shard].get(short_code)
Rate Limiting
from redis import Redis

def check_rate_limit(ip_address):
    key = f"ratelimit:{ip_address}:{current_minute()}"
    count = redis.incr(key)
    if count == 1:
        redis.expire(key, 60)  # first hit this minute: start the window
    if count > 100:  # 100 requests per minute
        raise RateLimitExceeded()
Security Considerations
1. Malicious URLs
BLACKLIST_DOMAINS = ["malware.com", "phishing.com"]

def validate_url(url):
    domain = extract_domain(url)
    if domain in BLACKLIST_DOMAINS:
        raise ValueError("URL not allowed")
    # Check against safe browsing API
    if not safe_browsing_check(url):
        raise ValueError("URL flagged as unsafe")
2. Abuse Prevention
# Limit URLs per IP
def check_creation_limit(ip_address):
    key = f"create:{ip_address}:{current_day()}"
    count = redis.incr(key)
    if count == 1:
        redis.expire(key, 86400)
    if count > 10:  # 10 URLs per day per IP
        raise TooManyURLs()
3. DDoS Protection
- Rate limiting
- CAPTCHA for suspicious traffic
- WAF (Web Application Firewall)
Monitoring and Alerts
Metrics to track (with targets):
- api.latency.p95 < 100 ms
- api.error_rate < 0.1%
- cache.hit_rate > 99%
- db.query_time.p95 < 50 ms
- redirect.success_rate > 99.9%
Example alerts:
- Cache hit rate < 95% → "Cache hit rate dropped"
- p95 latency > 200 ms → "Latency degraded"
Evaluation Rubric
Strong Performance (Hire)
- ✅ Asks clarifying questions
- ✅ Considers scale from the start
- ✅ Discusses multiple approaches with trade-offs
- ✅ Designs scalable solution
- ✅ Handles edge cases
- ✅ Thinks about operations (monitoring, debugging)
- ✅ Clear communication and diagrams
Adequate Performance (Maybe)
- ✅ Functional design
- ✅ Some scaling considerations
- ⚠️ Limited exploration of alternatives
- ⚠️ Misses some edge cases
- ✅ Can be guided to better solutions
Weak Performance (No Hire)
- ❌ Jumps to solution without requirements
- ❌ Doesn't consider scale
- ❌ Can't explain trade-offs
- ❌ Monolithic thinking
- ❌ Poor communication
Follow-up Questions
For senior candidates:
- How would you handle 1 trillion URLs?
- Design the analytics system in detail
- How would you migrate this system with zero downtime?
- How would you handle geographic distribution?
For staff+ candidates:
- Design the organization structure for this service
- How would you ensure 99.99% uptime?
- Design the disaster recovery plan
- How would you handle compliance (GDPR, data retention)?
This question tests foundational distributed systems concepts: caching, sharding, ID generation, and trade-offs. A strong candidate will navigate these topics methodically while engaging in technical discussion.