Building a Weighted Load Balancer in SQL (FAANG-Style)

Weighted load balancing is a core systems problem across large-scale platforms.

It shows up in traffic routing, shard assignment, experiment bucketing, and capacity planning — and it’s a common pattern tested in FAANG SQL interviews.

In this post, we’ll model capacity-aware request distribution using pure SQL, focusing on correctness, scalability, and clean logic — not brute-force tricks.




Problem Statement

You’re given a set of servers with different capacities:

server_idweight
S15
S23
S32

A higher weight means the server can handle more traffic.

Requirement

For every incoming request:

Traffic should be distributed proportionally to server capacity

Out of every 10 requests:

  • S1 gets 5
  • S2 gets 3
  • S3 gets 2

The solution must be deterministic and scalable
  • ❌ No row expansion by weight


Why Not the Naive Approach?

A common (but flawed) solution is to duplicate rows by weight:

S1 S1 S1 S1 S1 S2 S2 S2 S3 S3

This:

  • explodes row counts
  • doesn’t scale when weights are large
  • is not how real systems think about routing

Instead, we’ll use ranges + modulo logic.


Key Idea (High Level)

  1. Convert server weights into contiguous slot ranges
  2. Convert each request into a repeating slot number
  3. Route the request to the server whose range contains that slot

This mirrors how real load balancers reason about traffic — just expressed in SQL.


Dataset (SQL Server)

DROP TABLE IF EXISTS #servers; CREATE TABLE #servers ( server_id VARCHAR(10) PRIMARY KEY, weight INT NOT NULL CHECK (weight > 0) ); INSERT INTO #servers (server_id, weight) VALUES ('S1', 5), ('S2', 3), ('S3', 2); DROP TABLE IF EXISTS #requests; CREATE TABLE #requests ( request_id INT PRIMARY KEY, request_time DATETIME2(0) NOT NULL, user_id INT NOT NULL ); INSERT INTO #requests (request_id, request_time, user_id) VALUES ( 1, '2026-01-01 10:00:01', 101), ( 2, '2026-01-01 10:00:02', 102), ( 3, '2026-01-01 10:00:03', 103), ( 4, '2026-01-01 10:00:04', 104), ( 5, '2026-01-01 10:00:05', 105), ( 6, '2026-01-01 10:00:06', 106), ( 7, '2026-01-01 10:00:07', 107), ( 8, '2026-01-01 10:00:08', 108), ( 9, '2026-01-01 10:00:09', 109), (10, '2026-01-01 10:00:10', 110), (11, '2026-01-01 10:00:11', 111), (12, '2026-01-01 10:00:12', 112);

Step 1: Build Server Slot Ranges (Using Window Functions)

First, we turn weights into ranges.

Logic

  • end_slot = running sum of weights
  • start_slot = end_slot - weight + 1 (math trick)

SELECT server_id, weight, start_slot = SUM(weight) OVER (ORDER BY server_id) - weight + 1, end_slot = SUM(weight) OVER (ORDER BY server_id) FROM #servers ORDER BY server_id;

Result

server_idstart_slotend_slot
S115
S268
S3910

These ranges are non-overlapping and contiguous, which is critical.


Step 2: Assign Each Request a Slot (Modulo Logic)

Next, we convert each request into a slot number from 1 to total_weight.

Formula

slot = ((request_id - 1) % total_weight) + 1

Why this works:

  • % (modulo) creates a repeating cycle
  • subtracting 1 aligns counting with modulo’s 0-based behavior
  • adding 1 shifts back to human-friendly slots

Example

request_idslot
11
22
......
1010
111
122

Final Query: Weighted Routing (Single CTE, Clean Version)

This is the FAANG-style final solution.

WITH routed_requests AS ( SELECT r.request_id, r.request_time, r.user_id, -- Convert request into repeating slot slot = ((r.request_id - 1) % (SELECT SUM(CAST(weight AS BIGINT)) FROM #servers) ) + 1, s.server_id AS routed_server FROM #requests r JOIN ( SELECT server_id, weight, end_slot = SUM(CAST(weight AS BIGINT)) OVER (ORDER BY server_id), start_slot = SUM(CAST(weight AS BIGINT)) OVER (ORDER BY server_id) - CAST(weight AS BIGINT) + 1 FROM #servers ) s ON ( ((r.request_id - 1) % (SELECT SUM(CAST(weight AS BIGINT)) FROM #servers) ) + 1 ) BETWEEN s.start_slot AND s.end_slot ) SELECT request_id, request_time, user_id, slot, routed_server FROM routed_requests ORDER BY request_id;

Why This Solution Is Interview-Grade

  • ✅ No row expansion
  • ✅ Deterministic routing
  • ✅ Uses window functions correctly
  • ✅ Scales to large weights
  • ✅ Mirrors real-world system design

This is systems thinking expressed in SQL.


Common Interview Follow-Ups

  • What happens if a server goes down?
  • → Recompute ranges using only active servers.
  • How do you minimize reshuffling when servers change?
  • → Move to consistent hashing.
  • Why not use NTILE?
  • → NTILE distributes evenly, not proportionally by weight.

Final Takeaway

SQL isn’t just for reporting.

When used correctly, it’s a powerful way to model distribution, fairness, and correctness — exactly what FAANG interviews look for.

If you understand this pattern, you’re not just writing SQL —
you’re reasoning like a systems engineer.

Post a Comment

Previous Post Next Post

Contact Form