Building a Weighted Load Balancer in SQL (FAANG-Style)

Weighted load balancing is a core systems problem across large-scale platforms.

It shows up in traffic routing, shard assignment, experiment bucketing, and capacity planning — and it’s a common pattern tested in FAANG SQL interviews.

In this post, we’ll model capacity-aware request distribution using pure SQL, focusing on correctness, scalability, and clean logic — not brute-force tricks.

Problem Statement

You’re given a set of servers with different capacities:

server_id	weight
S1	5
S2	3
S3	2

A higher weight means the server can handle more traffic.

Requirement

For every incoming request:

Traffic should be distributed proportionally to server capacity

Out of every 10 requests:

S1 gets 5
S2 gets 3
S3 gets 2

The solution must be deterministic and scalable

❌ No row expansion by weight

Why Not the Naive Approach?

A common (but flawed) solution is to duplicate rows by weight:


S1 S1 S1 S1 S1 S2 S2 S2 S3 S3

This:

explodes row counts
doesn’t scale when weights are large
is not how real systems think about routing

Instead, we’ll use ranges + modulo logic.

Key Idea (High Level)

Convert server weights into contiguous slot ranges
Convert each request into a repeating slot number
Route the request to the server whose range contains that slot

This mirrors how real load balancers reason about traffic — just expressed in SQL.

Dataset (SQL Server)


DROP TABLE IF EXISTS #servers;
CREATE TABLE #servers (
    server_id VARCHAR(10) PRIMARY KEY,
    weight    INT NOT NULL CHECK (weight > 0)
);

INSERT INTO #servers (server_id, weight)
VALUES
('S1', 5),
('S2', 3),
('S3', 2);


DROP TABLE IF EXISTS #requests;
CREATE TABLE #requests (
    request_id   INT PRIMARY KEY,
    request_time DATETIME2(0) NOT NULL,
    user_id      INT NOT NULL
);

INSERT INTO #requests (request_id, request_time, user_id)
VALUES
( 1, '2026-01-01 10:00:01', 101),
( 2, '2026-01-01 10:00:02', 102),
( 3, '2026-01-01 10:00:03', 103),
( 4, '2026-01-01 10:00:04', 104),
( 5, '2026-01-01 10:00:05', 105),
( 6, '2026-01-01 10:00:06', 106),
( 7, '2026-01-01 10:00:07', 107),
( 8, '2026-01-01 10:00:08', 108),
( 9, '2026-01-01 10:00:09', 109),
(10, '2026-01-01 10:00:10', 110),
(11, '2026-01-01 10:00:11', 111),
(12, '2026-01-01 10:00:12', 112);

Step 1: Build Server Slot Ranges (Using Window Functions)

First, we turn weights into ranges.

Logic

end_slot = running sum of weights
start_slot = end_slot - weight + 1 (math trick)


SELECT
    server_id,
    weight,
    start_slot = SUM(weight) OVER (ORDER BY server_id) - weight + 1,
    end_slot   = SUM(weight) OVER (ORDER BY server_id)
FROM #servers
ORDER BY server_id;

Result

server_id	start_slot	end_slot
S1	1	5
S2	6	8
S3	9	10

These ranges are non-overlapping and contiguous, which is critical.

Step 2: Assign Each Request a Slot (Modulo Logic)

Next, we convert each request into a slot number from 1 to total_weight.

Formula


slot = ((request_id - 1) % total_weight) + 1

Why this works:

% (modulo) creates a repeating cycle
subtracting 1 aligns counting with modulo’s 0-based behavior
adding 1 shifts back to human-friendly slots

Example

request_id	slot
1	1
2	2
...	...
10	10
11	1
12	2

Final Query: Weighted Routing (Single CTE, Clean Version)

This is the FAANG-style final solution.


WITH routed_requests AS (
    SELECT
        r.request_id,
        r.request_time,
        r.user_id,

        -- Convert request into repeating slot
        slot =
            ((r.request_id - 1)
              % (SELECT SUM(CAST(weight AS BIGINT)) FROM #servers)
            ) + 1,

        s.server_id AS routed_server
    FROM #requests r
    JOIN (
        SELECT
            server_id,
            weight,
            end_slot   = SUM(CAST(weight AS BIGINT)) OVER (ORDER BY server_id),
            start_slot = SUM(CAST(weight AS BIGINT)) OVER (ORDER BY server_id)
                         - CAST(weight AS BIGINT) + 1
        FROM #servers
    ) s
      ON (
           ((r.request_id - 1)
             % (SELECT SUM(CAST(weight AS BIGINT)) FROM #servers)
           ) + 1
         ) BETWEEN s.start_slot AND s.end_slot
)
SELECT
    request_id,
    request_time,
    user_id,
    slot,
    routed_server
FROM routed_requests
ORDER BY request_id;

Why This Solution Is Interview-Grade

✅ No row expansion
✅ Deterministic routing
✅ Uses window functions correctly
✅ Scales to large weights
✅ Mirrors real-world system design

This is systems thinking expressed in SQL.

Common Interview Follow-Ups

What happens if a server goes down?
→ Recompute ranges using only active servers.
How do you minimize reshuffling when servers change?
→ Move to consistent hashing.
Why not use NTILE?
→ NTILE distributes evenly, not proportionally by weight.

Final Takeaway

SQL isn’t just for reporting.

When used correctly, it’s a powerful way to model distribution, fairness, and correctness — exactly what FAANG interviews look for.

If you understand this pattern, you’re not just writing SQL —
you’re reasoning like a systems engineer.