Understanding employee-manager relationships is crucial for building organizational insights. What if you need more than just an employee's immediate manager, like an entire chain of command? SQL makes it possible. In this tutorial, we'll explore how to use recursive queries to uncover every layer of an employee-manager hierarchy, step by step.
The Goal: Mapping the Complete Hierarchy
Typically, an employee table contains details like EmployeeID
, EmployeeName
, and ManagerID
. While a simple self-join helps identify an employee's immediate manager, it falls short when you need to explore multiple levels of hierarchy.
For example, Employee A reports to Manager B, and Manager B reports to Leader C. A self-join only fetches the immediate connection (A → B). But what if you need to trace A's hierarchy all the way to Leader C? This is where recursive queries shine.
Hierarchy Levels Explained
In an organization's structure, hierarchy levels can be defined as:
- Level 0: The top boss without any manager above them.
- Level 1: Employees reporting directly to the top boss.
- Level 2: Employees reporting to Level 1 managers.
- Level 3+: Subsequent levels of reporting, and so on.
Step 1: A Quick Recap of Self-Joins
Before jumping into recursion, let’s revisit how self-joins work to find an employee’s manager.
SELECT emp.EmployeeName AS Employee,
mgr.EmployeeName AS Manager
FROM Employee AS emp
LEFT JOIN Employee AS mgr
ON emp.ManagerID = mgr.EmployeeID;
This query links the table to itself using the ManagerID
from one instance and the EmployeeID
from the other. It fetches the manager of each employee but doesn't dive deeper into the hierarchy.
Step 2: Setting Up for Recursive Queries
Recursive queries rely on a Common Table Expression (CTE), which breaks down large operations into smaller, logical chunks. Here’s how we can start:
- Base Case: Identify the topmost manager (Level 0). This is anyone with a
NULL
ManagerID
.
WITH EmployeeManagerCTE AS (
SELECT EmployeeID,
EmployeeName,
ManagerID,
0 AS EmployeeLevel
FROM Employee
WHERE ManagerID IS NULL
)
This part selects the top-level boss, assigning them EmployeeLevel = 0
.
Step 3: Writing the Recursive Part
Now, for the hierarchy beneath the top boss, recursion comes into play. We use the output of the CTE
to repeatedly fetch employees reporting to the managers identified in the previous step.
Here’s how it looks:
UNION ALL
SELECT emp.EmployeeID,
emp.EmployeeName,
emp.ManagerID,
mgr.EmployeeLevel + 1 AS EmployeeLevel
FROM Employee AS emp
INNER JOIN EmployeeManagerCTE AS mgr
ON emp.ManagerID = mgr.EmployeeID
How It Works:
- The
INNER JOIN
ties each employee’sManagerID
to theEmployeeID
from the last CTE output. - The
EmployeeLevel
increases by1
with each step further down the hierarchy.
Step 4: Retrieving the Results
Finally, we query the CTE to retrieve all employees with their respective hierarchy levels:
SELECT *
FROM EmployeeManagerCTE
ORDER BY EmployeeLevel;
Complete Query
Putting it all together, the full recursive query looks like this:
WITH EmployeeManagerCTE AS (
SELECT EmployeeID,
EmployeeName,
ManagerID,
0 AS EmployeeLevel
FROM Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT emp.EmployeeID,
emp.EmployeeName,
emp.ManagerID,
mgr.EmployeeLevel + 1 AS EmployeeLevel
FROM Employee AS emp
INNER JOIN EmployeeManagerCTE AS mgr
ON emp.ManagerID = mgr.EmployeeID
)
SELECT *
FROM EmployeeManagerCTE
ORDER BY EmployeeLevel;
Interpreting Results
The output will provide:
- Top-level bosses (
EmployeeLevel = 0
). - Direct reports to them (
EmployeeLevel = 1
). - Employees reporting to Level 1 managers (
EmployeeLevel = 2
). - And so on.
For instance:
EmployeeID | EmployeeName | ManagerID | EmployeeLevel |
---|---|---|---|
7 | Jenny | NULL | 0 |
5 | Samuel | 7 | 1 |
6 | Mark | 7 | 1 |
4 | Richard | 6 | 2 |
2 | Hillary | 5 | 2 |
3 | Adam | 4 | 3 |
Understanding the Recursion
Recursive queries essentially work in rounds:
- First Round: Finds Level 0 (top managers).
- Second Round: Looks for employees reporting to Level 0 and assigns them Level 1.
- Third Round: Fetches employees reporting to Level 1 and assigns them Level 2.
- And So On: Continues until no more matches are found.
This looping mechanism ensures the hierarchy is thoroughly explored without additional logic.
Why Recursive Queries Are Powerful
A self-join is perfect for a single layer of reporting. However, for multi-level hierarchies like family trees or nested relationships, recursion is your go-to tool. Adding hierarchy levels dynamically without manual effort saves time and makes your queries scalable.
Conclusion
With the recursive query approach, you can efficiently map out an entire employee-manager hierarchy. Whether it’s for organizational analysis or preparing for SQL interviews, mastering recursive queries is a valuable skill.
Learn More
For any questions, reach out to learn@knowstar.org.
Consider expanding your expertise with these certifications:
- Google Data Analytics Professional Certificate: https://imp.i384100.net/OR37oQ
- Google Advanced Data Analytics Professional Certificate: https://imp.i384100.net/eK1WmQ
- Best SQL and Data Analytics Books
- T-SQL Fundamentals (By Itzik Ben-Gan) - https://amzn.to/4koKGdC
- Ace the Data Science Interview - https://amzn.to/3D2ne5n