There are scenarios where we need to identify continuous streaks of days. For ex - we might be interested to find sales for 'n' consecutive days or trainees who attended for at least n days continuously.
There are two approaches we are going to discuss for this scenario. The first scenario is discussed in the below video tutorial on our YouTube channel. It makes use of the Row_Number and the Min , Max functions.
The second approach is using the Lag Aggregate Function.
The detailed explanation is provided in our latest video tutorial here-
We are going to be working on Employee Attendance data and identify employees who attended for at least 3 days continuously.
We are also going to exclude the weekends for the above scenario.
Below statements can be used to create the practice table and data.
CREATE TABLE [dbo].[Attendance](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Employeeid] [int] NULL,
[LoginDt] [date] NULL
) ;
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (1, 1, '2022-09-01');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (2, 2, '2022-09-01');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (3, 3, '2022-09-01');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (4, 1, '2022-09-02');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (5, 2, '2022-09-02');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (6, 3, '2022-09-02');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (7, 1, '2022-09-05');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (8, 2, '2022-09-05');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (9, 1, '2022-09-06');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (10, 2, '2022-09-06');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (11, 3, '2022-09-06');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (12, 1, '2022-09-07');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (13, 2, '2022-09-07');
INSERT [dbo].[Attendance] ([Id], [Employeeid], [LoginDt]) VALUES (14, 3, '2022-09-07');
SQL Query -
Step 1 - Identify previous date (current row date - 1 day) and Lag Date ( Date on the previous attendance record for the employee when the dataset is sorted by the Login Date.
Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
FROM Attendance
Step 2 -
Make a comparison using Case statement to set the Consecutive Flag to 0 in case of continuous date records else 1.
Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
Case When DateAdd(d, -1, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
Then 0 Else 1 END as Consecutive
FROM Attendance
Step 3 -
Identify unique streaks / batches for each employee. Sum up the Consecutive Flag from above CTE. As the consecutive flag is set to 1 for each new streak, each consecutive batch / streak number will be incremented by 1 (for each employee).
With ConsecutiveCTE as
(Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
Case When DateAdd(d, -1, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
Then 0 Else 1 END as Consecutive
FROM Attendance)
Select *, SUM(Consecutive) OVER(Partition by EmployeeID Order by LoginDt) as BatchNo
FROM ConsecutiveCTE
Step 4 -
Use the Row_Number function to generate a rank for each consecutive day of the batch. It will help us calculate the total number of consecutive days in each batch (streak length).
With ConsecutiveCTE as
(Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
Case When DateAdd(d, -1, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
Then 0 Else 1 END as Consecutive
FROM Attendance)
, BatchCTE as
(Select *, SUM(Consecutive) OVER(Partition by EmployeeID Order by LoginDt) as BatchNo
FROM ConsecutiveCTE)
Select *,
Row_Number() OVER (Partition by EmployeeID, BatchNo Order by LoginDt ) as BatchCnt
From BatchCTE
Step 5 - Reverse the sort order on LoginDt so the maximum numbered row is generated on the record with Consecutive flag set to 1.
With ConsecutiveCTE as
(Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
Case When DateAdd(d, -1, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
Then 0 Else 1 END as Consecutive
FROM Attendance)
, BatchCTE as
(Select *, SUM(Consecutive) OVER(Partition by EmployeeID Order by LoginDt) as BatchNo
FROM ConsecutiveCTE)
Select *,
Row_Number() OVER (Partition by EmployeeID, BatchNo Order by LoginDt Desc ) as BatchCnt
From BatchCTE
Step 6 - Filter on the Consecutive Flag and Batch Count to get the required output.
With ConsecutiveCTE as
(Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
Case When DateAdd(d, -1, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
Then 0 Else 1 END as Consecutive
FROM Attendance)
, BatchCTE as
(Select *, SUM(Consecutive) OVER(Partition by EmployeeID Order by LoginDt) as BatchNo
FROM ConsecutiveCTE)
, BatchCntCTE as
(Select *,
Row_Number() OVER (Partition by EmployeeID, BatchNo Order by LoginDt Desc) as BatchCnt
From BatchCTE)
Select *
from BatchCntCTE
WHERE Consecutive = 1 and BatchCnt >= 3
Excluding Weekends -
Add a condition to the Case statement to check if the date of the current LoginDt is a Monday and the LagDt is 3 days before it (previous Friday), then mark the record as Consecutive.
The Final Query is as below -
With ConsecutiveCTE as
(Select EmployeeID,
LoginDt, DateAdd(d, -1, LoginDt) as PrevDt,
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt) as LagDt ,
Case When DateAdd(d, -1, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
OR datename(dw, LoginDt) = 'Monday' and DateAdd(d, -3, LoginDt) =
LAG(LoginDt) OVER (Partition by EmployeeID Order by LoginDt)
Then 0 Else 1 END as Consecutive
FROM Attendance)
, BatchCTE as
(Select *, SUM(Consecutive) OVER(Partition by EmployeeID Order by LoginDt) as BatchNo
FROM ConsecutiveCTE)
, BatchCntCTE as
(Select *,
Row_Number() OVER (Partition by EmployeeID, BatchNo Order by LoginDt Desc) as BatchCnt
From BatchCTE)
Select *
from BatchCntCTE
WHERE Consecutive = 1 and BatchCnt >= 3