It is the most commonly asked SQL Interview Question.
It is because it tests two key skills for framing SQL Complex queries - Subquery and CTEs.
In this tutorial, we are going to discuss two methods to delete duplicates from a table.
The detailed video tutorial is available here -
First, lets create the sample data which we will use for this example.
The below SQL statements will create he dbo.Employee table with duplicates in columns FirstName + LastName.
CREATE TABLE dbo.Employee
(
EmployeeID INT ,
FirstName varchar(50) ,
LastName varchar(50) ,
Phone varchar(20) ,
Email varchar(50)
);
INSERT INTO dbo.Employee VALUES
(1, 'Adam', 'Owens', '444345999' , 'adam@demo.com'),
(2, 'Mark', 'Wilis', '245666921' , 'mark@demo.com'),
(3, 'Natasha', 'Lee', '321888909' , 'natasha@demo.com'),
(4, 'Adam', 'Owens', '444345999' , 'adam@demo.com'),
(5, 'Riley', 'Jones', '123345959' , 'riley@demo.com'),
(6, 'Natasha', 'Lee', '321888909' , 'natasha@demo.com');
Solution -
1) Using a Sub-Query
Hints - a) Group by data in the table by FirstName and LastName
b) Choose only the max EmployeeID from that group.
For ex - In case of Adam Owens, there are two records with Employee IDs 1 and 4.
Our query will choose only the record with EmployeeID = 4
In case, there is only a single record in the table, the same record will be fetched in the output.
SQL Query -
Delete from dbo.Employee
WHERE EmployeeID NOT IN
(Select Max(EmployeeID) from dbo.Employee
GROUP BY FirstName, LastName);
2) Using CTE
Hints - a) Create a CTE and use the RANK function to order the records for each Employee (First Name and Last Name) by EmployeeID.
b) Delete all records from the above CTE for which RANK > 1 (duplicate records)
SQL Query -
With Employee_CTE as
(Select * ,
RANK() OVER (Partition by FirstName, LastName ORDER BY EmployeeID desc ) as Rank
FROM dbo.Employee)
Delete from Employee_CTE
WHERE Rank > 1
This was really helpful
ReplyDeleterank and dense rank will not work if duplicate records are 1>
ReplyDeleteselect employeeid,firstname,lastname,phone,email from Employee group by firstname;
ReplyDeleteThis is very help ful for under stand the way you are teaching thanks a lot...
ReplyDeletePlease keep it up...
Great one. Thanks for that. But my opinion is 2nd one will work but wont have any impact in table
ReplyDelete