Home »
SQL
SQL query to delete duplicate records
In this article, we are going to learn about duplicate records and the process to find and delete them from the database.
Submitted by Manu Jemini, on March 11, 2018
If the database is not managed properly, then the records in the table can be duplicated, there is no doubt about it. Therefore, the need to delete the records arises, which can be little messy.
Let’s understand what it takes to delete the duplicate data of a table. The First we have to do is to find out the fields that can have the duplicate records. What it means is that, if we have an idea about which fields contain the duplicate data we can work on that more easily.
After that, all we need to do is to select the distinct data. In the example given below, the duplicate items have been selected with the help of id. What it means is that we group records according to the name and then delete them.
Now, deleting duplicate records is not a good thing for the database as we don’t know if the id of any of the records were used in some other record. This can cause an enormous amount of pressure if the loose track of any record which has been used somewhere else. Therefore always take care of these things beforehand.
Table (employee) having three fields and four data in it,
Id Name Address
100 Aman Mumbai
200 Arun Pune
300 Karan Delhi
400 Aman Mumbai
In above table, we have a duplicate entry of name "Aman" where the email is also same, now we are going to create a query to delete the duplicate row in the database and then select all records to show them to find out the update work properly.
Query:
DELETE FROM employee
WHERE id not in
(select min(id) FROM employee GROUP BY name, address);
Display table records.
SELECT * FROM employee;
Output: