Home »
SQL
SQL query to find duplicate records
In this article, we are going to learn about to find duplicate records in database using SQL Query and then create 2 to 3 query to take out the duplicate record and resolve the problem.
Submitted by Manu Jemini, on March 11, 2018
A Table in a relational-database can have multiple fields for example id, name and address. Now, there are cases when we want to have one or more fields which can be duplicated.
This can be understood as the name of person or address of two persons having same or different name in the table. Now, finding these records can be a little tricky as we don’t have any key which can be worked for all the records in the table.
SQL language gives certain functions which should be used when doing these sorts of operations. One of them is count, which is used to count the number of records which are selected by the query.
Having said that, one problem arises that how can we use count to differentiate between duplicate and unique records. Well the answer is very simple, all we need to do is to group them by there corresponding fields which are prone to be duplicated.
Now in the end we will have number of records which are arranged with their corresponding field or fields. Then, the last thing we have to do is to filter out the records which have there counter smaller or equal to zero.
In the end, we will be left with the record whose field will be duplicated.
Table (employee) having three fields and four data in it.
Id Name Address
100 Aman Mumbai
200 Arun Pune
300 Karan Delhi
400 Aman Mumbai
Here, in above table we have a duplicate entry of name "Aman" where the email is also same, now we need to create a query to find the duplicate field in database.
Query 1:
SELECT name, email, COUNT(name)
FROM employee
GROUP BY name, email
HAVING COUNT(name)>1;
Output:
Query 2:
SELECT name, COUNT(name)
FROM employee
GROUP BY name, address
HAVING COUNT(name)>1;
Output:
Query 3:
SELECT address, COUNT(address)
FROM employee
GROUP BY name, address
HAVING COUNT(address)>1;
Output: