All About Programming: MySQL Handling Duplicates

Use INSERT IGNORE rather than INSERT. If a record doesn't duplicate an existing record, MySQL inserts it as usual. If the record is a duplicate, the IGNORE keyword tells MySQL to discard it silently without generating an error.

Use REPLACE rather than INSERT. If the record is new, it's inserted just as with INSERT. If it's a duplicate, the new record replaces the old one:

Counting and Identifying Duplicates:
mysql> SELECT COUNT(*) as repetitions, last_name, first_name
-> FROM person_tbl
-> GROUP BY last_name, first_name
-> HAVING repetitions > 1;
Eliminating Duplicates from a Query Result:
SELECT DISTINCT last_name, first_name
-> FROM person_tbl
-> ORDER BY last_name;

SELECT last_name, first_name
-> FROM person_tbl
-> GROUP BY (last_name, first_name);
Removing Duplicates Using Table Replacement

mysql> CREATE TABLE tmp SELECT last_name, first_name, sex
    ->                  FROM person_tbl;
    ->                  GROUP BY (last_name, first_name);
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;

An easy way of removing duplicate records from a table is to add an INDEX or PRIMAY KEY to that table. Even if this table is already available, you can use this technique to remove duplicate records and you will be safe in future as well.

mysql> ALTER IGNORE TABLE person_tbl
    -> ADD PRIMARY KEY (last_name, first_name);

Read full article from MySQL Handling Duplicates

MySQL Handling Duplicates

No comments:

Post a Comment

Labels

Popular Posts