Finding Duplicates using SQL

While migrating old data to a new database schema, I could not add a PRIMARY KEY on the legacy_customer_id field because it contained duplicates. It turned out that the old application did not sanitize its data well: it allowed duplicates to be created where one of the customer_ids contained a space character that made it unique.

I used the following query to check for other duplicates:

-- List every customer_id that occurs more than once, with its count
SELECT
    customer_id,
    COUNT(customer_id) AS occurrences
FROM
    customers
GROUP BY
    customer_id
HAVING
    COUNT(customer_id) > 1;

This allowed me to find all customer_ids that had duplicates and clean things up.
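
Since the stray space was the root cause here, it can help to group on the trimmed value as well, so that variants differing only by surrounding whitespace show up together. The following is a minimal sketch, not the exact cleanup I ran: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the sample rows are made up for illustration. MySQL's handling of trailing spaces in comparisons depends on the column's collation, so the untrimmed query may behave differently there.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT)")

# Hypothetical sample data: some ids differ only by a leading/trailing space.
conn.executemany(
    "INSERT INTO customers (customer_id) VALUES (?)",
    [("C001",), ("C001 ",), ("C002",), (" C003",), ("C003",), ("C003",)],
)

# Exact-match duplicates: the same query as in the post.
exact_duplicates = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS occurrences
    FROM customers
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    ORDER BY customer_id
    """
).fetchall()

# Duplicates after stripping surrounding spaces with TRIM().
trimmed_duplicates = conn.execute(
    """
    SELECT TRIM(customer_id) AS clean_id, COUNT(*) AS occurrences
    FROM customers
    GROUP BY TRIM(customer_id)
    HAVING COUNT(*) > 1
    ORDER BY clean_id
    """
).fetchall()

print(exact_duplicates)
print(trimmed_duplicates)
```

In this sketch the trimmed query reports more duplicates than the exact one, because the ids with a stray space are folded into their clean counterparts.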

Cleaning duplicate records with SQL

I needed a way to clean a MySQL table of records with duplicate email addresses. To do this, I used a SELECT DISTINCT query inside an INSERT to copy each unique address into a new table.

INSERT INTO
    good_users(email)
        SELECT DISTINCT
            email
        FROM
            bad_users;
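
To see the effect end to end, here is a small sketch of the same INSERT ... SELECT DISTINCT approach. It assumes Python's built-in sqlite3 module with an in-memory database in place of MySQL, and the table definitions and sample addresses are hypothetical; only the table and column names come from the query above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bad_users (email TEXT)")
# A PRIMARY KEY on the target makes any leftover duplicate fail loudly.
conn.execute("CREATE TABLE good_users (email TEXT PRIMARY KEY)")

# Hypothetical sample data containing repeated addresses.
conn.executemany(
    "INSERT INTO bad_users (email) VALUES (?)",
    [
        ("ann@example.com",),
        ("bob@example.com",),
        ("ann@example.com",),
        ("bob@example.com",),
        ("cy@example.com",),
    ],
)

# Copy each distinct email exactly once into the clean table.
conn.execute("INSERT INTO good_users (email) SELECT DISTINCT email FROM bad_users")
conn.commit()

bad_count = conn.execute("SELECT COUNT(*) FROM bad_users").fetchone()[0]
good_emails = [
    row[0] for row in conn.execute("SELECT email FROM good_users ORDER BY email")
]

print(bad_count)
print(good_emails)
```

Note that this copies only the email column; any other columns in bad_users are not carried over, and DISTINCT removes exact matches only.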