Removing Duplicate Rows in MySQL
I know you, programmer. Sometimes even you- the most gifted of all programmers to ever place their hands on a keyboard- makes a mistake. Maybe you have a script that inserts into a database and that script steps on itself and for some unexpected reason your mechanism for making sure that doesn't happen fails. So what do you do when you are left with thousands of duplicate rows in a MySQL database?
Well, you could hire a college intern or someone from Manpower to manually delete all the rows. About a thousand dollars and a week later, you realize that they too can make mistakes, and about twenty important records are missing from your database. Then your boss comes to you and says, "So, Mr. Genius programmer, where did so-and-so's personal account information go?" Damn you Manpower!
That said, here's a quick and easy way to remove duplicates from a MySQL database. Make sure you execute it all at once, and make sure you backup your duplicated garbage table before hand.
INSERT INTO tblTempDoopless(pkID, strField1, strField2)
SELECT pkID, strField1, strField2
FROM tblWithDoops
GROUP BY strField1; /* this is the field that is duplicated! */
DELETE FROM tblWithDoops;
INSERT INTO tblWithDoops(pkID, strField1, strField2)
SELECT pkID, strField1, strField2
FROM tblTempDoopless;
Also, note the group by. Weird how that works. As long as you're not using concat, all the other columns seem to be the first row MySQL finds.