Freeware database cleanser

hirop

GIGO
Jun 24, 2009
Figured I'd share because it's saved me some time working out how to merge three inaccurate databases with a total of ~300k rows.

SQL Power - DQguru Data Cleansing & MDM Tool

Long story short, I've got a few datafeeds with duplicate UPCs and outright wrong UPCs. DQguru makes it very easy to strain out the dupes. I'm working on a workflow that'll help me correct the wrong ones, hopefully by fuzzy-comparing (UPC, product name) pairs against a known-good DB of (UPC, product name). First I need to find that known-good DB, or maybe build it via the Amazon API.
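For anyone curious what the fuzzy-compare step might look like, here's a rough sketch in Python using only the standard library's difflib. The function names, the 0.6 threshold, and the dict-shaped known-good DB are all my own assumptions for illustration. It's not how DQguru does its matching, and the threshold would need tuning on real feed data.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Lowercase and collapse whitespace so trivial formatting
    # differences don't drag the similarity score down.
    return " ".join(name.lower().split())


def name_similarity(a, b):
    # Similarity ratio in [0, 1] between two normalized product names.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def check_rows(feed_rows, known_good, threshold=0.6):
    """Label each (upc, name) feed row against a known-good {upc: name} dict.

    The threshold is a hypothetical starting point, not a tested value.
    """
    results = []
    for upc, name in feed_rows:
        ref = known_good.get(upc)
        if ref is None:
            # UPC isn't in the reference DB at all, so nothing to compare.
            results.append((upc, name, "unknown_upc", 0.0))
            continue
        score = name_similarity(name, ref)
        status = "ok" if score >= threshold else "suspect"
        results.append((upc, name, status, score))
    return results


if __name__ == "__main__":
    # Made-up example data.
    known_good = {"012345678905": "Acme Widget Blue 12oz"}
    feed = [
        ("012345678905", "ACME widget  blue 12 oz"),
        ("012345678905", "Garden Hose 50ft"),
        ("999999999999", "Mystery Item"),
    ]
    for row in check_rows(feed, known_good):
        print(row)
```

Rows flagged "suspect" would still want a manual look; the score only narrows down where to look.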

PS: Has anyone ever built a 100k+ page WordPress site? I never expected the permalink-to-post-ID translation to be a bottleneck. Any other gotchas to look out for?
 


So it's for people who can't write a SQL statement...

SQL statements can't tell me which of the 10 products with the same UPC is the correct one. This thing makes it a lot faster to manually review and blam the inaccurate ones. I was going to write a Python script to do it, then I found this. Regexp munging is handy too.
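For reference, the Python script I had in mind would have been something along these lines: group the feed rows by UPC and pull out the groups where one UPC maps to more than one distinct product name, so a human can pick the right one. This is only a sketch; the function name and the (upc, name) row shape are assumptions, and it does nothing to decide which name is correct.

```python
from collections import defaultdict


def conflicting_upcs(rows):
    """Return {upc: [names]} for UPCs that appear with more than one distinct name.

    rows is an iterable of (upc, name) pairs; this shape is assumed.
    """
    groups = defaultdict(list)
    for upc, name in rows:
        groups[upc.strip()].append(name)
    # Exact repeats of the same name are easy to dedupe automatically;
    # only groups with differing names need a manual review.
    return {
        upc: names
        for upc, names in groups.items()
        if len(set(names)) > 1
    }


if __name__ == "__main__":
    # Made-up example data.
    rows = [
        ("111", "Blue Widget"),
        ("111", "Garden Hose"),
        ("222", "Red Widget"),
        ("222", "Red Widget"),
        ("333", "Green Widget"),
    ]
    for upc, names in conflicting_upcs(rows).items():
        print(upc, names)
```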