jbasso
Starting Member
USA
2 Posts
Posted - 06/21/2012 : 18:26:00
I am trying to clean up a table that has 500,000 entries. The table consists of data where the entries were saved via an import program written in C++ which I have no control over.
The problem is that the program creates near-duplicate entries whenever the name or address field is not exactly like an existing entry.
Sample:

ID     SubID  Name               Address
12345  X001   Green, Robert Jr   123 N. First Street
12345  X002   Green, Robert Jr.  123 N. First Street
12345  X003   Green, Robert Jr.  123 N 1st Street
12345  X004   Green, Robert      456 S. 3rd Street
12345  X005   Green, Janice      456 S. 3rd Street
It has to do with how the data is entered in the file that is imported, which again I have no control over.
I'm not sure whether I posted in the wrong area or whether nobody has been able to answer this yet, so I am re-posting it here.
What is my best option to report/cleanup so a query would return the truly unique records, such as:

ID     SubID  Name               Address
12345  X001   Green, Robert Jr   123 N. First Street
12345  X004   Green, Robert      456 S. 3rd Street
12345  X005   Green, Janice      456 S. 3rd Street
Any one of the three matching rows in the sample (X001, X002, or X003) would be acceptable to return.
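To pin down what "truly unique" means for the sample, here is a rough Python sketch of the matching rule I have in mind. It is an illustration, not a finished solution. It assumes two rows match when they share an ID and their name and address are equal after lowercasing, dropping punctuation, and rewriting spelled-out ordinals ("First" to "1st"). The names `normalize`, `dedupe`, and `ORDINALS` are made up for this example, and the ordinal map is a guess that real data would need to extend.

```python
import re

# Hypothetical map of spelled-out ordinals to their short form.
# This is an assumption for the sample; real data would need a longer list.
ORDINALS = {"first": "1st", "second": "2nd", "third": "3rd"}

def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace, unify ordinal words."""
    words = re.sub(r"[^\w\s]", "", text.lower()).split()
    return " ".join(ORDINALS.get(word, word) for word in words)

def dedupe(rows):
    """Keep the lowest SubID per (ID, normalized name, normalized address)."""
    best = {}
    for row in rows:
        rec_id, sub_id, name, address = row
        key = (rec_id, normalize(name), normalize(address))
        if key not in best or sub_id < best[key][1]:
            best[key] = row
    return sorted(best.values(), key=lambda r: (r[0], r[1]))

# The five sample rows from this post.
rows = [
    ("12345", "X001", "Green, Robert Jr", "123 N. First Street"),
    ("12345", "X002", "Green, Robert Jr.", "123 N. First Street"),
    ("12345", "X003", "Green, Robert Jr.", "123 N 1st Street"),
    ("12345", "X004", "Green, Robert", "456 S. 3rd Street"),
    ("12345", "X005", "Green, Janice", "456 S. 3rd Street"),
]

for row in dedupe(rows):
    print(row)
```

On the sample this keeps X001, X004, and X005. The same idea should carry over to a query: compute a normalized name and address per row, group on those together with ID, and return the row with the lowest SubID from each group.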
Thanks in advance for any help.