Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Processing duplicate addresses

Author  Topic 

valedaw2
Yak Posting Veteran

50 Posts

Posted - 2008-12-03 : 16:40:37
Hi,
I have a table with lots of addresses and some are obvious duplicates, but with differences, i.e.

1486 SOUTH PAGOSA STREET
1486 S. Pagosa St.
1486 So. Pagosa Street

Is there any method that can be used to know that these are duplicates?

jholovacs
Posting Yak Master

163 Posts

Posted - 2008-12-03 : 17:02:19
eh... SOUNDEX is the only thing that comes to mind, but it wouldn't work in that situation. You could UPDATE records that have the abbreviations to the full name, like St. becomes Street, S. becomes South, etc... a slow operation, but it may be the only avenue (pardon the pun) for you.



SELECT TOP 1
w.[name]
FROM
dbo.women w
INNER JOIN
dbo.inlaws i
ON
i.inlaw_id = w.parent_id
WHERE
i.net_worth > 10000000
AND
i.status IN ('dead', 'dying')
AND
w.husband_id IS NULL
ORDER BY
w.hotness_factor DESC
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2008-12-03 : 17:08:25
Any chance you could convert the others to a uniform format with Update statements?
Go to Top of Page

valedaw2
Yak Posting Veteran

50 Posts

Posted - 2008-12-04 : 08:21:50
I was wondering if hashing or parsing would help? Not quite sure how to use them though.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-12-04 : 08:25:25
seems like this would help
http://sqlblindman.googlepages.com/fuzzysearchalgorithm
Go to Top of Page
   

- Advertisement -