Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 compare 2 address fields in two tables

Author  Topic 

cardgunner

326 Posts

Posted - 2007-06-21 : 15:52:32
I have two Tables and I need to find which ones have the same address. And from those which are the same company.

First table is QRY9962 and the Other is COM130
I can't change the information in COM130 or any of it's linked tables because I don't have that authority. However the data in COM130 is terrible as far as addresses go.

So I need to replace the mistyped info with the correct info and then comapre that to the info in QRY9962.

Once I have established that I have a match I need to update QRY9962 with it's BPID. That I do have authority to do.

Here is my mess:

SELECT EDACAP.BUYID,
EXT.BPID,
convert(varchar(30),EDACAP.BUYCOMP) as 'BUYCOMP',
Convert(varchar(30),EXT.BPNM) as 'BPNM',
Convert(varchar(25),EXT.ADD1) as 'ADD1',
EDACAP.BUYPHONE,
EXT.PHNE,
EXT.ZIP
from
(select distinct
BUYID AS 'BUYID',
ISNULL(BUYCOMP1, BUYC1FIRST + ' ' + BUYC1LAST) AS 'BUYCOMP',
upper(REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(BUYADR1,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
'DR.','DR'),'RD.','RD'),'ST.','ST'))AS 'ADR1',
BUYZIP AS 'BUYZIP',
BUYPHONE AS 'BUYPHONE'
FROM QRY9962)EDACAP
JOIN
(SELECT REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(130.namc,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
'DR.','DR'),'RD.','RD'),'ST.','ST') as 'ADD1',
130.pstc as 'ZIP',
112.itbp as 'BPID',
100.nama as 'BPNM',
130.telp as 'PHNE'
from COM100 as 100
join COM130 as 130
on 100.cadr=130.cadr
join COM112 as 112
on 100.bpid=112.itbp)EXT
ON upper(REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(ADR1,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
'DR.','DR'),'RD.','RD'),'ST.','ST'))=REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(EXT.ADD1,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
'DR.','DR'),'RD.','RD'),'ST.','ST')AND
EDACAP.BUYZIP=EXT.ZIP
ORDER BY EXT.ADD1

No it takes about 2 1/2 minutes to rin and I get back about 1200 records. And I have been able to update a record at a time.

update QRY9962
set BPID='######'
where BUYID="&&&&&&'

This will take me too long.

Q1) Is there a better way write the SQL comparing the tables?

Q2) Once I have established the best SQL how do I incorporate those results as my criteria for the update.

Where as I can do small chunks and say from the results of QRYvsCOM between 1 and 200 and not in ('@@@', '###','%%%','***')update QRY9962 set QRY9962.BPID=COM130.BPID

I wish I knew how to present my questions better. I apologize if it's not correct language.



Card Gunner

nr
SQLTeam MVY

12543 Posts

Posted - 2007-06-22 : 05:06:03
You would be better off creating temp tables of the addresses and manipulating them into a standard format there. Do the matching, get the set of IDs (and duplicates) into a table then do the update from that table.
I assume you have a small clustered inde on the table you are updating so use that in the temp tables to identify the rows.
Also consider keeping the match in a different table so that you don't update any of the tables but delete and insert rows into this conjoint table.
Are you matching customers or addresses? If it's customers then you probably should take into account historic addresses i.e. keep the match when they move address.

Just finished my 4th address matching routine for different companies and they have all had different requirements.
In the UK we have a PAF database which gives most of the addresses in the UK and it is best to match to that - don't know if you have something simiar in the US if that's where you are.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

cardgunner

326 Posts

Posted - 2007-06-22 : 09:10:33
Well I don't know how to create a temp table but I'm willing to learn. I looked up temporary table on BOL and got this

When using bcp or BULK INSERT to bulk copy data using a global temporary table, the table name must be specified at the command prompt, including initial number signs (##). For example, to bulk copy data from the global temporary table ##temp_authors to the Temp_authors.txt data file, execute at the command prompt:

bcp ##temp_authors out temp_authors.txt -c -Sservername -Usa -Ppassword

However, do not specify the database name when using global temporary tables because temporary tables exist only in tempdb. It is possible to use a local temporary table (for example, #temp_authors) only when bulk copying data using the BULK INSERT statement.

That doen't tell me a whole lot on how to set opne up. I'm searching the forum to see if I can get bits and pieces.

Is what you are saying is Create 3 temp tables
Temp1 is COM130 with all the addresses correct
temp2 is QRY with all of the addresses correct
temp3 is Temp 1 and Temp 2 where the adresses match
Then update the BPID from Temp3 to QRY where a BPID is not null or something like that.

Seems like it will work. I use to do the something similar in MS Access but it took all day to do.

I will look to see if I can figure out the temp table thing. Thanks.

Card Gunner
Go to Top of Page

cardgunner

326 Posts

Posted - 2007-06-22 : 09:16:52
Also,

After comparing ADD1, I need to compare ADR2, and then phone and then name. I'm assuming that I would make any or all necessary corrections to those fields when I create Temp1.

Which then I'm going to have to do an append to Temp3???? for the address2 matches, for the phone matches, for the name matches,

This is where I'm going to get lost. Actuaaly I should have got lost when i "volunteered" for this job.

Card Gunner
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-06-22 : 10:35:51
Can you show us some sample records you are trying to match? A fuzzy logic algorithm may be appropriate.

e4 d5 xd5 Nf6
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2007-06-22 : 11:07:24
consider this

flat 10|johnson court|43 mullholland drive|mytown|mycity
10|johnson court|43 mullholland drive|mytown|mycity
10|43 mullholland drive|mytown|mycity
10 johnson court|43 mullholland drive|mytown|mycity
flt 10|johnson court|43 mullholland drive|mytown|mycity
flt 10|johnson ct|43 mullholland drive|mytown|mycity
flt 10|johnson ct|43 mullholland drv|mytown|mycity
10 johnson ct|43 mullholland drv|mytown
10/43 mullholland drv|mytown

If you are going to match on phone numbers then be careful about communal phones and mobiles/land lines.

It's not a simple job and it's neverending - you will have to look at the data and find things that should match and cater for them. Get the business to tell you the things they want to include as a match then go back to them and show what else will match given that criteria.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

cardgunner

326 Posts

Posted - 2007-06-22 : 12:10:02
You are correct nr in your examples.

I have only one suggestion and I'm trying to get the right syntax and that is to create temp tables. This will improve the performance and give me an abilty to update with a simple where statement.

However up to today I haven't ever used a temp table and I'm learning. I think I got it. And in theory it should uncomplicate things, I hope. Be this it will also speed up the performance. My dilema in the past when I was using MS Access was this was a 10 step process. I wasn't schooled in that either but I made do.

Card Gunner
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-06-22 : 12:28:08
This function may be useful in identifying matches:
http://sqlblindman.googlepages.com/fuzzysearchalgorithm

For example:
declare	@TempTable table (AddressString varchar(60))

insert into @TempTable
(AddressString)
select 'flat 10' + ' johnson court' + ' 43 mullholland drive' + ' mytown' + ' mycity'
Union select '10' + ' johnson court' + ' 43 mullholland drive' + ' mytown' + ' mycity'
Union select '10' + ' 43 mullholland drive' + ' mytown' + ' mycity'
Union select '42 west foobar' + ' nowheresville'
Union select '116 grandview ct, elletesville'
Union select '1234 cold ave, gnome'
Union select '65 bliss rd, pleasantville'
Union select '42 w, foo-bar' + ' nowheresville'
Union select '116 grandvue court, elletesville'
Union select '1234 cold avenue, nome'
Union select '65 bliss roadd, pleasantville'

select T1.AddressString,
T2.AddressString,
dbo.CompareText(T1.AddressString, T2.AddressString) as Similarity
from @TempTable T1
inner join @TempTable T2 on T1.AddressString < T2.AddressString
where dbo.CompareText(T1.AddressString, T2.AddressString) > 50
order by T1.AddressString,
Similarity desc

Results:
AddressString                                                AddressString                                                Similarity  
------------------------------------------------------------ ------------------------------------------------------------ -----------
10 43 mullholland drive mytown mycity 10 johnson court 43 mullholland drive mytown mycity 84
10 43 mullholland drive mytown mycity flat 10 johnson court 43 mullholland drive mytown mycity 81
10 johnson court 43 mullholland drive mytown mycity flat 10 johnson court 43 mullholland drive mytown mycity 97
116 grandview ct, elletesville 116 grandvue court, elletesville 81
1234 cold ave, gnome 1234 cold avenue, nome 85
42 w, foo-bar nowheresville 42 west foobar nowheresville 84
65 bliss rd, pleasantville 65 bliss roadd, pleasantville 90


e4 d5 xd5 Nf6
Go to Top of Page

cardgunner

326 Posts

Posted - 2007-06-22 : 12:37:13
I'm not sure what you wrote, so I doubt if it would help. I'm sorry.

However does replace(ADR1,'P.O.', 'PO') work in a temp table?

Card Gunner
Go to Top of Page

cardgunner

326 Posts

Posted - 2007-06-22 : 12:49:41
Forget it. I was asking upper(replace(add1, ROAD, RD)) and the info in the table was 1 Ferry Road. I thought it converted to upper bewfore it looked. I wonder replace(upper(addd1). Yep that worked.

Card Gunner
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-06-22 : 13:40:30
quote:
Originally posted by cardgunner

I'm not sure what you wrote, so I doubt if it would help. I'm sorry.

Then it probably wouldn't.

e4 d5 xd5 Nf6
Go to Top of Page
   

- Advertisement -