SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 compare 2 address fields in two tables
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

cardgunner
Constraint Violating Yak Guru

USA
324 Posts

Posted - 06/21/2007 :  15:52:32  Show Profile  Reply with Quote
I have two Tables and I need to find which ones have the same address. And from those which are the same company.

First table is QRY9962 and the Other is COM130
I can't change the information in COM130 or any of it's linked tables because I don't have that authority. However the data in COM130 is terrible as far as addresses go.

So I need to replace the mistyped info with the correct info and then comapre that to the info in QRY9962.

Once I have established that I have a match I need to update QRY9962 with it's BPID. That I do have authority to do.

Here is my mess:

SELECT 	EDACAP.BUYID,
	EXT.BPID,
	convert(varchar(30),EDACAP.BUYCOMP) as 'BUYCOMP',
	Convert(varchar(30),EXT.BPNM) as 'BPNM',
	Convert(varchar(25),EXT.ADD1) as 'ADD1',
	EDACAP.BUYPHONE,
	EXT.PHNE,
	EXT.ZIP
from
(select 	distinct
	BUYID AS 'BUYID',
	ISNULL(BUYCOMP1, BUYC1FIRST + ' ' + BUYC1LAST) AS 'BUYCOMP',
	upper(REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(BUYADR1,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
	'DR.','DR'),'RD.','RD'),'ST.','ST'))AS 'ADR1',
	BUYZIP AS 'BUYZIP',
	BUYPHONE AS 'BUYPHONE'
 	FROM QRY9962)EDACAP
JOIN 
(SELECT	REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(130.namc,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
	'DR.','DR'),'RD.','RD'),'ST.','ST') as 'ADD1',
	130.pstc as 'ZIP',
	112.itbp as 'BPID',
	100.nama as 'BPNM',
	130.telp as 'PHNE'
	from COM100 as 100
	join COM130 as 130
	on 100.cadr=130.cadr
	join COM112 as 112
	on 100.bpid=112.itbp)EXT
ON 	upper(REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(ADR1,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
	'DR.','DR'),'RD.','RD'),'ST.','ST'))=REPLACE(REPLACE(REPLACE(Replace(replace(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Replace(Replace(Replace(EXT.ADD1,'P.O. ','PO '),'P.O.BOX','P.O. BOX'),'P. O. ','PO '),'P.O.','PO'),'P O ','PO '),'ROAD','RD'),'STREET','ST'), 'DRIVE','DR'), 'AVENUE', 'AVE'),'AVE.','AVE'),
	'DR.','DR'),'RD.','RD'),'ST.','ST')AND
	EDACAP.BUYZIP=EXT.ZIP
ORDER BY EXT.ADD1

No it takes about 2 1/2 minutes to rin and I get back about 1200 records. And I have been able to update a record at a time.

update QRY9962
set BPID='######'
where BUYID="&&&&&&'

This will take me too long.

Q1) Is there a better way write the SQL comparing the tables?

Q2) Once I have established the best SQL how do I incorporate those results as my criteria for the update.

Where as I can do small chunks and say from the results of QRYvsCOM between 1 and 200 and not in ('@@@', '###','%%%','***')update QRY9962 set QRY9962.BPID=COM130.BPID

I wish I knew how to present my questions better. I apologize if it's not correct language.



Card Gunner

nr
SQLTeam MVY

United Kingdom
12543 Posts

Posted - 06/22/2007 :  05:06:03  Show Profile  Visit nr's Homepage  Reply with Quote
You would be better off creating temp tables of the addresses and manipulating them into a standard format there. Do the matching, get the set of IDs (and duplicates) into a table then do the update from that table.
I assume you have a small clustered inde on the table you are updating so use that in the temp tables to identify the rows.
Also consider keeping the match in a different table so that you don't update any of the tables but delete and insert rows into this conjoint table.
Are you matching customers or addresses? If it's customers then you probably should take into account historic addresses i.e. keep the match when they move address.

Just finished my 4th address matching routine for different companies and they have all had different requirements.
In the UK we have a PAF database which gives most of the addresses in the UK and it is best to match to that - don't know if you have something simiar in the US if that's where you are.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

cardgunner
Constraint Violating Yak Guru

USA
324 Posts

Posted - 06/22/2007 :  09:10:33  Show Profile  Reply with Quote
Well I don't know how to create a temp table but I'm willing to learn. I looked up temporary table on BOL and got this

When using bcp or BULK INSERT to bulk copy data using a global temporary table, the table name must be specified at the command prompt, including initial number signs (##). For example, to bulk copy data from the global temporary table ##temp_authors to the Temp_authors.txt data file, execute at the command prompt:

bcp ##temp_authors out temp_authors.txt -c -Sservername -Usa -Ppassword

However, do not specify the database name when using global temporary tables because temporary tables exist only in tempdb. It is possible to use a local temporary table (for example, #temp_authors) only when bulk copying data using the BULK INSERT statement.

That doen't tell me a whole lot on how to set opne up. I'm searching the forum to see if I can get bits and pieces.

Is what you are saying is Create 3 temp tables
Temp1 is COM130 with all the addresses correct
temp2 is QRY with all of the addresses correct
temp3 is Temp 1 and Temp 2 where the adresses match
Then update the BPID from Temp3 to QRY where a BPID is not null or something like that.

Seems like it will work. I use to do the something similar in MS Access but it took all day to do.

I will look to see if I can figure out the temp table thing. Thanks.

Card Gunner

Edited by - cardgunner on 06/22/2007 09:17:23
Go to Top of Page

cardgunner
Constraint Violating Yak Guru

USA
324 Posts

Posted - 06/22/2007 :  09:16:52  Show Profile  Reply with Quote
Also,

After comparing ADD1, I need to compare ADR2, and then phone and then name. I'm assuming that I would make any or all necessary corrections to those fields when I create Temp1.

Which then I'm going to have to do an append to Temp3???? for the address2 matches, for the phone matches, for the name matches,

This is where I'm going to get lost. Actuaaly I should have got lost when i "volunteered" for this job.

Card Gunner
Go to Top of Page

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 06/22/2007 :  10:35:51  Show Profile  Reply with Quote
Can you show us some sample records you are trying to match? A fuzzy logic algorithm may be appropriate.

e4 d5 xd5 Nf6
Go to Top of Page

nr
SQLTeam MVY

United Kingdom
12543 Posts

Posted - 06/22/2007 :  11:07:24  Show Profile  Visit nr's Homepage  Reply with Quote
consider this

flat 10|johnson court|43 mullholland drive|mytown|mycity
10|johnson court|43 mullholland drive|mytown|mycity
10|43 mullholland drive|mytown|mycity
10 johnson court|43 mullholland drive|mytown|mycity
flt 10|johnson court|43 mullholland drive|mytown|mycity
flt 10|johnson ct|43 mullholland drive|mytown|mycity
flt 10|johnson ct|43 mullholland drv|mytown|mycity
10 johnson ct|43 mullholland drv|mytown
10/43 mullholland drv|mytown

If you are going to match on phone numbers then be careful about communal phones and mobiles/land lines.

It's not a simple job and it's neverending - you will have to look at the data and find things that should match and cater for them. Get the business to tell you the things they want to include as a match then go back to them and show what else will match given that criteria.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

cardgunner
Constraint Violating Yak Guru

USA
324 Posts

Posted - 06/22/2007 :  12:10:02  Show Profile  Reply with Quote
You are correct nr in your examples.

I have only one suggestion and I'm trying to get the right syntax and that is to create temp tables. This will improve the performance and give me an abilty to update with a simple where statement.

However up to today I haven't ever used a temp table and I'm learning. I think I got it. And in theory it should uncomplicate things, I hope. Be this it will also speed up the performance. My dilema in the past when I was using MS Access was this was a 10 step process. I wasn't schooled in that either but I made do.

Card Gunner
Go to Top of Page

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 06/22/2007 :  12:28:08  Show Profile  Reply with Quote
This function may be useful in identifying matches:
http://sqlblindman.googlepages.com/fuzzysearchalgorithm

For example:
declare	@TempTable table (AddressString varchar(60))

insert into @TempTable
	(AddressString)
select	'flat 10' + ' johnson court' + ' 43 mullholland drive' + ' mytown' + ' mycity'
Union select '10' + ' johnson court' + ' 43 mullholland drive' + ' mytown' + ' mycity'
Union select '10' + ' 43 mullholland drive' + ' mytown' + ' mycity'
Union select '42 west foobar' + ' nowheresville'
Union select '116 grandview ct, elletesville'
Union select '1234 cold ave, gnome'
Union select '65 bliss rd, pleasantville'
Union select '42 w, foo-bar' + ' nowheresville'
Union select '116 grandvue court, elletesville'
Union select '1234 cold avenue, nome'
Union select '65 bliss roadd, pleasantville'

select	T1.AddressString,
	T2.AddressString,
	dbo.CompareText(T1.AddressString, T2.AddressString) as Similarity
from	@TempTable T1
	inner join @TempTable T2 on T1.AddressString < T2.AddressString
where	dbo.CompareText(T1.AddressString, T2.AddressString) > 50
order by T1.AddressString,
	Similarity desc

Results:
AddressString                                                AddressString                                                Similarity  
------------------------------------------------------------ ------------------------------------------------------------ ----------- 
10 43 mullholland drive mytown mycity                        10 johnson court 43 mullholland drive mytown mycity          84
10 43 mullholland drive mytown mycity                        flat 10 johnson court 43 mullholland drive mytown mycity     81
10 johnson court 43 mullholland drive mytown mycity          flat 10 johnson court 43 mullholland drive mytown mycity     97
116 grandview ct, elletesville                               116 grandvue court, elletesville                             81
1234 cold ave, gnome                                         1234 cold avenue, nome                                       85
42 w, foo-bar nowheresville                                  42 west foobar nowheresville                                 84
65 bliss rd, pleasantville                                   65 bliss roadd, pleasantville                                90


e4 d5 xd5 Nf6
Go to Top of Page

cardgunner
Constraint Violating Yak Guru

USA
324 Posts

Posted - 06/22/2007 :  12:37:13  Show Profile  Reply with Quote
I'm not sure what you wrote, so I doubt if it would help. I'm sorry.

However does replace(ADR1,'P.O.', 'PO') work in a temp table?

Card Gunner
Go to Top of Page

cardgunner
Constraint Violating Yak Guru

USA
324 Posts

Posted - 06/22/2007 :  12:49:41  Show Profile  Reply with Quote
Forget it. I was asking upper(replace(add1, ROAD, RD)) and the info in the table was 1 Ferry Road. I thought it converted to upper bewfore it looked. I wonder replace(upper(addd1). Yep that worked.

Card Gunner
Go to Top of Page

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 06/22/2007 :  13:40:30  Show Profile  Reply with Quote
quote:
Originally posted by cardgunner

I'm not sure what you wrote, so I doubt if it would help. I'm sorry.

Then it probably wouldn't.

e4 d5 xd5 Nf6
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.12 seconds. Powered By: Snitz Forums 2000