Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
bwol
Starting Member
19 Posts |
Posted - 2008-06-04 : 14:04:50
|
| Good Afternoon,I am new to MSSQL and am trying to write a complicated SQL statement that I'm having trouble with. Any help that anyone can offer is much appreciated!Here is the problem I am tackling: I have a list of about 5,000 members of our organization stored in the MemberList MSSQL table. I have a separate MSSQL table (CityList) that has approximately 500,000 resident of a city. I am trying to find matches between MemberList and CityList for the purposes of figuring out which of our members are registered voters. The tricky part of this problem, is that there is no unique ID (such as a social security number) that is present in each list. Accordingly, I have decided to created several types of matches:1. NameDOBMatch: Where the FirstName, LastName & DOB fields in MemberList table match the same fields in CityList table.2. NameAddressMatch: Where the LastName, FirstName & Address fields in the MemberList table match the same fields in the CityList table.3. DoubleMatch: A combination of the first two matches (i.e. where the LastName, FirstName, DOB, & Address fields in the MemberList table match the same fields in the CityList table).My goal is to "loop" through the MemberList and CityList tables and to add a new row to a third MSSQL table (MemberMatch) each time one of the aforementioned matches is found. The MemberMatch table has the following fields: 1. MatchID (key)2. MemberID (Unique ID of member from MemberList table)3. ResidentID (Unqiue ID of matching member from CityList table).4. MatchType (value of NameDOBMatch, NameAddressMatch or DoubleMatch, depending on match type).If anyone could help me create an SQL statement that would accomplish that, I would very much appreciate it!Thanks,Bryan |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-04 : 14:22:58
|
something on the lines ofINSERT INTO MemberMatch (MemberID,ResidentID,MatchType)SELECT m.MemberID,c.ResidentID,CASE WHEN c.DOB=m.DOB AND c.Address<>m.Address THEN 'NameDOBMatch' WHEN c.Address=m.Address AND c.DOB<>m.DOB THEN 'NameAddressMatch' ELSE 'DoubleMatch'ENDFROM MemberList mINNER JOIN CityList cON m.FirstName=c.FirstNameAND m.LastName=c.LastNameWHERE c.DOB=m.DOBOR c.Address=m.Address |
 |
|
|
bwol
Starting Member
19 Posts |
Posted - 2008-06-04 : 15:17:10
|
| Thanks so much for the very quick and helpful reply!!I have one quick follow-up question. Given the large amount of data in my tables, I am interested in testing this script using only a small number of members (i.e. only a few rows from the MemberList table).How could I modify this SQL statement such that I could specify a few MemberIDs (the key for the MemberList table) and only run the script for those individuals. Thanks Again!-Bryan |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2008-06-04 : 15:18:55
|
| Add this condition to the WHERE clause.AND m.MemeberID IN ('user1', 'user2', user'3, ...)Tara KizerMicrosoft MVP for Windows Server System - SQL Serverhttp://weblogs.sqlteam.com/tarad/Database maintenance routines:http://weblogs.sqlteam.com/tarad/archive/2004/07/02/1705.aspx |
 |
|
|
bwol
Starting Member
19 Posts |
Posted - 2008-06-04 : 17:40:42
|
| Thanks again to both of you for your help. I created the SQL statement below based on your advice.You will notice a few changes. Specifically, some of the table and field names are changed since I tried to use intuitive table and field names in my first post (to avoid any confusion), instead of the actual names (which make sense to me but might be confusing to someone else). I also made one substantive change. I decided not to compare full addresses to full addresses, since the use of different spacing and abbreviations results in address that are not exactly the same (i.e. not identical strings), but actually are the same address (i.e. "123 Main Street, Apartment 4, Boston MA 02111" & "123 Main St. Apt#4, Boston MA, 02111).Accordingly, I decided that an address match occurs when the following three fields, which are included in both the BostonResidents table and the Members Table, all match: StreetName, StreetNumber and Zip.I have two questions about my updated procedure:1. Do the changes I have made look correct?2. The StreetName field in the Members table has values such as "Main" "Washington" & "South", while the StreetName field in the BostonResidents table has values such as "Main St", "Washington Ave" & "South Rd."It seems to me that in when I am comparing m.StreetName to b.StreetName, I want to be using "like" and "not like" (with a % character) instead of "=" and "<>." However, in the examples I have been able to find, the "like" condition is used to compare something to a string and not to a field. It is possible to use "like" to compare two field names, and if so how? Thanks again for the help!-Bryan Update ProcedureCREATE PROCEDURE dbo.MemberMatch AS INSERT INTO MemberBRLMatches (TIMSSID,ResidentID,MatchType)SELECT m.TIMSSID,b.ResidentID,CASE WHEN b.DOB=m.DOB AND (b.StreetNumber<>m.StreetNumber OR b.StreetName<>m.StreetName OR b.Zip<>m.Zip) THEN 'NameDOBMatch' WHEN (b.StreetNumber=m.StreetNumber AND b.StreetName=m.StreetName AND b.Zip=m.Zip) AND b.DOB<>m.DOB THEN 'NameAddressMatch' ELSE 'DoubleMatch'ENDFROM Members mINNER JOIN BostonResidents bON m.FirstName=b.FirstNameAND m.LastName=b.LastNameWHERE b.DOB=m.DOB AND m.TIMSSID IN ('BO017658', 'BO017658', 'BO017658')OR (b.StreetNumber=m.StreetNumber AND b.StreetName=m.StreetName AND b.Zip=m.Zip) |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-05 : 00:13:59
|
quote: Originally posted by bwol Thanks again to both of you for your help. I created the SQL statement below based on your advice.You will notice a few changes. Specifically, some of the table and field names are changed since I tried to use intuitive table and field names in my first post (to avoid any confusion), instead of the actual names (which make sense to me but might be confusing to someone else). I also made one substantive change. I decided not to compare full addresses to full addresses, since the use of different spacing and abbreviations results in address that are not exactly the same (i.e. not identical strings), but actually are the same address (i.e. "123 Main Street, Apartment 4, Boston MA 02111" & "123 Main St. Apt#4, Boston MA, 02111).Accordingly, I decided that an address match occurs when the following three fields, which are included in both the BostonResidents table and the Members Table, all match: StreetName, StreetNumber and Zip.I have two questions about my updated procedure:1. Do the changes I have made look correct?2. The StreetName field in the Members table has values such as "Main" "Washington" & "South", while the StreetName field in the BostonResidents table has values such as "Main St", "Washington Ave" & "South Rd."It seems to me that in when I am comparing m.StreetName to b.StreetName, I want to be using "like" and "not like" (with a % character) instead of "=" and "<>." However, in the examples I have been able to find, the "like" condition is used to compare something to a string and not to a field. It is possible to use "like" to compare two field names, and if so how? Thanks again for the help!-Bryan Update ProcedureCREATE PROCEDURE dbo.MemberMatch AS INSERT INTO MemberBRLMatches (TIMSSID,ResidentID,MatchType)SELECT m.TIMSSID,b.ResidentID,CASE WHEN b.DOB=m.DOB AND (b.StreetNumber<>m.StreetNumber OR b.StreetName<>m.StreetName OR b.Zip<>m.Zip) THEN 'NameDOBMatch' WHEN (b.StreetNumber=m.StreetNumber AND b.StreetName=m.StreetName AND b.Zip=m.Zip) AND b.DOB<>m.DOB THEN 'NameAddressMatch' ELSE 'DoubleMatch'ENDFROM Members mINNER JOIN BostonResidents bON m.FirstName=b.FirstNameAND m.LastName=b.LastNameWHERE b.DOB=m.DOB AND m.TIMSSID IN ('BO017658', 'BO017658', 'BO017658')OR (b.StreetNumber=m.StreetNumber AND b.StreetName=m.StreetName AND b.Zip=m.Zip)
It is possible. use b.StreetNumber NOT LIKE m.StreetNumber + '%' instead of b.StreetNumber<>m.StreetNumber and similarly for others |
 |
|
|
bwol
Starting Member
19 Posts |
Posted - 2008-06-12 : 16:11:18
|
| This query is giving me some trouble. I tried running the MemberMatch query (shown below) which I created based on the suggestions above. I confirmed ahead of time that at least one "match" should be found, and accordingly, that at least one row should be inserted into the MemberBRLMatches table. However, when I executed the query, no rows were inserted. For testing purposes, I then created the MemberMatch2 query, also shown below, with actual values in the place of all the database fields that needed to be matched. That query did not result in any rows being inserted into the MemberBRLMatches table either. If anyone has any ideas about why this query isn't behaving as expected, or any ideas about the best way to conduct further testing, I'd appreciate the help!CREATE PROCEDURE dbo.MemberMatch AS INSERT INTO MemberBRLMatches (TIMSSID,ResidentID,MatchType)SELECT m.TIMSSID,b.ResidentID,CASE WHEN b.DOB=m.DOB AND (b.StreetNumber<>m.StreetNumber OR (m.StreetName + '%' NOT LIKE b.StreetName) OR b.Zip<>m.Zip) THEN 'NameDOBMatch' WHEN (b.StreetNumber=m.StreetNumber AND (m.StreetName + '%' LIKE b.StreetName) AND b.Zip=m.Zip) AND b.DOB<>m.DOB THEN 'NameAddressMatch' ELSE 'DoubleMatch'ENDFROM Local26Members mINNER JOIN BRFull bON m.FirstName=b.FirstNameAND m.LastName=b.LastNameWHERE b.DOB=m.DOB AND m.TIMSSID IN ('BO016426', 'BO010882', 'BO014863','BO016347')OR (b.StreetNumber=m.StreetNumber AND (m.StreetName + '%' LIKE b.StreetName) AND b.Zip=m.Zip)CREATE PROCEDURE dbo.MemberMatch2 AS INSERT INTO MemberBRLMatches (TIMSSID,ResidentID,MatchType)SELECT m.TIMSSID,b.ResidentID,CASE WHEN '7/16/1967' ='7/16/1967' AND ('1' <>'1' OR ('CHELSEA%' NOT LIKE 'CHELSEA ST') OR '02128'<>'02128') THEN 'NameDOBMatch' WHEN ('1'='1' AND ('CHELSEA%' LIKE 'CHELSEA ST') AND '02128'='02128') AND '7/16/1967'<>'7/16/1967' THEN 'NameAddressMatch' ELSE 'DoubleMatch'ENDFROM Local26Members mINNER JOIN BRFull bON 'ANTHONY'='ANTHONY'AND 'DOE'='DOE'WHERE '7/16/1967'='7/16/1967'OR ('1'='1' AND ('CHELSEA%' LIKE 'CHELSEA ST') AND '02128'='02128') |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-12 : 23:44:26
|
| Are you sure you dont have space or any other un printable characters in m.StreetName . Try this too:-(b.StreetName LIKE '%'+ m.StreetName + '%') instead of(m.StreetName + '%' LIKE b.StreetName) |
 |
|
|
bwol
Starting Member
19 Posts |
Posted - 2008-06-13 : 12:08:24
|
| First off, I realized that one of my DOB (data of birth) fields was improperly named and now that I fixed that, the query is now returning "NameDOBMatches".However, I am still not getting any "NameAddressMatches" or "DoubleMatches", even though I know they exist. I modified the MemberMatch query by adding '%' before and after all instances of m.StreetName and b.StreetName, as shown below, but that doesn't improve anything. Here is what is really bothering me. Through my web hosting provider's control panel I have the ability to search my databases. I did the following search on both the BRFull and Local26Members database:LastName - Exactly Equal to - 'DOE'FirstName - Exactly Equal to - 'ANTHONY'StreetNumber - Exactly Equal to - '1'Zip - Exactly Equal to - '02128'StreetName - Contains - 'CHELSEA'This search returns a single record from each table (i.e. what should be a NameAddressMatch), which leads me to believe that there is something wrong with the query, not the table data (I meticulously searched to make sure that there is not other problems with field names). What could be wrong?Here is my current query:CREATE PROCEDURE dbo.MemberMatch AS INSERT INTO MemberBRLMatches (TIMSSID,ResidentID,MatchType)SELECT m.TIMSSID,b.ResidentID,CASE WHEN b.DOB=m.DOB AND (b.StreetNumber<>m.StreetNumber OR ('%' + m.StreetName + '%' NOT LIKE '%' + b.StreetName + '%') OR b.Zip<>m.Zip) THEN 'NameDOBMatch' WHEN (b.StreetNumber=m.StreetNumber AND ('%' + m.StreetName + '%' LIKE '%' + b.StreetName + '%') AND b.Zip=m.Zip) AND b.DOB<>m.DOB THEN 'NameAddressMatch' ELSE 'DoubleMatch'ENDFROM Local26Members mINNER JOIN BRFull bON m.FirstName=b.FirstNameAND m.LastName=b.LastNameWHERE b.DOB=m.DOBOR (b.StreetNumber=m.StreetNumber AND ('%' + m.StreetName + '%' LIKE b.StreetName+ '%') AND b.Zip=m.Zip) |
 |
|
|
bwol
Starting Member
19 Posts |
Posted - 2008-06-13 : 14:35:08
|
| For the record, I figured out the problem(s), which were with this statement:('%' + m.StreetName + '%' LIKE '%' + b.StreetName + '%')The first problem was that the order of m.StreetName and b.StreetName was wrong. Also a set of parentheses was missing. The full working query is listed below. Thanks so much to everyone who offered their help with the complicated (to me) query! CREATE PROCEDURE dbo.MemberMatch AS INSERT INTO MemberBRLMatches (TIMSSID,ResidentID,MatchType)SELECT m.TIMSSID,b.ResidentID,CASE WHEN (b.StreetNumber=m.StreetNumber AND ( ('%' + b.StreetName + '%') LIKE ('%' + m.StreetName + '%')) AND b.Zip=m.Zip) AND b.DOB<>m.DOB THEN 'NameAddressMatch' WHEN b.DOB=m.DOB AND (b.StreetNumber<>m.StreetNumber OR ( ('%' + b.StreetName + '%') NOT LIKE ('%' + m.StreetName + '%')) OR b.Zip<>m.Zip) THEN 'NameDOBMatch' ELSE 'DoubleMatch'ENDFROM Local26Members mINNER JOIN BRFull bON m.FirstName=b.FirstNameAND m.LastName=b.LastNameWHERE b.DOB=m.DOBOR (b.StreetNumber=m.StreetNumber AND ( ('%' + b.StreetName + '%') LIKE ('%' + m.StreetName + '%')) AND b.Zip=m.Zip) |
 |
|
|
|
|
|
|
|