SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 How to get data mis matches
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

ivra
Starting Member

Philippines
18 Posts

Posted - 01/29/2010 :  04:37:41  Show Profile  Reply with Quote
Hi,

I have a 3 tables with person_id field on each table.

I created a new table (match_data) where all person_id from 3 tables matches.

My problem is how can I get the un match person_Id from the 3 tables.

I'm running a query now:

select distinct ftn.person_id,ftn.OptIn,dpa.Description as dpa_desc, c_dpa.Description as cdpa_desc
into unmatch_data
from ftn, dpa, c_dpa
where ftn.person_id not in (select ftn.person_id from match_data)

order by ftn.person_id

The query is running for almost an hour now. Is there any other way I can get the mis matches using different faster approach?

Thanks all!

Kristen
Test

United Kingdom
22403 Posts

Posted - 01/29/2010 :  04:44:12  Show Profile  Reply with Quote

from ftn, dpa, c_dpa

Cartesian Join!!!!

This will link every record in FTN to every record in DPA, and then for every one of those it will link them to every row in C_DPA.

Are you sure that's what you want?
Go to Top of Page

Kristen
Test

United Kingdom
22403 Posts

Posted - 01/29/2010 :  04:47:00  Show Profile  Reply with Quote
Actaully, slightly inaccirate,. it will only take the rows from FTN that are not in the data from match_data, but that is wrong too I'm afraid:

not in (select ftn.person_id from match_data)

This is going to select the person_ID from the CURRENT FTN row, it gets nothing from match_data, and that will always be false ... so actually its going to match everything with everything from only the other two tables

... and then it will do a SORT of the whole lot and throw away the duplicates - which will take a little long while

Edited by - Kristen on 01/29/2010 04:47:58
Go to Top of Page

ivra
Starting Member

Philippines
18 Posts

Posted - 01/29/2010 :  05:02:38  Show Profile  Reply with Quote
thank for pointing out my mistakes.

got what I want
Go to Top of Page

mymatrix
Starting Member

India
24 Posts

Posted - 01/29/2010 :  05:11:47  Show Profile  Reply with Quote
Please correct me if i have understood your requirement incorrectly
You need below mentioned 4 columns for unmatch_data table:
person_Id
OptIn
dpa_desc
cdpa_desc

or you need two columns only with first column as person_Id value of second column changed to OptIn,dpa_desc,cdpa_desc depending on table (ftn,dpa,c_dpa). Please advice.

insert into unmatch_data
select ftn.Person_Id, ftn.OptIn,default,default from ftn where ftn.Person_Id not in (select person_Id from match_Data)
union
select dpa.Person_Id, default,dpa.Description,default from dpa where dpa.Person_Id not in (select person_Id from match_Data)
union
select c_dpa.Person_Id, default,default,c_dpa.description from ftn where c_dpa.Person_Id not in (select person_Id from match_Data)

This sql statement may create some duplicate records but performance wise it would be faster. If this suits your requirment, let me know. I will improve it further.


thnks
Gaurav

Even my blood group says be -ve to all the negatives.
Go to Top of Page

ivra
Starting Member

Philippines
18 Posts

Posted - 01/29/2010 :  05:25:26  Show Profile  Reply with Quote
I just run:

select distinct ftn.person_id,ftn.OptIn,dpa.Description as dpa_desc, c_dpa.Description as cdpa_desc
into unmatch_data
from ftn, dpa, c_dpa
where ftn.person_id not in (select match_data.person_id from match_data)
and dpa.person_id not in (select match_data.person_id from match_data)
and c_dpa.person_id not in (select match_data.person_id from match_data)

order by ftn.person_id

but gives me duplicates..How can I rid of duplicates?
Go to Top of Page

Kristen
Test

United Kingdom
22403 Posts

Posted - 01/29/2010 :  05:45:30  Show Profile  Reply with Quote
I would use FULL OUTER JOIN

You could just store the ID from the three tables (DISTINCT will give you no duplicates) - but then you will have to LEFT JOIN it back to the three tables to find any other data.

Your problem is that your select list has columns like ftn.OptIn. That will have a value if the row comes from FTN, but no value if it comes from the other two tables. So there will never be an identical match between the rows from each table, and thus DISTINCT can't resolve that.

Edited by - Kristen on 01/29/2010 05:51:40
Go to Top of Page

Kristen
Test

United Kingdom
22403 Posts

Posted - 01/29/2010 :  05:48:44  Show Profile  Reply with Quote
Probably something like this:

select DISTINCT COALESCE(ftn.person_id, dpa.person_id, c_dpa.person_id),
       ftn.OptIn,dpa.Description as dpa_desc, c_dpa.Description as cdpa_desc
into unmatch_data
from ftn
     FULL OUTER JOIN dpa
         ON dpa.person_id = ftn.person_id
     FULL OUTER JOIN c_dpa
         ON c_dpa.person_id = ftn.person_id
where COALESCE(ftn.person_id, dpa.person_id, c_dpa.person_id) not in (select match_data.person_id from match_data)

You will still get DUPs if there are multiple rows within each table for the same ID and the other columns have different values.

If there are lots of duplicates within EACH table the performance of this will be terrible. But hopefully person_id is Unique within each table.

Edited by - Kristen on 01/29/2010 05:50:04
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.06 seconds. Powered By: Snitz Forums 2000