| Author |
Topic  |
|
|
ivra
Starting Member
Philippines
18 Posts |
Posted - 01/29/2010 : 04:37:41
|
Hi,
I have a 3 tables with person_id field on each table.
I created a new table (match_data) where all person_id from 3 tables matches.
My problem is how can I get the un match person_Id from the 3 tables.
I'm running a query now:
select distinct ftn.person_id,ftn.OptIn,dpa.Description as dpa_desc, c_dpa.Description as cdpa_desc into unmatch_data from ftn, dpa, c_dpa where ftn.person_id not in (select ftn.person_id from match_data)
order by ftn.person_id
The query is running for almost an hour now. Is there any other way I can get the mis matches using different faster approach?
Thanks all! |
|
|
Kristen
Test
United Kingdom
22191 Posts |
Posted - 01/29/2010 : 04:44:12
|
from ftn, dpa, c_dpa
Cartesian Join!!!!
This will link every record in FTN to every record in DPA, and then for every one of those it will link them to every row in C_DPA.
Are you sure that's what you want? |
 |
|
|
Kristen
Test
United Kingdom
22191 Posts |
Posted - 01/29/2010 : 04:47:00
|
Actaully, slightly inaccirate,. it will only take the rows from FTN that are not in the data from match_data, but that is wrong too I'm afraid:
not in (select ftn.person_id from match_data)
This is going to select the person_ID from the CURRENT FTN row, it gets nothing from match_data, and that will always be false ... so actually its going to match everything with everything from only the other two tables
... and then it will do a SORT of the whole lot and throw away the duplicates - which will take a little long while  |
Edited by - Kristen on 01/29/2010 04:47:58 |
 |
|
|
ivra
Starting Member
Philippines
18 Posts |
Posted - 01/29/2010 : 05:02:38
|
thank for pointing out my mistakes.
got what I want |
 |
|
|
mymatrix
Starting Member
India
24 Posts |
Posted - 01/29/2010 : 05:11:47
|
Please correct me if i have understood your requirement incorrectly You need below mentioned 4 columns for unmatch_data table: person_Id OptIn dpa_desc cdpa_desc
or you need two columns only with first column as person_Id value of second column changed to OptIn,dpa_desc,cdpa_desc depending on table (ftn,dpa,c_dpa). Please advice.
insert into unmatch_data select ftn.Person_Id, ftn.OptIn,default,default from ftn where ftn.Person_Id not in (select person_Id from match_Data) union select dpa.Person_Id, default,dpa.Description,default from dpa where dpa.Person_Id not in (select person_Id from match_Data) union select c_dpa.Person_Id, default,default,c_dpa.description from ftn where c_dpa.Person_Id not in (select person_Id from match_Data)
This sql statement may create some duplicate records but performance wise it would be faster. If this suits your requirment, let me know. I will improve it further.
thnks Gaurav
Even my blood group says be -ve to all the negatives. |
 |
|
|
ivra
Starting Member
Philippines
18 Posts |
Posted - 01/29/2010 : 05:25:26
|
I just run:
select distinct ftn.person_id,ftn.OptIn,dpa.Description as dpa_desc, c_dpa.Description as cdpa_desc into unmatch_data from ftn, dpa, c_dpa where ftn.person_id not in (select match_data.person_id from match_data) and dpa.person_id not in (select match_data.person_id from match_data) and c_dpa.person_id not in (select match_data.person_id from match_data)
order by ftn.person_id
but gives me duplicates..How can I rid of duplicates? |
 |
|
|
Kristen
Test
United Kingdom
22191 Posts |
Posted - 01/29/2010 : 05:45:30
|
I would use FULL OUTER JOIN
You could just store the ID from the three tables (DISTINCT will give you no duplicates) - but then you will have to LEFT JOIN it back to the three tables to find any other data.
Your problem is that your select list has columns like ftn.OptIn. That will have a value if the row comes from FTN, but no value if it comes from the other two tables. So there will never be an identical match between the rows from each table, and thus DISTINCT can't resolve that. |
Edited by - Kristen on 01/29/2010 05:51:40 |
 |
|
|
Kristen
Test
United Kingdom
22191 Posts |
Posted - 01/29/2010 : 05:48:44
|
Probably something like this:
select DISTINCT COALESCE(ftn.person_id, dpa.person_id, c_dpa.person_id),
ftn.OptIn,dpa.Description as dpa_desc, c_dpa.Description as cdpa_desc
into unmatch_data
from ftn
FULL OUTER JOIN dpa
ON dpa.person_id = ftn.person_id
FULL OUTER JOIN c_dpa
ON c_dpa.person_id = ftn.person_id
where COALESCE(ftn.person_id, dpa.person_id, c_dpa.person_id) not in (select match_data.person_id from match_data)
You will still get DUPs if there are multiple rows within each table for the same ID and the other columns have different values.
If there are lots of duplicates within EACH table the performance of this will be terrible. But hopefully person_id is Unique within each table. |
Edited by - Kristen on 01/29/2010 05:50:04 |
 |
|
| |
Topic  |
|