Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Comparing columns for equality

Author  Topic 

JaybeeSQL
Posting Yak Master

112 Posts

Posted - 2013-10-04 : 11:10:10
HI all,

I've got a DB that came with no FK's and no docu, so I need to rebuild those relationships. To do that I need to compare 2+ ID columns in different tables (suspected PK and suspected FK) for matches.

Ideally I'd like to see what % of the (suspected) FK is comprised of those values in the (suspected) PK. Is an inner join the best way to achieve this?

Also - and I this probably involves a big Cartesian join as there are 100 tables - is there a way to loop through ALL columns named '%id' throughout the DB, with a result set like,

'ColumnName' - 'ColumnName' - 'MatchingRowCount' - '%OfFKMatched'

I'm not worried about performance at this point because the DB is only 36Mbs,

Cheers,

JB

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-10-04 : 11:37:01
If the tables are using surrgate keys for primary key, this approach can lead misleading results. Think of the case where the PK's are integer numbers starting at 1. Completely dissimilar and unrelated tables could have matching values.

Regardless, if you want to compare columns from two tables and see what percentage matches, you can do the following:
SELECT
COUNT(a.colA),
COUNT(b.colB),
100.0*SUM(CASE WHEN a.colA IS NOT NULL AND b.colB IS NOT NULL THEN 1 ELSE 0 END) /NULLIF(COUNT(b.colB),0) AS AHasMatchesForB,
100.0*SUM(CASE WHEN a.colA IS NOT NULL AND b.colB IS NOT NULL THEN 1 ELSE 0 END) /NULLIF(COUNT(a.colA),0) AS BHasMatchesForA
FROM
TableA a
FULL JOIN TableA b ON a.colA = b.colB;
You could use dynamic SQL with ms_sp_MSforeachtable to go through all combinations of tables.
Go to Top of Page

JaybeeSQL
Posting Yak Master

112 Posts

Posted - 2013-10-04 : 18:50:42
I appreciate that James, me being a purely Production guy I admit I don't fully grasp the whole programming logic, nor have I yet tested the statement, but shouldn't the full join clause nominate TableB as b, not TableA as b ?

Also, if you don't mind, let me test my understanding (I'm in the throes of a long, LONG overdue improvement of my TSQL)...

The statement joins all values on both columns in question, gets their rowcounts then derives a percentage value..but the division has me flummoxed!
Go to Top of Page
   

- Advertisement -