Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
loneblade
Starting Member
3 Posts |
Posted - 2010-04-15 : 14:15:05
|
I have a table with 4 columns as a result from a fuzzy match. The first 2 columns contain the same set of data as the last 2. The column names are Category1, ID1, Category2, and ID2. The combination of category and id are need to identify uniqueness in the record that I will be joining with later. Is there a way to to select only the unique combination where that if [Combination of Category and ID] 1 and [Combination of Category and ID] 2 shows up in the one row, the reverse will not appear in the later rows? Hopefully this makes sense. |
|
hanbingl
Aged Yak Warrior
652 Posts |
Posted - 2010-04-15 : 18:04:03
|
[CODE]DECLARE @T1 TABLE (CAT1 VARCHAR(1), ID1 INT, CAT2 VARCHAR(1), ID2 INT)INSERT INTO @T1SELECT 'A',1,'B',2UNION ALLSELECT 'B',2,'A',1UNION ALLSELECT 'C',2,'A',1UNION ALLSELECT 'D',2,'A',1SELECT CAT1, ID1, CAT2, ID2 FROM @T1 ZWHERE NOT EXISTS (SELECT 1 FROM(SELECT A.*, RANK() OVER(ORDER BY A.CAT1) AS RNK FROM @T1 AJOIN @T1 B ON A.CAT1 = B.CAT2AND A.ID1 = B.ID2AND A.CAT2 = B.CAT1AND A.ID2 = B.ID1)TWHERE Z.CAT1 = T.CAT1 AND Z.ID1 = T.ID1 AND Z.CAT2 = T.CAT2 AND Z.ID2 = T.ID2AND RNK > 1)[/CODE]quote: Originally posted by loneblade I have a table with 4 columns as a result from a fuzzy match. The first 2 columns contain the same set of data as the last 2. The column names are Category1, ID1, Category2, and ID2. The combination of category and id are need to identify uniqueness in the record that I will be joining with later. Is there a way to to select only the unique combination where that if [Combination of Category and ID] 1 and [Combination of Category and ID] 2 shows up in the one row, the reverse will not appear in the later rows? Hopefully this makes sense.
|
 |
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2010-04-20 : 12:50:37
|
This would make a lot more sense with sample data, clear specs and DDL (keys, constriatns, all that stuff you don't get in a vague narrative).>> I have a table with 4 columns as a result from a fuzzy match. The first 2 columns contain the same set of data as the last 2. <<Then the last two columns are redundant>> The column names are Category1, ID1, Category2, and ID2. The combination of category and id are need to identify uniqueness in the record [sic] that I will be joining with later. <<Can I change the data element names to follow ISO-11179 rules instead of having repeating groups implied by array subscripts? I am fond of First Normal Form. >> Is there a way to to select only the unique combination where that if [Combination of Category and ID] 1 and [Combination of Category and ID] 2 shows up in the one row, the reverse will not appear in the later rows? <<Since a table has no ordering, what does "later rows" mean to you? Now, in a sequential file "following records" would be correct. You also used the term "record" before. >> Hopefully this makes sense.<<Nope. After a few decades of reading bad SQL and specs, I think you might mean something like this:For a set with two columns, we do not want to have the same combination (as opposed to permutation) of values. That is{(salt, pepper), (pepper, salt)} are a redundancy. In SQL we can prohibit this with DDL:CREATE TABLE Condiments(first_condiment_name CHAR(15) NOT NULL, second_condiment_name CHAR(15) NOT NULL, PRIMARY KEY (first_condiment_name, second_condiment_name), CHECK (first_condiment_name < second_condiment_name));Your four-column table sounds like an attempt to extend this model. Did I guess right? A cute trick is to create a derived table with the UNION of the flip of the column and a GROUP BY:SELECT DISTINCT X.c1, X.c2 FROM (SELECT first_condiment_name, second_condiment_name FROM Condiments UNION SELECT second_condiment_name, first_condiment_name FROM Condiments) AS X(c1, c2) GROUP BY X.c1, X.c2HAVING COUNT(*) >= 2;--CELKO--Joe Celko, SQL Guru |
 |
|
|
|
|
|
|