Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 unique/distinct values based on 4 of 5 columns

Author  Topic 

seanh1016
Starting Member

14 Posts

Posted - 2010-09-02 : 18:37:11
any idea how to get rid of duplicates where I sort of define a duplicate as row 2 below because all entries are the same except person?

DISTINCT seems to think this is a unique value, which, I guess it is; but I'd like to consider it a duplicate.

example: I'd like to remove row 2 here:
row 1: company1 dept1 customer1 amount1 person1
row 2: company1 dept1 customer1 amount1 person2

Thanks, in advance
Sean

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-09-02 : 19:36:01
What version of SQL Server are you using?

If 2005 or later you could try a Ranking function like ROW_NUMBER:
SELECT
company,
dept,
customer,
amount,
person
FROM
(
SELECT
company,
dept,
customer,
amount,
person,
ROW_NUMBER() OVER
(
PARTITION BY company, dept, customer, amount, person
ORDER BY person
) AS RowNum
FROM
MyTable
) AS T
WHERE
RowNum = 1
Go to Top of Page

seanh1016
Starting Member

14 Posts

Posted - 2010-09-03 : 13:30:37
tried to duplicate above with my table. but not working. is it b/c I have joins? here's what I have so far:

SELECT
distinct
bc.dept, person, company, customer,amount
FROM (select
bc.dept, person, bd.company AS company, customer, amount,
ROW_NUMBER() OVER(PARTITION BY
dept, person, bd.company AS company, customer, amount
ORDER BY person) AS RowNum
FROM
table1 AS bc INNER JOIN
table2 AS bd ON bc.company = bd.company AND bc.BATCH = bd.BATCH LEFT OUTER JOIN
person_data ON dept = person_data.company_ID COLLATE SQL_Latin1_General_CP1_CS_AS AND
bc.company = person_data.company LEFT OUTER JOIN
amount AS ad ON bd.amtrsn = ad.amtcode AND bd.company = ad.company
WHERE (RowNum = 1) and (bc.last_date between CONVERT(DATETIME, '2010-08-22 00:00:00', 102) and CONVERT(DATETIME, '2010-

08-31 00:00:00', 102)) AND (bd.flag1 = 'A') AND (bd.flag2 = 'Y') and (dept = 'shoes') and (bd.company = 'co2')
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-09-03 : 15:13:02
That query doesn't make any sense. But I rearranged it by guessing:
SELECT 
dept,
person,
company,
customer,
amount
FROM
(
select
bc.dept,
person, bd.company AS company,
customer,
amount,
ROW_NUMBER() OVER(PARTITION BY dept, person, bd.company, customer, amount ORDER BY person) AS RowNum
FROM
table1 AS bc
INNER JOIN
table2 AS bd
ON bc.company = bd.company
AND bc.BATCH = bd.BATCH
LEFT OUTER JOIN
person_data
ON dept = person_data.company_ID COLLATE SQL_Latin1_General_CP1_CS_AS
AND bc.company = person_data.company
LEFT OUTER JOIN
amount AS ad
ON bd.amtrsn = ad.amtcode
AND bd.company = ad.company
WHERE
(bc.last_date between CONVERT(DATETIME, '2010-08-22 00:00:00', 102) and CONVERT(DATETIME, '2010-08-31 00:00:00', 102))
AND (bd.flag1 = 'A')
AND (bd.flag2 = 'Y')
and (dept = 'shoes')
and (bd.company = 'co2')
) AS T
WHERE
RowNum = 1
Go to Top of Page

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2010-09-04 : 01:13:37
quote:
Originally posted by seanh1016

any idea how to get rid of duplicates where I sort of define a duplicate as row 2 below because all entries are the same except person?

DISTINCT seems to think this is a unique value, which, I guess it is; but I'd like to consider it a duplicate.

example: I'd like to remove row 2 here:
row 1: company1 dept1 customer1 amount1 person1
row 2: company1 dept1 customer1 amount1 person2




SELECT DISTINCT is right'; look at the data!

Now what is your rule for discarding one row and not the others?



--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page

seanh1016
Starting Member

14 Posts

Posted - 2010-09-04 : 01:21:34
distinct is right in what sense? returning person1 and 2? I don't want that. I only want person1 row to return. I think lamprey nailed it with rank.
Go to Top of Page

seanh1016
Starting Member

14 Posts

Posted - 2010-09-07 : 18:53:37
jcelko and lamprey, I have to eat my words a little here. I keep getting errors where RowNum is defined and then referenced in WHERE. I have MS SQL 2005. I'm a beginner in case you can't tell. Also, do I really need to alias the table as "T"? To summarize, I only want the first instance of the duplicate which should be based only on company dept customer amount. thx.
Go to Top of Page

seanh1016
Starting Member

14 Posts

Posted - 2010-09-09 : 12:22:05
seems like above didn't work but this did:
select*
from (select...) as t
where RowNum=1

Go to Top of Page
   

- Advertisement -