Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
seanh1016
Starting Member
14 Posts |
Posted - 2010-09-02 : 18:37:11
|
| any idea how to get rid of duplicates where I sort of define a duplicate as row 2 below because all entries are the same except person? DISTINCT seems to think this is a unique value, which, I guess it is; but I'd like to consider it a duplicate. example: I'd like to remove row 2 here: row 1: company1 dept1 customer1 amount1 person1row 2: company1 dept1 customer1 amount1 person2Thanks, in advanceSean |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2010-09-02 : 19:36:01
|
What version of SQL Server are you using?If 2005 or later you could try a Ranking function like ROW_NUMBER:SELECT company, dept, customer, amount, personFROM ( SELECT company, dept, customer, amount, person, ROW_NUMBER() OVER ( PARTITION BY company, dept, customer, amount, person ORDER BY person ) AS RowNum FROM MyTable ) AS TWHERE RowNum = 1 |
 |
|
|
seanh1016
Starting Member
14 Posts |
Posted - 2010-09-03 : 13:30:37
|
| tried to duplicate above with my table. but not working. is it b/c I have joins? here's what I have so far: SELECT distinctbc.dept, person, company, customer,amountFROM (select bc.dept, person, bd.company AS company, customer, amount, ROW_NUMBER() OVER(PARTITION BY dept, person, bd.company AS company, customer, amountORDER BY person) AS RowNumFROMtable1 AS bc INNER JOIN table2 AS bd ON bc.company = bd.company AND bc.BATCH = bd.BATCH LEFT OUTER JOIN person_data ON dept = person_data.company_ID COLLATE SQL_Latin1_General_CP1_CS_AS AND bc.company = person_data.company LEFT OUTER JOIN amount AS ad ON bd.amtrsn = ad.amtcode AND bd.company = ad.companyWHERE (RowNum = 1) and (bc.last_date between CONVERT(DATETIME, '2010-08-22 00:00:00', 102) and CONVERT(DATETIME, '2010-08-31 00:00:00', 102)) AND (bd.flag1 = 'A') AND (bd.flag2 = 'Y') and (dept = 'shoes') and (bd.company = 'co2') |
 |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2010-09-03 : 15:13:02
|
That query doesn't make any sense. But I rearranged it by guessing:SELECT dept, person, company, customer, amountFROM ( select bc.dept, person, bd.company AS company, customer, amount, ROW_NUMBER() OVER(PARTITION BY dept, person, bd.company, customer, amount ORDER BY person) AS RowNum FROM table1 AS bc INNER JOIN table2 AS bd ON bc.company = bd.company AND bc.BATCH = bd.BATCH LEFT OUTER JOIN person_data ON dept = person_data.company_ID COLLATE SQL_Latin1_General_CP1_CS_AS AND bc.company = person_data.company LEFT OUTER JOIN amount AS ad ON bd.amtrsn = ad.amtcode AND bd.company = ad.company WHERE (bc.last_date between CONVERT(DATETIME, '2010-08-22 00:00:00', 102) and CONVERT(DATETIME, '2010-08-31 00:00:00', 102)) AND (bd.flag1 = 'A') AND (bd.flag2 = 'Y') and (dept = 'shoes') and (bd.company = 'co2') ) AS TWHERE RowNum = 1 |
 |
|
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2010-09-04 : 01:13:37
|
quote: Originally posted by seanh1016 any idea how to get rid of duplicates where I sort of define a duplicate as row 2 below because all entries are the same except person? DISTINCT seems to think this is a unique value, which, I guess it is; but I'd like to consider it a duplicate. example: I'd like to remove row 2 here: row 1: company1 dept1 customer1 amount1 person1row 2: company1 dept1 customer1 amount1 person2
SELECT DISTINCT is right'; look at the data!Now what is your rule for discarding one row and not the others?--CELKO--Books in Celko Series for Morgan-Kaufmann PublishingAnalytics and OLAP in SQLData and Databases: Concepts in Practice Data, Measurements and Standards in SQLSQL for SmartiesSQL Programming Style SQL Puzzles and Answers Thinking in SetsTrees and Hierarchies in SQL |
 |
|
|
seanh1016
Starting Member
14 Posts |
Posted - 2010-09-04 : 01:21:34
|
| distinct is right in what sense? returning person1 and 2? I don't want that. I only want person1 row to return. I think lamprey nailed it with rank. |
 |
|
|
seanh1016
Starting Member
14 Posts |
Posted - 2010-09-07 : 18:53:37
|
| jcelko and lamprey, I have to eat my words a little here. I keep getting errors where RowNum is defined and then referenced in WHERE. I have MS SQL 2005. I'm a beginner in case you can't tell. Also, do I really need to alias the table as "T"? To summarize, I only want the first instance of the duplicate which should be based only on company dept customer amount. thx. |
 |
|
|
seanh1016
Starting Member
14 Posts |
Posted - 2010-09-09 : 12:22:05
|
| seems like above didn't work but this did: select*from (select...) as twhere RowNum=1 |
 |
|
|
|
|
|
|
|