| Author |
Topic |
|
rn5a
Starting Member
25 Posts |
Posted - 2008-10-25 : 06:15:50
|
| CREATE TABLE [dbo].[MyTable]( [Name] [varchar](50) NULL, [Age] [smallint] NULL) ON [PRIMARY]------------------------------------INSERT INTO MyTable VALUES ('Aaron', 34)INSERT INTO MyTable VALUES ('Aaron', 34)INSERT INTO MyTable VALUES ('Betty', 26)INSERT INTO MyTable VALUES ('Clark', 29)INSERT INTO MyTable VALUES ('Clark', 29)INSERT INTO MyTable VALUES ('Diana', 33)INSERT INTO MyTable VALUES ('Diana', 33)INSERT INTO MyTable VALUES ('Elena', 41)------------------------------------To delete duplicates from the above table, this is the query:------------------------------------SET ROWCOUNT 1DELETE MyTableFROM MyTable aWHERE (SELECT COUNT(*) FROM MyTable b WHERE b.name = a.name AND b.age = a.age) > 1WHILE @@ROWCOUNT > 0 DELETE MyTable FROM MyTable a WHERE (SELECT COUNT(*) FROM MyTable b WHERE b.name = a.name AND b.age = a.age) > 1SET ROWCOUNT 0------------------------------------Can someone please explain me step-by-step how does the above query delete duplicates?Thanks,Ron |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-25 : 07:13:58
|
Beter method is thisDELETE tFROM(SELECT ROW_NUMBER() OVER (PARTITION BY Name,Age ORDER BY Name) AS Seq,*FROM MyTable)tWHERE t.Seq>1 what it does is organise the data intable into groupsof Name Age and number each with help of row_number() functions. So the duplicate records will get number 1,2,3... I delete those which have number >1. this will remove all records that existed as duplicates within each group and retain only first one (Seq=1) |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-25 : 07:19:05
|
| also learn other uses of row_number() functionhttp://sqlblogcasts.com/blogs/madhivanan/archive/2007/08/27/multipurpose-row-number-function.aspx |
 |
|
|
gsrinivas.
Yak Posting Veteran
56 Posts |
Posted - 2008-10-25 : 08:22:38
|
| hai,very thanks for deleting duplicates in single query....GSrinivas. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-25 : 09:25:11
|
quote: Originally posted by gsrinivas. hai,very thanks for deleting duplicates in single query....GSrinivas.
you're welcome |
 |
|
|
rn5a
Starting Member
25 Posts |
Posted - 2008-10-27 : 01:43:02
|
quote: Originally posted by visakh16 Beter method is thisDELETE tFROM(SELECT ROW_NUMBER() OVER (PARTITION BY Name,Age ORDER BY Name) AS Seq,*FROM MyTable)tWHERE t.Seq>1 what it does is organise the data intable into groupsof Name Age and number each with help of row_number() functions. So the duplicate records will get number 1,2,3... I delete those which have number >1. this will remove all records that existed as duplicates within each group and retain only first one (Seq=1)
Thanks, mate, for introducing me to the ROW_NUMBER function.Can it be said that the PARTITION BY clause groups the identical rows based on the columns supplied to the PARTITION BY clause?Thanks,Ron |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-27 : 02:13:08
|
| yup. it partitions the table data horizontally into groups based on fields specified on partition by. |
 |
|
|
rn5a
Starting Member
25 Posts |
Posted - 2008-10-31 : 05:11:51
|
quote: Originally posted by visakh16 yup. it partitions the table data horizontally into groups based on fields specified on partition by.
Then what is the difference between the GROUP BY & PARTITION BY clauses?Thanks,Ron |
 |
|
|
madhivanan
Premature Yak Congratulator
22864 Posts |
Posted - 2008-10-31 : 05:19:37
|
quote: Originally posted by rn5a
quote: Originally posted by visakh16 yup. it partitions the table data horizontally into groups based on fields specified on partition by.
Then what is the difference between the GROUP BY & PARTITION BY clauses?Thanks,Ron
PARTITION BY is used to reset the ranking numbersMadhivananFailing to plan is Planning to fail |
 |
|
|
|