Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
ernest.l.arrow
Starting Member
4 Posts |
Posted - 2011-10-24 : 10:40:24
|
| I have wide and long table with semi dups.A duplicate is defined as if there are two rows in which row1.column3=row2.column3 and row1.column1=row2.column2 and row2.column1=row1.column2Other columns are not important.I don't care if column1=column2 or not, I have control over that.How do I delete every other row?Each row has apx. 1 dup, therefore, I'll remain with about half of the data.I thought of:myIndexColumn = Column1 + Column2 + Column3myReversedIndexColumn = Column2 + Column1 + Column3CREATE NONCLUSTERED INDEX myIndex1 ON myDB.dbo.myTable ( myIndexColumn ASC)CREATE NONCLUSTERED INDEX myIndex2 ON myDB.dbo.myTable ( myReversedIndexColumn ASC)DELETE FROM myDB.dbo.myTableWHERE myIndexColumn in (SELECT myReversedIndex FROM myDB.dbo.myTable)but for that I need the SQL SERVER to make a GO after each row tested (and deleted).How do I do that? |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-10-24 : 11:07:12
|
you just need thisSELECT *FROM(SELECT ROW_NUMBER() OVER (PARTITION BY column1,column2,column3 ORDER BY column1) AS Rn,*FROM Table)tWHERE Rn >1 check and ensure this returns duplicate occurances.then use it in deleteDELETE tFROM(SELECT ROW_NUMBER() OVER (PARTITION BY column1,column2,column3 ORDER BY column1) AS RnFROM Table)tWHERE Rn >1 ------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
ernest.l.arrow
Starting Member
4 Posts |
Posted - 2011-10-24 : 13:07:53
|
Thanks, but (0 row(s) affected).Here is an example to split by space:Column1 Column2 Column3 Columns4-8 Columns9-12 myIndex DecisionDavid Angela 82 15 53 David+Angela+82 Keep(Single)David Angela 89 15 53 David+Angela+89 Keep(Single)David Don 85 15 63 Don+David+85 Keep(First)Don David 85 63 15 Don+David+85 Remove(Second)David Don 92 15 63 Don+David+92 Keep(First)Don David 92 63 15 Don+David+92 Remove(Second)Ernest Ernest 83 47 47 Ernest+Ernest+83 Keep(Single)(Column1=Column2)Ernest Ernest 90 47 47 Ernest+Ernest+90 Keep(Single)(Column1=Column2)quote: Originally posted by visakh16 you just need thisSELECT *FROM(SELECT ROW_NUMBER() OVER (PARTITION BY column1,column2,column3 ORDER BY column1) AS Rn,*FROM Table)tWHERE Rn >1 check and ensure this returns duplicate occurances.then use it in deleteDELETE tFROM(SELECT ROW_NUMBER() OVER (PARTITION BY column1,column2,column3 ORDER BY column1) AS RnFROM Table)tWHERE Rn >1 ------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/
|
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-10-24 : 13:40:21
|
| when you say first second etc on what basis you identify them? is there a unique valued filed you can use for that?------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
namman
Constraint Violating Yak Guru
285 Posts |
Posted - 2011-10-24 : 14:17:17
|
| Using subquery to generate a new column that indicates the duplicated rows, then use that column for partition as visakh16 presents.If you need more help on this issue, let me know. |
 |
|
|
|
|
|
|
|