Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 How to delete duplicated data but remain all other

Author  Topic 

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 15:39:12
IDNAME | CITY | DATEB | COL | PRICE | VAL | DAT_INS
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:06 AM
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:07 AM

12253141 DAA 4/15/2007 160 447.22 T 4/13/07 8:06 AM
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:08 AM
12253141 DSA 4/13/2007 180 525.14 F 4/13/07 5:06 AM
12253141 DSA 4/13/2007 250 626.18 F 4/13/07 5:06 AM


Red rows are duplicated.
Could you give me the query, that delete old rows (with time 5:06 AM and 5:07 AM)
and remain the newest (the last) row

It could be intelligent query. Query like

delete from table where DAT_INS>='4/13/07 5:08 AM' and
DAT_INS<'4/13/07 8:06 AM'

is not the best for this task because of this data '4/13/07 8:06 AM'.

I should work (I mean delte) only with duplicated data...
Any idea?

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2010-04-23 : 15:42:38
;WITH CTE AS (SELECT *, Rank = ROW_NUMBER() OVER(PARTITION BY IDNAME, CITY, DATEB, COL, PRICE, VAL ORDER BY DAT_INS DESC) FROM table_name)
DELETE FROM CTE
WHERE Rank > 1;
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-04-23 : 15:46:24
actually no need of CTE

DELETE t FROM (SELECT Rank = ROW_NUMBER() OVER(PARTITION BY IDNAME, CITY, DATEB, COL, PRICE, VAL ORDER BY DAT_INS DESC) FROM table_name) t
WHERE Rank > 1

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2010-04-23 : 15:48:26
Nice! I did not know (or I forgot) we can use derived table for deleting data.
Derived table and common table expression both are table expressions.
Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 15:51:13

quote:
actually no need of CTE

DELETE t FROM (SELECT Rank = ROW_NUMBER() OVER(PARTITION BY IDNAME, CITY, DATEB, COL, PRICE, VAL ORDER BY DAT_INS DESC) FROM table_name) t
WHERE Rank > 1


I receive

Server: Msg 195, Level 15, State 10, Line 1
'ROW_NUMBER' is not a recognized function name.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-04-23 : 15:53:13
quote:
Originally posted by y0zh


quote:
actually no need of CTE

DELETE t FROM (SELECT Rank = ROW_NUMBER() OVER(PARTITION BY IDNAME, CITY, DATEB, COL, PRICE, VAL ORDER BY DAT_INS DESC) FROM table_name) t
WHERE Rank > 1


I receive

Server: Msg 195, Level 15, State 10, Line 1
'ROW_NUMBER' is not a recognized function name.


are you using SQL 2005? if yes check if compatibility level is 90

SELECT @@VERSION

EXEC sp_dbcmptlevel 'your database name'

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 15:54:06
quote:
WITH CTE AS (SELECT *, Rank = ROW_NUMBER() OVER(PARTITION BY IDNAME, CITY, DATEB, COL, PRICE, VAL ORDER BY DAT_INS DESC) FROM table_name)
DELETE FROM CTE
WHERE Rank > 1;


Server: Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'WITH'.
Server: Msg 195, Level 15, State 1, Line 1
'ROW_NUMBER' is not a recognized function name.
Go to Top of Page

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2010-04-23 : 15:54:34
Because you use SQL Server 2000.

Try this:

SELECT *
INTO #temp
FROM table_name
WHERE 1=2

INSERT INTO #temp(IDNAME, CITY, DATEB, COL, PRICE, VAL, DAT_INS)
SELECT IDNAME, CITY, DATEB, COL, PRICE, VAL, MAX(DAT_INS)
FROM table_name
GROUP BY IDNAME, CITY, DATEB, COL, PRICE, VAL

TRUNCATE TABLE table_name

INSERT INTO table_name SELECT * FROM #temp
Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 15:54:49
mssql 2000
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-04-23 : 15:56:39
In any case this will work

DELETE t
FROM Table t
LEFT JOIN (SELECT IDNAME, CITY, DATEB, COL, PRICE, VAL,MAX(DAT_INS ) AS MaxDate
FROM Table
GROUP BY IDNAME, CITY, DATEB, COL, PRICE, VAL) t1
ON t1.IDNAME = t.IDNAME
AND t1.CITY = t.CITY
AND t1.DATEB = t.DATEB
AND t1.COL = t.COL
AND t1.PRICE = t.PRICE
AND t1.VAL = t.VAL
AND t1.MaxDate = t.DAT_INS
WHERE t1.IDNAME IS NULL


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2010-04-23 : 16:15:34
also

DELETE FROM t
FROM table_name t
WHERE NOT EXISTS
(SELECT 1
FROM table_name t1
GROUP BY IDNAME, CITY, DATEB, COL, PRICE, VAL
HAVING t.IDNAME = IDNAME
AND t.CITY = CITY
AND t.DATEB = DATEB
AND t.COL = COL
AND t.PRICE = PRICE
AND t.VAL = VAL
AND MAX(DAT_INS) = t.DAT_INS)


EDIT: EXISTS --> NOT EXISTS
Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 16:33:18
Thank a lot!

The last question from me. I strive to promise ))

If somebody ask you what is wrong with the data that I provided above, what do you think you should answer?

Perhaps, Denormalized table?

Your answer?
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2010-04-23 : 16:40:50
Well it's hard to say without knowing the other tables involved but the city column should probably be a foreign key to a city table so that you don't replicate data unnescessarily and so that you know that you are talking about the same city.

I've got no idea what ID and COL columns represent but I guess the ID is a foreign key to another table? COL -- no idea?


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 17:19:49
And what column you will choose for nonclustered and clustered index in the table?
Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-23 : 17:47:53
Any idea?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-04-24 : 00:39:55
quote:
Originally posted by y0zh

And what column you will choose for nonclustered and clustered index in the table?


you mean for the DELETE operation to perform better or after the DELETE with unique records?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-24 : 05:21:30
I mean index that could improve performance SELECT for this data

IDNAME | CITY | DATEB | COL | PRICE | VAL | DAT_INS
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:06 AM
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:07 AM
12253141 DAA 4/15/2007 160 447.22 T 4/13/07 8:06 AM
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:08 AM
12253141 DSA 4/13/2007 180 525.14 F 4/13/07 5:06 AM
12253141 DSA 4/13/2007 250 626.18 F 4/13/07 5:06 AM

So , what column will you chose for clustered index and what column you will chose for nonclustered index?
Go to Top of Page

kashyapsid
Yak Posting Veteran

78 Posts

Posted - 2010-04-24 : 07:10:23
set rowcount 1
delete from tab1
where (select count(*) from tab1 a where tab.column_name =
a.column_name) > 1
while @@rowcount<>0
delete from tab1 where (select count(*) from tab1 a where tab.column_name =
a.tab.column_name ) > 1
set rowcount 0

KaShYaP
Go to Top of Page

kashyapsid
Yak Posting Veteran

78 Posts

Posted - 2010-04-24 : 07:10:59
i think it will be suitable to the best of my knowledge

KaShYaP
Go to Top of Page

y0zh
Yak Posting Veteran

60 Posts

Posted - 2010-04-24 : 07:37:25
thanks

and what about this question
quote:
I mean index that could improve performance SELECT for this data

IDNAME | CITY | DATEB | COL | PRICE | VAL | DAT_INS
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:06 AM
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:07 AM
12253141 DAA 4/15/2007 160 447.22 T 4/13/07 8:06 AM
12253141 DAA 4/13/2007 200 447.22 F 4/13/07 5:08 AM
12253141 DSA 4/13/2007 180 525.14 F 4/13/07 5:06 AM
12253141 DSA 4/13/2007 250 626.18 F 4/13/07 5:06 AM

So , what column will you chose for clustered index and what column you will chose for nonclustered index?
Go to Top of Page
   

- Advertisement -