Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
TRACS
Starting Member
5 Posts |
Posted - 2009-07-11 : 18:21:45
|
| I have a table that has a primary key that has duplicate records. I am trying to clean up this table so that I can put an identity index on the PK and fix this once and for all. There is a LAST_UPDATE_DT (datetime field) that is changed any time any part of the record is changed. I need to delete all of the duplicates and leave the record with that newest Last_Updated Date? All of the posts I have found are for records that are exactly the same (true duplicates) where my records are actually unique but have the same Primary Key value. I need to keep the record that has the latest changes. Your help is greatly appreciated for this rookie to the SQL Server world. Please don't assume that I know anything in your explanation.Table Structure:NRR_ID PKNRR_TITLE varcharNRR_CREATE_DT datetimeNRR_NARRATIVE varchar(max)LAST_UPDATE_DT datetimeThanks,TRACS |
|
|
webfred
Master Smack Fu Yak Hacker
8781 Posts |
Posted - 2009-07-11 : 18:37:17
|
delete tfrom table tjoin(selectnrr_id,last_update_dt,row_number() over (partition by nrr_id order by last_update_dt desc) as rownum)t2on t.nrr_id=t2.nrr_id and t.last_update_dt = t2.last_update_dt and t2.rownum > 1 No, you're never too old to Yak'n'Roll if you're too young to die. |
 |
|
|
TRACS
Starting Member
5 Posts |
Posted - 2009-07-11 : 18:55:34
|
| Webfred,Thanks for the quick response. I have a couple of questions about your code that I'd appreciate an explanation of:1. You wrote: "Delete T" I assume that I should write "Delete SDC_NARRATIVES" since this is the name of the Table2. You wrote: "from Table t" Again I should write "FROM SDC_NARRATIVES"3. You used row_number() over (partition by nrr_id order by last_update_dt desc) as rownum)t2 If I get this right when you use "t2" that is an alias for a temporary table? I see these aliases in alot of code posts and assume that it just makes it easier to write in response to people's questions without having to use their actual table names? If not than please explain.4. I want to delete all the rows with the same NRR_ID but not the row that has the last changes (as determined by the newest last_update_dt field. This is what your query does?I appreciate your simple explanation of your code so i can learn and understand this so I can use it in the future for other things. THe row_number() is a function that will give the records a unique value for deleting purposes later? and the Partition by function somehow says: "give me the record by nrr_id with the newest last_update_dt and then delete all the other records with the same nrr_id. Right? If so this is exactly what I need. In your join you don't list all the fields in the table. Is the delete accomplished on the whole row by the delete statement at the top? Thanks again for your help with this problem.TRACS |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-07-12 : 04:11:41
|
1. T is table alias. its not used as a short name for tables so that you dont have to repeat long table names everywhere in query. you can use them as it is given in actual query also.Your assumption on row_number and partition is correct. it will group data within table by partition by fields nrr_id and give values to each group based on order in order by (last_update_dt desc) so that latest record will 1 and going backwards 2,3,4,.. so when you delete > 1 it will delete all except latest one for each NRR_ID group.you dont have to specify column names in delete as you always delete as a row rather than individual columns.so you query should be something likedelete tfrom SDC_NARRATIVES tleft join(selectnrr_id,row_number() over (partition by nrr_id order by last_update_dt desc) as rownumfrom SDC_NARRATIVES)t2on t.nrr_id=t2.nrr_id and t.last_update_dt = t2.last_update_dt and t2.rownum = 1where t2.nrr_id is null before you apply this make sure you replace delete t in above with select t.* and see the records that you're going to delete to make sure thats exactly what you want |
 |
|
|
|
|
|