Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 SQL Update behaviour

Author  Topic 

dhanikprakash
Starting Member

2 Posts

Posted - 2011-07-07 : 05:59:52
Strange behaviour

declare @tempTable1 table
(
ID INT,
DATA1 VARCHAR(20),
DATA2 VARCHAR(20) ,
TM_DATE datetime
)

insert into @tempTable1
(ID,DATA1) values (1,'test data 1')
insert into @tempTable1
(ID,DATA2) values (1,'test data 2')
insert into @tempTable1
(ID,TM_DATE) values (1,getdate())

select * from @tempTable1

declare @tempTable2 table
(
ID INT ,
DATA1 VARCHAR(20),
DATA2 VARCHAR(20) ,
TM_DATE datetime
)
insert into @tempTable2
(ID) values (1)


select * from @tempTable2

UPDATE T2 SET
DATA1 = COALESCE (T1.DATA1,T2.DATA1),
DATA2 = COALESCE (T1.DATA2,T2.DATA2),
TM_DATE = COALESCE (T1.TM_DATE,T2.TM_DATE)
FROM
@tempTable1 T1
INNER JOIN
@tempTable2 T2
ON T2.ID = T1.ID

select * from @tempTable2


Once the update is completed @temptable2 had the following results

1 test data 1 test data 2 2011-07-07 09:02:25.277


Now set ID as primary key on the @temptable2 the update has the following results

1 test data 1 NULL NULL


Some please explain why this behaviour on update?

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-07-07 : 08:37:39
You're joining on ID. There are three records in table1 that match the join predicate.

So your update statement is ambiguous.
Go to Top of Page

vaari
Starting Member

15 Posts

Posted - 2011-07-07 : 08:54:14
In addition to what Russel mentioned about your update being ambiguous - when you add the primary key (which by default is clustered) you end up doing a clustered index update which results in a sort operator which results in a different set of values being updated as opposed to what was happening without the primary key (or with a non-clustered primary key).

I hope that makes sense
Go to Top of Page

dhanikprakash
Starting Member

2 Posts

Posted - 2011-07-07 : 09:12:45
Thanks Guys,

So if the primary key set on updating table i.e @temptable2, only one row is selected from the @temptable1 for update

If the primary key not set on @temptable2 all the rows from @temptable1 will be selected for update, COALESCE does the job correctly.

Is that correct.





Go to Top of Page

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-07-07 : 09:21:57
I think what you have is ambiguous so it depends on how the optimiser decides to run the query. I don't think you can guarantee the results.
This should give a consistent result.

UPDATE T2 SET
DATA1 = COALESCE (T1.DATA1,T2.DATA1),
DATA2 = COALESCE (T1.DATA2,T2.DATA2),
TM_DATE = COALESCE (T1.TM_DATE,T2.TM_DATE)
FROM @tempTable2 T2
join (select ID, data1 = max(DATA1), data2 = max(DATA2), date = max(date) from @tempTable1 group by ID) T1
ON T2.ID = T1.ID

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

vaari
Starting Member

15 Posts

Posted - 2011-07-07 : 09:27:25
You cannot deduce anything from this behaviour - it could very well behave differently in different scenarios (for e.g. different data distribution, difference in the number of rows etc)

Edit: My response is to the OPs original statement of whether the PK removal would result in the query working properly
Go to Top of Page
   

- Advertisement -