SQL Update behaviour - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

SQL Update behaviour

Author

Topic

dhanikprakash
Starting Member

2 Posts

Posted - 2011-07-07 : 05:59:52

Strange behaviour

declare @tempTable1 table
(
ID INT,
DATA1 VARCHAR(20),
DATA2 VARCHAR(20) ,
TM_DATE datetime
)

insert into @tempTable1
(ID,DATA1) values (1,'test data 1')
insert into @tempTable1
(ID,DATA2) values (1,'test data 2')
insert into @tempTable1
(ID,TM_DATE) values (1,getdate())

select * from @tempTable1

declare @tempTable2 table
(
ID INT ,
DATA1 VARCHAR(20),
DATA2 VARCHAR(20) ,
TM_DATE datetime
)
insert into @tempTable2
(ID) values (1)

select * from @tempTable2

UPDATE T2 SET
DATA1 = COALESCE (T1.DATA1,T2.DATA1),
DATA2 = COALESCE (T1.DATA2,T2.DATA2),
TM_DATE = COALESCE (T1.TM_DATE,T2.TM_DATE)
FROM
@tempTable1 T1
INNER JOIN
@tempTable2 T2
ON T2.ID = T1.ID

select * from @tempTable2

Once the update is completed @temptable2 had the following results

1 test data 1 test data 2 2011-07-07 09:02:25.277

Now set ID as primary key on the @temptable2 the update has the following results

1 test data 1 NULL NULL

Some please explain why this behaviour on update?

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-07-07 : 08:37:39

You're joining on ID. There are three records in table1 that match the join predicate.

So your update statement is ambiguous.

vaari
Starting Member

15 Posts

Posted - 2011-07-07 : 08:54:14

In addition to what Russel mentioned about your update being ambiguous - when you add the primary key (which by default is clustered) you end up doing a clustered index update which results in a sort operator which results in a different set of values being updated as opposed to what was happening without the primary key (or with a non-clustered primary key).

I hope that makes sense

dhanikprakash
Starting Member

2 Posts

Posted - 2011-07-07 : 09:12:45

Thanks Guys,

So if the primary key set on updating table i.e @temptable2, only one row is selected from the @temptable1 for update

If the primary key not set on @temptable2 all the rows from @temptable1 will be selected for update, COALESCE does the job correctly.

Is that correct.

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-07-07 : 09:21:57

I think what you have is ambiguous so it depends on how the optimiser decides to run the query. I don't think you can guarantee the results.
This should give a consistent result.

UPDATE T2 SET
DATA1 = COALESCE (T1.DATA1,T2.DATA1),
DATA2 = COALESCE (T1.DATA2,T2.DATA2),
TM_DATE = COALESCE (T1.TM_DATE,T2.TM_DATE)
FROM @tempTable2 T2
join (select ID, data1 = max(DATA1), data2 = max(DATA2), date = max(date) from @tempTable1 group by ID) T1
ON T2.ID = T1.ID

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

vaari
Starting Member

15 Posts

Posted - 2011-07-07 : 09:27:25

You cannot deduce anything from this behaviour - it could very well behave differently in different scenarios (for e.g. different data distribution, difference in the number of rows etc)

Edit: My response is to the OPs original statement of whether the PK removal would result in the query working properly

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources