Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Upsert proc, locking hints

Author  Topic 

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-10-01 : 18:56:42
Say we have an "UPSERT" proc that is subject to a large number of concurrent requests.
We want to avoid deadlocks and key errors.

Most of the code snippets I have seen for upserts only use updlock.
Possibly with a holdlock as well. (which is really needed imo!).

Why not just put a xlock straight away, since we are going to
either update or insert. Wouldn't that be more efficient?


proc outline:

create proc upsert
@someKey int
,@someVal int
as
set nocount on

begin transaction

/* now we don't want concurrent requests to be able to
A. insert duplicate keys
B. deadlock
We use the following hinsts:
holdlock + [xlock,updlock]
*/

if exists( select * from targetTable with([xlock,updlock],holdlock) -- xlock or updlock ?
where someKey = @someKey)
begin
-- row exists, do the update
update targetTable
set someVal = @someVal
where someKey = @someKey
-- handle error...
end
else
begin
-- row does not exists, do the insert
insert targetTable
(
someKey
,someVal
)
values
(
@someKey
,@someVal
)
-- handle error...
end

commit transaction


We have a proc, somewhat along the above lines, but it still deadlocks itself from time to time.
And I can't seem to understand why

rockmoose

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-10-01 : 23:07:36
I usually avoid the IF EXISTS test, and just try the one first that I expect to be the most common, INSERT or UPDATE. This way, the check is built into the statement, so there should be very little chance of a deadlock.


declare @error int
declare @rowcount int

begin transaction

insert targetTable
(
someKey,
someVal
)
sekect
a.someKey,
a.someVal
from
(
select
someKey = @someKey,
someVal = @someVal
) a
left join
targetTable b
on a.someKey = b.someKey
where
-- Verify row does not exist
b.someKey is null

select @error = @@error, @rowcount = @@rowcount

if @rowcount = 1
begin
commit
return 0
end

update targetTable
set
someVal = @someVal
where
someKey = @someKey and
someVal <> @someVal

select @error = @@error, @rowcount = @@rowcount

commit

return 0



CODO ERGO SUM
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-10-02 : 02:49:43
Thanks, that makes a lot of sense.
In my simple testrig I had to lock the table to avoid PK violations though!
Which does seems strange, at least I think so atm!

...
insert targetTable
(
someKey,
someVal
)
sekect
a.someKey,
a.someVal
from
(
select
someKey = @someKey,
someVal = @someVal
) a
left join
targetTable b with([updlock/xlock],holdlock) -- without this, got PK violations
on a.someKey = b.someKey
where
-- Verify row does not exist
b.someKey is null
...


Above I feel like using the xlock.

EDIT:
Just the HOLDLOCK hint is necessary, forget about UPDLOCK and XLOCK.

rockmoose
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-10-02 : 06:11:28
How is the promary key being assigned?

Ut sounds like the real problem is that the same key is being used for multiple inserts.


CODO ERGO SUM
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-10-02 : 06:25:28
The key is assigned/created in another system.
Then that other system uses the upsert proc like this:

begin tran exec upsert @key = 1, @val = 'a' commit
begin tran exec upsert @key = 1, @val = 'b' commit
begin tran exec upsert @key = 1, @val = 'c' commit

It posts these concurrently. (the same ms in profiler).

rockmoose
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-10-02 : 06:45:57
Is using the same key in the example a typo?

I would want all the upserts done in the same transaction. or better yet, in the same INSERT or UPDATE. Get it done with one DB call.


insert targetTable
(
someKey,
someVal
)
sekect
a.someKey,
a.someVal
from
(
select
someKey = @someKey_1,
someVal = @someVal_1
union all
select
someKey = @someKey_2,
someVal = @someVal_2
union all
select
someKey = @someKey_3,
someVal = @someVal_3
) a
left join
targetTable b
on a.someKey = b.someKey
where
-- Verify row does not exist
b.someKey is null





CODO ERGO SUM
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-10-02 : 07:04:06
It's an automated process,
it posts a bunch of transactions, and doesn't care if it posts the same key several times with the same posting.
There is some date logic in the parameters as well, (like this data was last updated so&so).
The proc has some logic as to just update the data where last_updated is > than the current last_updated date.

I know, it's not optimal, or even very good.
But for the day the processing engine is not going to be rewritten, but the procedure is.

So, my question is solely on the PK violation issue and the HOLDLOCK hint.

Speaking generally, the code you posted seems subject to PK violations if HOLDLOCK is not specified.
And that struck me as strange in the first place.
And in the second place I was wondering what people do in this kind of upsert procs.

What kind of locking hints do you specify, if any?
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 08:19:44
I'm a little bit leary about rolling over from a failed-update to an insert (or vice-versa):

Try UPDATE
IF @@ROWCOUNT = 0 then do INSERT

is perhaps OK but

Try INSERT
IF @@ERROR <> 0 then do UPDATE

strikes me as dangerous because it implies that the operator did not know that a record pre-existing and they may be overwriting some critical data created by the earlier INSERT (or possibly earlier insert and several subsequent UPDATEs!)

We do allow updates where the previous state is not known, but these are rare (there is a flag on the SProc along the lines of @blnOverwriteDontCareWhatWasThereBefore !!) as we signal our UpSert Sprocs:

Create new (fail if exists)
Update from a known VersionNo/TIMESTAMP - fail if VersionNo is different
Update (record must pre-exist, but don't care what version)
UpSert (don't care if exists or not)

Kristen
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-10-02 : 08:55:20
my take on this:
Don't use upsert sprocs.
Use an Update sproc and an Insert sproc.




Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 09:11:39
Spirit: Do you include in that an Sproc that can do either, but takes a parameter saying which it should do?

We like the fact that all the logic (much of the validation is the same etc) to do with UpSert of a table is gathered in a single place ...

Kristen
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-10-02 : 09:17:48
no i don't.
plan caching messes itself up depending on what you do more.
if that isn't a concern then i'm for it.

Validation is DAL's job not sproc's.
what do you consider validation? maybe business logic validation?



Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 09:57:29
Why did I press REFRESH rather than SAVE? Damn

"plan caching messes itself up depending on what you do more"

I've been working on the assumption that SQLServer can't screw up a single record transaction based on the PK - never tested that assertion though!

"what do you consider validation? maybe business logic validation?"

Some of it is supplementing/replacing DAL - certainly nothing like "is this a valid date" - the DAL does that. But I would include stuff where data might be arriving "wonky" from some un-trusted source - but in general that goes through Staging tables and so on, so should arrive at UpSert in clean shape.

Order Save: If VoucherCode present check that it is valid for this customer and for the circumstances of the order.

Web registration: Check email address does not exist on another account.

And we do pre-check some FK stuff - does the WidgetCode exist in the WidgetTable. In the main the DAL takes care of this (e.g. by displaying a SELECT list in the first place!) but there are some more complicated scenarios where the DAL doesn't/can't handle it [with its current abilities at least]. We prefer to return a friendly error where there is likely to be something to catch (rather than a totally unanticipated error, which we treat as terminal) because it is easier [for us] to process a return code than catch & interpret stuff in the ADO.errors() collection or similar.

Kristen
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-10-02 : 10:29:17
well if you have atomic updates and insert then i see no problem.
but if you have a xml insert etc then caching comes in handy.

my personal prefernce is still 2 sprocs.
maybe because i've never had a project that had a lot of business logic in sporcs.



Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-10-02 : 11:57:46
Are the upserts being submitted on different connections? I guess that would explain the PK violations. If so, what is the mechanism to insure that the upserts are applied in the correct order? It's almost like the application is throwing data at the database to see which sticks.

Using HOLDLOCK has the same effect as this, so your application may have low concurrency.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

If this is mainly a data collection process with no need for real-time access to the data, you might consider just inserting the data into a transaction table with an IDENTITY key, and then use a scheduled batch job to process the data into your current table one row at a time in KEY sequence. That should eliminate PK violations, deadlocks, and other concurrency issues. Of course, ther are plenty of possible disadvantages to this approach, especially data-lag and throughput.





CODO ERGO SUM
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 12:24:52
What happened to Microsoft Queue (if I've remembered the name correctly), that might pip you the data in a sensible data with a guarantee of no data loss (if I've remembered what the marketing man told me 10 years ago!)

Kristen
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-10-02 : 12:35:43
forget MS Queue, Service broker is here.



Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 12:40:22
But I'm trying to avoid Service broken ...
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-10-02 : 12:44:01
why? you don't like new technology? what kind of a geek are you?



Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 13:07:47
Dyslexic geek ...
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2006-10-02 : 16:14:14
Yes I've been thinking of alternative approaches along those lines. (some sort of queued processing).
I was watching profiler for a while 1200+ calls during 4 minutes, and many simultaneous ones (same ms).
It comes in bursts, but < 1mil a day so far.

It seems they just throw data, The ratio update:insert must be 90:10.
It might well be that a large % of the updates are "unnecessary". I don't know yet.


The initial problem was with rewriting the proc itself, to better handle the load, and thanks for all the input there.

I will be disussing this further with the devs.


We are looking into service-broker both here&there, but I am not seriously involved yet, we still have the upgrade (2K5) to do.
This particular system that we are communicating with is one candidate.
But just throwing another technology/product into the mix is not something I will allow until we sort some things out.

If they can't handle a gun,
I'm sure as hell not going to let them handle a machine-gun ;)

Next thing I know, I come to work and there are 40GB of unhandled xml messages

Actually I am a bit wary of "black-box" techniques such as SB.
> What's the problem?
> Stuck in a queue!

I agree with Mladen,
one update sproc
one insert sproc
Let the BL at least be savvy enough to know which one to call, how hard can it be!

And also with Kristen
We have lots of business logic in the procs.

rockmoose
Go to Top of Page
   

- Advertisement -