VARCHAR field as primary key column

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

Old Forums

CLOSED - General SQL Server

VARCHAR field as primary key column

Author

Topic

kkr12
Starting Member

12 Posts

Posted - 2006-02-22 : 15:28:52

Hi,
I've a table with a primary key column of VARCHAR field. The JBOSS app server is throwing some deadlocks while using this table and so we got a reccomendation from JBOSS to add a new column of INT type and make it the primary key column instead of VARCHAR field as VARCHAR field as primary key in SQLserver is very bad. Is this really true ? Can someone point me to any documentation/articles which says the primary key column in SQLserver should not be VARCHAR field ?

Thanks,
kkr

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2006-02-22 : 15:49:53

Well ints are best for performance, but varchar is fine too. How big of a varchar column are we talking about?

Changing your primary key is probably not going to resolve your deadlocks. To resolve them, you'll need to follow the suggestions in SQL Server Books Online for resolving deadlocks.

Tara Kizer
aka tduggan

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2006-02-22 : 16:02:03

I don't see how a PK on a varchar could be causing deadlocks. Are we actually talking deadlocks, or just blocking that is occuring which is causing delays?

-ec

kkr12
Starting Member

12 Posts

Posted - 2006-02-22 : 16:13:06

The field is of VARCHAR(40). The developers tested it on their schema by adding a int column and making it as primary key. It looks like they are not getting these errors anymore. This is really strange for me as I don't understand what difference it makes the locks to int vs varchar field.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2006-02-22 : 16:28:26

Could you post the error that they were receiving prior to the change? Moving from varchar(40) to int certainly would not have resolved deadlocks.

Tara Kizer
aka tduggan

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2006-02-22 : 16:30:21

quote:

.... as VARCHAR field as primary key in SQLserver is very bad.

Utter bullshit..

DavidM

Intelligent Design is NOT science.

A front-end is something that tries to violate a back-end.

X002548
Not Just a Number

15586 Posts

Posted - 2006-02-22 : 16:51:30

quote:
Originally posted by byrmol

quote:

.... as VARCHAR field as primary key in SQLserver is very bad.

Utter bullshit..

DavidM

Intelligent Design is NOT science.

A front-end is something that tries to violate a back-end.

No, really, tell us how you feel about it...

Who is JBOSS?

Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2006-02-22 : 18:09:23

JBOSS is an open-source application server.

-ec

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2006-02-22 : 18:19:21

kkr,

I was just thinking about this a little more and I was wondering which jdbc driver are you using with JBOSS? Are you using the microsoft one (pure crap btw) or are you using jtds?

-ec

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-02-22 : 20:53:32

The first thing I would want to know about this VARCHAR(40) primary key is what is the source of the key? Is it a natural key for the data you are working with, or is it being generated somehow?

If the application is doing something like storing the "next key" value in a small table, and updating that table before every insert, you could be getting deadlocks with that, especially it that table is being used to hold the "next key" value for multiple tables.

If that is what is happening, then changing to integer IDENTITY columns should fix the deadlock problem.

CODO ERGO SUM

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-02-22 : 22:10:03

I hope we haven't scared kkr12 away, I'm very curious to see where this one ends up ...

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2006-02-22 : 22:20:17

quote:
Originally posted by Michael Valentine Jones

The first thing I would want to know about this VARCHAR(40) primary key is what is the source of the key? Is it a natural key for the data you are working with, or is it being generated somehow?

If the application is doing something like storing the "next key" value in a small table, and updating that table before every insert, you could be getting deadlocks with that, especially it that table is being used to hold the "next key" value for multiple tables.

If that is what is happening, then changing to integer IDENTITY columns should fix the deadlock problem.

CODO ERGO SUM

I had the exact same thought and questions, Michael. But the thing I that stumped me was unless they also change the application code how does the fix work?

Be One with the Optimizer
TG

kkr12
Starting Member

12 Posts

Posted - 2006-02-23 : 09:33:53

Thanks all. Actually I do not know which jdbc driver is beign used and I asked application developers to provide me that info.

The key is a natural one. The table we are having issues is used to store sequence names , start and max values(simulating the sequences in oracle). So the table has below columns, like

sequence_name
minvalue
maxvalue
incrementby
currvalue
etc

The sequence_name is the primary key column. The application is trying to update the current value to currval+1 where sequence_name='XXX'.

Once we update the value, there is a select statement to fetch the row using the sequence_name in where clause. That's what the applicaton code is doing. The most confusing thing here is how the deadlock will be resolved by using another primary key of integer data type . The devlopers added a new column of int data type and changed it to primary key and the deadlocks are gone!

Thanks again,
kkr

Kristen
Test

22859 Posts

Posted - 2006-02-23 : 09:52:25

"update the current value to currval+1 where sequence_name='XXX'"

That's potentially going to lock a large part of the table ...

Are they still doing that, or are they just going with the IDENTITY number that is automatically allocated?

"The key is a natural one"

This could also potentially slow stuff down. IF the keys are inserted "all over the table" (rather than just at one end) AND if the housekeeping on the table is non existent, THEN the structure of the table will become inefficient over time. Some defragging or Reindexing would be called for ... which in turn would need some Updating of Statistics, and probably some Recompiling of procedures ...

"The devlopers added a new column of int data type and changed it to primary key and the deadlocks are gone!"

Dunno. I can imagine that it would change the nature of some things, but as has been said earlier that is only the case if the application has been changed to use the new PK

It might be that a clustered index on a natural key was being hammered previously and that was causing some blocking ... and maybe that escalated to deadlocks ...

... or it might be that it just needed housekeeping, and creating ANY new PK - even one with identical columns to the old one

- would have fixed the problem!! That would be a laugh, wouldn't it?!

Kristen

kkr12
Starting Member

12 Posts

Posted - 2006-02-23 : 11:59:32

I think I missed something when I said changing the PK eliminated the deadlock issue.

When they added the new column and they also changed the code.

The table now looks like this

ID seuenceName currval maxval ....
1 'XXX' 1 1000
2 'YYY' 1 1000

So the code before was(with primary key on sequencename field)

UPDATE TABLE_A SET CURRVAL=CURRVAL+1 WHERE SEQUENCENAME='XXX'

SELECT * FROM TABLE_A WHERE SEQUENCENAME='XXX'

The code after changing the PK to ID column is

SELECT ID FROM TABLE_A WHERE SEQUENCENAME='XXX'

UPDATE TABLE_A SET CURRVAL=CURRVAL+1 WHERE ID = (value from above select)

SELECT * FROM TABLE_A WHERE ID = (vlaue from above selct)

The data into this table is inserted once by a script and no new records will be inserted through application. So ID column is not an IDENTITY column. Hope this clears the problem little bit.

Kristen
Test

22859 Posts

Posted - 2006-02-23 : 14:11:24

So SEQUENCENAME is unique?

I have to presume that SEQUENCENAME was the PK before AND was a clustered index.

If so the only conclusion I can draw was that the maintenance before was rubbish - perhaps that SEQUENCENAME was "random" such that insertion was "all over" the table, whereas the new IDENTITY will only insert at the end of the table.

If SEQUENCENAME was NOT the PK then the WHERE SEQUENCENAME='XXX' stuff would have been blocking loads of rows whilst trying to solve the query ... which might have led to deadlocks ... I suppose ...

Sorry, can't be more specific.

Are you in a position to RESTORE an old backup of the database to a [new] temporary database? If so it would be possible to run some queries and look at the Optimiser's Query Plan - that would help shed some light on it. There is a definitely possibility, IMHO, that creation of a new PK has masked some poor maintenance procedures, and only delayed the renewed onset of poor performance ...

Kristen

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources