Clustered Primay Keys & Identity

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

Clustered Primay Keys & Identity

Author

Topic

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-14 : 05:07:02

Sorry - english version in my next post :-).

Hi,

ich bin gerade dabei eine Datenbank für eine neue Anwendung aufzusetzen. Das grundsätzliche Design steht soweit auch schon. Ich hänge allerdings ein wenig bei der Erzeugung neuer ID's für meine Tabellen. In vielen dieser Tabellen verwende ich clustered pk's um verschiedene Versionen eines Datensatzes speichern zu können. Eine solche Tabelle sieht z. B. so aus:


CREATE TABLE [dbo].[TESTS](
	[ID] [int] NOT NULL ,
	[Version] [int] NOT NULL,
	[Name] [nvarchar](max) NULL,
	[PlanedDays] [int] NULL,
	[StartDate] [datetime] NULL,
	[EndDate] [datetime] NULL,
	[ReportDate] [datetime] NULL,
	[Status] [int] NULL,
	[Active] [bit] NOT NULL,
	[ChangeDate] [datetime] NOT NULL,
	[ChangeID] [int] NOT NULL,
 CONSTRAINT [PK_TESTS] PRIMARY KEY CLUSTERED 
(
	[ID] ASC,
	[Version] ASC
)

ID und Version bilden also zusammen den Primärschlüssel. Jetzt stellt sich für mich die Frage, wie ich hier am besten (bei insert und updates (an sich gibt es ja keine wirklichen updates da nur die version eins hoch gezählt wird...) an neue IDs komme?

Die Wege die mir eingefallen sind wären folgende:
1. ID auf Identity stellen und eine function erstellen, die für einen Datensatz die höchste bereits verwendete Versionsnummer zurück gibt und diese um eins nach oben zählt.

2. Erstellen einer extra Tabelle "Sequenzer" und darin die (oracle-) sequenz Daten nachbilden. Abfragen und erhöhen dieser Sequenz mit einer function und eine weitere function wieder für die Versionsnummer.

Sind das meine einzigen Möglichkeiten oder gibt es noch einen besseren (goldenen) Weg? Welcher Weg wäre ggf. der bessere / performantere?

Danke für eure Hilfe!

:: www.Security-Blog.eu
:: www.MCSEboard.de

Sachin.Nand

2937 Posts

Posted - 2010-09-14 : 05:09:19

Below is the english translation

I'm just a database for a new application set up. The basic design is so far already. I hang quite a bit in the production of new ID's for my tables. In many of these tables store I use clustered pk's to different versions of a record can be. Such a table is, for example, like this:

CREATE TABLE [dbo]. [TESTING] (
[ID] [int] NOT NULL,
[Version] [int] NOT NULL,
[Name] [nvarchar] (max) NULL,
[PlanedDays] [int] NULL,
[StartDate] [datetime] NULL,
[EndDate] [datetime] NULL,
[ReportDate] [datetime] NULL,
[Status] [int] NULL,
[Active] [bit] NOT NULL,
[ChangeDate] [datetime] NOT NULL,
[ChangeID] [int] NOT NULL,
CONSTRAINT [PK_TESTS] PRIMARY KEY CLUSTERED
(
[ID] ASC,
[Version] ASC
)

ID and version together form the primary key. Now the question arises for me, as I here the best (for insert and updates (in itself, there is indeed no real updates will be counted as only one version up ...) to get new IDs?

The ways that I would have invaded the following:
1st ID set to identity and create a function that gives a record for the highest version number used back already and this is one to one up.

2nd Create an extra table "sequencer" and to the (oracle) sequence data replicate. Queries increase, and this sequence with a function and another function again for the version number.

Are those my only options or is there a better (golden) way? Which way is possibly the better / higher performance?

Thanks for your help!

Limitations live only in our minds. But if we use our imaginations, our possibilities become limitless.

PBUH

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-14 : 05:11:19

Oh haha just recocnised that i wrote the question in german :-) I am going to translate it. Sorry. :D

:: www.Security-Blog.eu
:: www.MCSEboard.de

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-14 : 05:23:34

Hi,

sorry for the german version above.

I am currently developing a new application which uses a MS SQL 2008 R2 database as backend. The general design of the database is ready to use but I am struggling to generate the new ID's for my tables.

In the vast majority of my tables I am using clustered primary keys in order to be able to save all versions of a dataset.


CREATE TABLE [dbo].[TESTS](
	[ID] [int] NOT NULL ,
	[Version] [int] NOT NULL,
	[Name] [nvarchar](max) NULL,
	[PlanedDays] [int] NULL,
	[StartDate] [datetime] NULL,
	[EndDate] [datetime] NULL,
	[ReportDate] [datetime] NULL,
	[Status] [int] NULL,
	[Active] [bit] NOT NULL,
	[ChangeDate] [datetime] NOT NULL,
	[ChangeID] [int] NOT NULL,
 CONSTRAINT [PK_TESTS] PRIMARY KEY CLUSTERED 
(
	[ID] ASC,
	[Version] ASC
)

There are two potential solutions comming to my mind:

1. Make the ID field an IDENTIY field and write a function which queries the highest used Version number for this ID and add 1.

2. Simulate a sequenz by creating a new table which holds the latest used ID und query through it for every new element. Use the above mentioned function in order to get the new version number.

Is there a better way to perform this task and if not which of the mentioned ways would be the best?

Thanks for your help!

:: www.Security-Blog.eu
:: www.MCSEboard.de

Kristen
Test

22859 Posts

Posted - 2010-09-14 : 05:48:32

"Below is the english translation

I'm just a database for a new application set up."

It is? You are a Google Translator and I claim my free Lorem Ipsum pen!

Kristen
Test

22859 Posts

Posted - 2010-09-14 : 06:00:37

I don't know if its a good idea, but you could have a Trigger that changes the Version from some "special number" (not a much loved concept, I'll be the first to agree) to the next available number. e.g. VERSION=-1 could cause the Trigger to change that to MAX(VERSION)+1 WHERE TESTS.ID = INSERTED.ID

-1 could be the Default.

(using 9999999 instead of -1 might be better in that sense that the revised value will be in the same place in the Clustered Index - actually, maybe -1 will be too?)

You could use an Instead Of Trigger to avoid saving the record and then changing its Clustered Index value, but I'm not a fan of Instead Of Triggers - need maintenance whenever a column is added etc. and have a variety of annoying side effects.

A function that gets the next value is prone to a race-condition between two users, so brings it own set of problems.

We have used next-available-number tables in order to have a application-provided number instead of using the system IDENTITY property (e.g. where two tables must share from the same ascending ID pool so that no record in either table has duplicate ID). They have the additional benefit that you can ask for a "range of numbers" ahead of time - "I have ten records to insert, given me next available number and reserve me 10 numbers" - but I don't think there are any benefits like that in your situation.

Insert using a Stored Procedure could do the work I indicated for the Trigger above - but then you must be sure that 100% of inserts go through the trigger (which is unrealistic if you have bulk imports and and other, potential even, Admin Data Manipulation Stuff)


INSERT INTO TEST(ID, Version, Name, ...)
SELECT @ID,
       [Version] = COALESCE((SELECT MAX(Version) FROM TEST WHERE ID = @ID), 0) + 1,
       @Name,
       ...

Sachin.Nand

2937 Posts

Posted - 2010-09-14 : 06:03:36

quote:
Originally posted by Kristen

"Below is the english translation

I'm just a database for a new application set up."

It is? You are a Google Translator and I claim my free Lorem Ipsum pen!

Limitations live only in our minds. But if we use our imaginations, our possibilities become limitless.

PBUH

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-14 : 06:26:57

Hi Kirsten!

Thanks for your reply. The entiry application modell restricts the database access to stored procedures (no direct access to tables & views at all) (security reasons - we are talking about a client application).

So you basically suggesting to not use IDENTITY and get the nextID using a select statement than?


declare @ID int

set @ID = (Select MAX(ID)+1 from dbo.tests)

INSERT INTO [test01].[dbo].[TESTS]
           ([ID]
           ,[Version]
           ,[Name]
           ,[PlanedDays]
           ,[StartDate]
           ,[EndDate]
           ,[ReportDate]
           ,[Status]
           ,[Active]
           ,[ChangeDate]
           ,[ChangeID])
     VALUES
           (@ID
           ,1
           ,'test'
           ,20
           ,GETDATE()
           ,GETDATE()
           ,GETDATE()
           ,1
           ,1
           ,GETDATE()
           ,1)
GO

... and in case an update is required do the same for the Version (ID is than given...).

Whats the reason for not using IDENTITY and is the way above not danger in terms of concurent access to a table?

:: www.Security-Blog.eu
:: www.MCSEboard.de

Kristen
Test

22859 Posts

Posted - 2010-09-14 : 07:17:20

Your way is at risk from current race conditions - you might get the same ID as has already been used.

If you use IDENTITY you cannot (easily) have two records the same ID but with different VERSION numbers

But, yes, you do need to come up with a way to allocate the next-available-ID number, in addition to the next-available-Version-number for a pre-existing ID number

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-14 : 07:34:00

Hi,

a friend just suggested to use uniqueidentifiers + Version - what do you think about that?

:: www.Security-Blog.eu
:: www.MCSEboard.de

Kristen
Test

22859 Posts

Posted - 2010-09-14 : 08:13:12

Yes, you could do that. GUIDs are quite a bit larger, so impact on index density. Best to use a ~~sequence~~ sequential GUID (otherwise you have other issues with performance issues with split pages). If you are allocating the GUIDs outside SQL (i.e. in Application) then make sure you have a Unique index on that column - should be OK if generating within SQL, but even so best to assume that the GUID algorithm may produce a duplicate as its theoretically possible (although probably not with sequential GUIDs)

Dunno what happens when you move to another server - I expect the possibility of duplicates re-arises - albeit the odds are vanishingly small.

edit: brain-fade!

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2010-09-14 : 09:54:34

You'll probably want to use sequential GUID's as well. Otherwise every insert would mean shuffling the primary key.

Use NEWSEQUENTIALGUID() rather than NEWID()

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-16 : 06:53:28

Thanks for your help. I have changed my db-design in order to use uniqueidentifiers. That makes my life much easier.

:: www.Security-Blog.eu
:: www.MCSEboard.de

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2010-09-16 : 06:56:15

Did you use sequential unique identifiers?

If not you'll probably get insert performance problems at some point.

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION

nachtschicht
Starting Member

8 Posts

Posted - 2010-09-16 : 07:01:34

The application might need to be offline capable (in the future). Therefore I am going to use newid() just to be prepared...

:: www.Security-Blog.eu
:: www.MCSEboard.de

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources