SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 Site Related Forums
 Article Discussion
 Article: Efficiently Reuse Gaps in an Identity Column
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

AskSQLTeam
Ask SQLTeam Question

USA
0 Posts

Posted - 02/09/2010 :  08:50:18  Show Profile  Visit AskSQLTeam's Homepage  Reply with Quote
This article will demonstrate an efficient way to reuse gaps in an identity column. Please note that this is something you normally shouldn't be bothered about in a well-designed database or application. However, there are circumstances where you are forced to do this.

Read Efficiently Reuse Gaps in an Identity Column

mjda
Starting Member

6 Posts

Posted - 02/09/2010 :  11:43:58  Show Profile  Reply with Quote
Nice article. I usually avoid running out of identity range by doing something like:
CREATE TABLE dbo.Yaks(YakID bigint identity(-9223372036854775808,1),YakName char(20))
:)
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30281 Posts

Posted - 02/09/2010 :  17:28:04  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Yes, and using 8 bytes of storage instead of 4.
For 200 million records, that is a saving of 763MB in data storage, and another 763mb index storage.
763 mb storage space is 100 000 less pages to store, seek and scan.




N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

USA
7020 Posts

Posted - 02/09/2010 :  19:48:50  Show Profile  Reply with Quote
quote:
Originally posted by mjda

Nice article. I usually avoid running out of identity range by doing something like:
CREATE TABLE dbo.Yaks(YakID bigint identity(-9223372036854775808,1),YakName char(20))
:)



Is starting a bigint identity with a negative number seed really necessary in an practical application?

Using just the positive numbers, a table with only a bigint identify column and no other columns would use 67,108,864 TB of storage before you ran out of numbers. And it would take over 290 years to load that table at a rate of 1 billion rows per second.








CODO ERGO SUM
Go to Top of Page

ya3mro
Starting Member

37 Posts

Posted - 05/18/2010 :  02:06:59  Show Profile  Visit ya3mro's Homepage  Click to see ya3mro's MSN Messenger address  Send ya3mro a Yahoo! Message  Reply with Quote
very excellent solution to such a usual problem

How I Came To Islam?
http://english.islamway.com
Go to Top of Page

Celko
Starting Member

USA
23 Posts

Posted - 07/29/2010 :  12:16:37  Show Profile  Visit Celko's Homepage  Reply with Quote
Since I don't use IDENTITY or other metadata in my DDL, this is not a problem for me javascript:insertsmilie('') .

However, when we wanted to find gaps in a sequence for re-use, the old trick was:

CREATE VIEW Vacancies(vacant_seq)
AS
SELECT vacant_seq
FROM (SELECT seq +1
FROM Foobar
WHERE seq +1
NOT IN (SELECT seq FROM Foobar)
AND seq+1 <= <<upper limit of seq>>
UNION
SELECT seq -1
FROM Foobar
WHERE seq -1
NOT IN (SELECT seq FROM Foobar)
AND seq-1 >= 1;

It does not get all of the open slots, but it was good enough when you needed a quick answer for a few openings in a relatively small table. The other trick was

SELECT (<<upper limit of seq>> - COUNT(seq)) AS vacancy_cnt
FROM Foobar;

Today, we would probably use ROW_NUMBER() to update the seq and put all of the vacancies on the high end of the table:

UPDATE Foobar
SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC);


Author of
SQL for Smarties
Thinking in Sets
SQL Puzzles and Answers
Analytics and OLAP in SQL
SQL Programming Style
Standards in SQL
Data & Databases
Trees & Hierarchies in SQL
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 07/29/2010 :  12:30:35  Show Profile  Reply with Quote
WELCOME JOE!!!!!!!!!!

Have a beer on me...Love your books



Stop by the Corracl and say hi!

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=148000



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam



Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30281 Posts

Posted - 07/30/2010 :  00:54:30  Show Profile  Visit SwePeso's Homepage  Reply with Quote
The idea is to update ONLY the records who needs to be updated.
Why update all 100 million records when there are only 10 or 20 records to update?



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Celko
Starting Member

USA
23 Posts

Posted - 07/30/2010 :  18:04:34  Show Profile  Visit Celko's Homepage  Reply with Quote
quote:
Originally posted by Peso

The idea is to update ONLY the records who needs to be updated.
Why update all 100 million records [sic: rows] when there are only 10 or 20 records [sic: rows] to update?


Because in the old days, we never had big tables

I am not sure how SQL server will treat this, but you could also write Standard SQL to save some effort:

UPDATE Foobar
SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC)
WHERE seq <> ROW_NUMBER() OVER (ORDER BY seq ASC);


Author of
SQL for Smarties
Thinking in Sets
SQL Puzzles and Answers
Analytics and OLAP in SQL
SQL Programming Style
Standards in SQL
Data & Databases
Trees & Hierarchies in SQL
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30281 Posts

Posted - 07/30/2010 :  18:44:26  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Create a table with 9 rows, numbered 1 to 9. Now remove row 2.

Seq    ROW_NUMBER
1      1
3      2
4      3
5      4
6      5
7      6
8      7
9      8
It still means you are updating 89% of the table.
With my suggestion you only update 1 record (Seq 9 becomes Seq 2).


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Jeff Moden
Aged Yak Warrior

USA
649 Posts

Posted - 11/26/2010 :  22:24:32  Show Profile  Reply with Quote
Peter. This isn't an attack but I have to ask, why would you really want to reuse IDENTITY values or their equivalent? Why even bother having such a column if it means so little that you can reuse and reassign the numbers? Like I said, not an attack... just an insane amount of curiosity.

--Jeff Moden
"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! "
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

Go to Top of Page

Jeff Moden
Aged Yak Warrior

USA
649 Posts

Posted - 11/26/2010 :  22:31:35  Show Profile  Reply with Quote
quote:
Originally posted by Celko

quote:
Originally posted by Peso

The idea is to update ONLY the records who needs to be updated.
Why update all 100 million records [sic: rows] when there are only 10 or 20 records [sic: rows] to update?


Because in the old days, we never had big tables

I am not sure how SQL server will treat this, but you could also write Standard SQL to save some effort:

UPDATE Foobar
SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC)
WHERE seq <> ROW_NUMBER() OVER (ORDER BY seq ASC);



Now that looks like fun! I've gotta try that. I agree with Peter that will update many rows unncessarily if the goal is only reuse of numbers but your code will also guarantee no gaps.

I also have to ask you, especially since your so anti-mag-tape-technology... Why do this? Isn't the reuse of such identifier columns against what an RDBMS is supposed to be all about?

--Jeff Moden
"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! "
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30281 Posts

Posted - 11/27/2010 :  16:20:54  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Yes it is, but the question was brought up from another MVP which had a table for which they were quickly running out of numbers.
Changing to BIGINT was not an option. Neither was changing the seed to -2147483648 due to the clustered index on the identity column.



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Jeff Moden
Aged Yak Warrior

USA
649 Posts

Posted - 11/27/2010 :  18:29:49  Show Profile  Reply with Quote
In that case, why move data at all? Why not just look for the gaps an insert new rows there. That can also be done quite efficiently and you can use the Tally table to quickly expand the endpoints of gaps to a range of insert-values.

--Jeff Moden
"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! "
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.09 seconds. Powered By: Snitz Forums 2000