Most efficient way of creating a Numbers table? - SQL Server Forums

Home | Weblogs | Forums

Site Sponsored By: SQLDSC - SQL Server Desired State Configuration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Most efficient way of creating a Numbers table?

Author

Topic

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2007-07-17 : 09:04:16

So, I'm playing around with optimizing some code, and in it, 2 table variable 'numbers' table are created by:
WHILE @I < @MAX INSERT into SET@I=@I+1 logic.

I figure there must be a more efficient mechanism. So far, I've got a semi-set based iterative - inserts increasing sets, thus decreasing duration considerably. BUT, is there a better way? Could CTE be better? Is there another way?

The following 2 example, on my laptop - the semi-set base GenTable1 took approx 3 seconds, while the iterative inserts took approx 2m08s:


create proc GenTable1 @max int
as 
begin
	declare @Min int
	set @Min = 1
	set @Max = 8000
	DECLARE @T TABLE (Number int NOT NULL PRIMARY KEY)
	INSERT @T VALUES (@Min) -- Seed the table with the min value
	WHILE @@ROWCOUNT > 0	--Loop until all the rows are created, inserting ever more records for each iteration (1, 2, 4, etc)
		BEGIN
		INSERT @T 
	--	  --Get the next values by adding the current max - start value + 1 to each existing number
	--	  --need to calculate increment value first to avoid arithmetic overflow near limits of int
		  SELECT t.Number + (x.MaxNumber - @Min + 1)
		  FROM	@T t 
				CROSS JOIN (SELECT MaxNumber = MAX(Number) FROM @T) x --Current max
		  WHERE
			--Do not exceed the Max - shift the increment to the right side to take advantage of index
			t.Number <= @Max - (x.MaxNumber - @Min + 1)
		END
	select count(*) from @T
end
go

create proc GenTable2 @max int
as
begin
	DECLARE @T2 TABLE (Number int NOT NULL PRIMARY KEY)

	declare @I int
	set @I = 1
	WHILE @I <= 8000 begin
		insert into @T2 (Number) values (@I)
		set @i = @I+1
	END
	select count(*) from @T2
end
go

exec GenTable1 8000
exec GenTable2 8000

Anyone able to improve on that? I'm keen to see a neater, more performant answer. I think the semi-set based starts winning from a size of about 7 or so...

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2007-07-17 : 09:06:24

How about this http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685 ?

KH
[spoiler]Time is always against us[/spoiler]

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2007-07-17 : 09:08:30

Thanks KH - didn't find that when searching - possibley because I got 13 pages of results on 'NUMBER TABLE' :)

I'll throw it into the mix and see how it handles

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2007-07-17 : 09:18:04

Difficult to prove for definitely atm, but I am sure that Michaels is better. generating query plans is problematic for comparison, due to the number of plans generated by iteratives - Michaels shows 99% of all execution, but runs in <1 second, whereas other version takes <2 seconds.

I will need to consider the support side - Michaels is significantly more complex, but I think the end answer will (hopefully) be the best performing solution.

anyone got anything better?

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

nr
SQLTeam MVY

12543 Posts

Posted - 2007-07-17 : 10:47:17

for simplicity

with n(i)
as
(select i = 0
union all
select i = i + 1 from n where i < 32000
)
....
option (maxrecursion 0)

Should be faster than the loop.

or try

with n(rc,i)
as
(
select rc = 1, i = 0
union all
select rc = 1, i = i + 1 from n where rc = 1 and i < 32000
union allselect rc = 2, i = i + 32001 from n where rc = 1 and i < 32000
)

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2007-07-18 : 04:47:07

NR, thanks.

I've modifed your code slightly, so align with the versions I have.

I may chuck all 4 options (I'm throwing away the iterative, I think), into a testing harnass, then test N executions to determine the average. Gut feel is they might be slowers than Michaels function, but slightly faster than my initial version.

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

pootle_flump

1064 Posts

Posted - 2007-07-18 : 05:31:21

Well - I use this (cartesian product of spt_values). http://sqljunkies.com/WebLog/amachanic/articles/NumbersTable.aspx

How come you are bothered about the efficiency if you are creating a static table?

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2007-07-18 : 06:30:47

Thanks for the link pootle.

(1)I am interested because I have an unhealthy obsession with performance, and
(2)because I've yet to be able to 'sell' the static numbers table idea yet, and the code I am looking at is creating 2 'numbers tables' in table variables, so if I can't win the fight on a static table, I want the table variable created version to be as performant as possible.

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

pootle_flump

1064 Posts

Posted - 2007-07-18 : 07:15:11

Ah - no it won't touch Michael's for efficiency.

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-07-18 : 11:36:37

quote:
Originally posted by Wanderer

Difficult to prove for definitely atm, but I am sure that Michaels is better. generating query plans is problematic for comparison, due to the number of plans generated by iteratives - Michaels shows 99% of all execution, but runs in <1 second, whereas other version takes <2 seconds.

I will need to consider the support side - Michaels is significantly more complex, but I think the end answer will (hopefully) be the best performing solution.

anyone got anything better?

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

I am not sure why you would consider support an issue for the F_TABLE_NUMBER_RANGE function. It may look a little complex, but in essence, it's just a simple cross join.

I did a significant amount of testing with it before posting it here. It has been working fine for years where I work, and I have never seen an issue reported with it, either at work or on this site.

I did quite a bit of work on tuning and testing it for maximum performance, so if you find something faster, please post it.

I ran testing to compare it to a static table, and found it be slightly slower than a static table, but not by that much, and in certain situations it was faster. I decided not to implement a static table, because the function is much more convenient, and I often use it for large tests with a million rows or more. Since the function can be used to generate up to 16,777,216 rows, it is handier for this type of work.

CODO ERGO SUM

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2007-07-19 : 08:45:16

Hi Michael,

thanks for the response. I agree that the query may look more complicated than it is. I guess I rushed my previous response, based on the reasoning that, as a consultant, I don't want to be guilty of implementing something that the client is later finds it difficult (or is unable) to maintain - I need to balance performance against maintainability, and I've been guilty, in the past, of being too biased to performance, and not enough on maintainability.

Finally, I wanted to try and get the fastest way possible, simply because that is they way I learn, so thatnks again for the feedback.

*##* *##* *##* *##*
Chaos, Disorder and Panic ... my work is done here!

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources