SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 bitmap column
 New Topic  Reply to Topic
 Printer Friendly
Previous Page
Author Previous Topic Topic Next Topic
Page: of 2

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 01/05/2006 :  12:05:27  Show Profile  Reply with Quote
...also, bitwise operations are pretty fast, and I'd bet the performance of this statement:

SELECT	*
FROM	MyTable
WHERE	(MyBitMask & 57) = 57

...would compare favorably with the performance of a logical WHERE clause such as this:

SELECT	*
FROM	MyTable
WHERE	Rainy = 1
	AND Cloudy = 1
	AND Dark = 1
	AND Windy = 1
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

USA
7020 Posts

Posted - 01/05/2006 :  14:36:43  Show Profile  Reply with Quote
We can trade posts about bitmap indexes and so on all day, so I did some testing. Feel free to do you own.

Here is the script I used. I got clustered index scans on the two lookup queries at the end, so it was not using the index.


create  table NUM
(
N	int not null primary key clustered,
RND	int not null
)
go
/*
Load table NUM with a list of random integers
Takes a while to run because of the conversion
*/
insert into NUM
select
	N = number,
	-- Use newid to generate a random integer
	RND = abs(convert(int,convert(varbinary(40),newid())))
from
	-- Function available from Script Library
	dbo.F_TABLE_NUMBER_RANGE(1,100000)
go
create table TEST_BITMAP
(
N	int		not null
	primary key clustered,
BMP	tinyint		not null,
COL1	varchar(30)	not null,
COL2	varchar(30)	not null,
COL3	varchar(500)	not null,
)
go
-- Load data into table
insert into TEST_BITMAP
select
	N,
	-- BMP will contain a value in the range of 0 to 32
	BMP	= RND%33,
	COL1	= convert(varchar(30),RND),
	COL2	= reverse(convert(varchar(30),RND)),
	COL3	= replicate('123456789',50)
from
	NUM
order by
	N
go
-- Create index on bitmap column
create index IX_TEST_BITMAP_BMP ON TEST_BITMAP (BMP)
go
/*
Test with direct lookup on a single value
I got a clustered index scan
*/
select
	N,BMP,COL1,COL2, COL3
from
	TEST_BITMAP
where
	BMP IN (24)
go
/*
Test with bitmap 
I got a clustered index scan
*/
select
	N,BMP,COL1,COL2, COL3
from
	TEST_BITMAP
where
	(BMP & 24) = 24
go
drop table NUM
go
drop table TEST_BITMAP
go






CODO ERGO SUM
Go to Top of Page

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 01/05/2006 :  15:13:51  Show Profile  Reply with Quote
Of course, I get the same results using your script as written.

But a few points...
First, you stated that you doubted that SQL Server could effectively use the index. I consider a scan to be an effective use of an index. Not as effective as a seek, but definitely more effective than ignoring the index altogether, which is what the optimizer would likely do with an indexed bit column. As your code demonstrates, the optimizer does use the index to increase performance.

Second, you need to play around with the parameters a bit more if you are going to go to the trouble of creating all that sample code. Yeah, I get index scans when 100,000 records are populated with 32 possible values. But 32 values in a bitmask will only represent four distinct attributes. Add just one more attribute to your test (the poster listed five, anyway) and you bump your possible bitmask values up to 64.

And guess what? Then you get an index seek. Voila'!
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

USA
7020 Posts

Posted - 01/05/2006 :  18:18:01  Show Profile  Reply with Quote
quote:
Originally posted by blindman

Of course, I get the same results using your script as written.

But a few points...
First, you stated that you doubted that SQL Server could effectively use the index. I consider a scan to be an effective use of an index. Not as effective as a seek, but definitely more effective than ignoring the index altogether, which is what the optimizer would likely do with an indexed bit column. As your code demonstrates, the optimizer does use the index to increase performance.

Second, you need to play around with the parameters a bit more if you are going to go to the trouble of creating all that sample code. Yeah, I get index scans when 100,000 records are populated with 32 possible values. But 32 values in a bitmask will only represent four distinct attributes. Add just one more attribute to your test (the poster listed five, anyway) and you bump your possible bitmask values up to 64.

And guess what? Then you get an index seek. Voila'!



I'm not sure where you got 64, but 2 to the 5th power is 32 (2*2*2*2*2 = 32), and that's why I set the test up that way. I never said that an index with higher cardinality would not do an index seek.

The index scan was a clustered index scan, which is the same as a table scan on a table with a clustered index. It was not a scan on the index that I created in the script, so the index is not used.

I did not try to rig the test to keep it from using the index. In fact, I set it up so that the table was wide in order to make it more likely to use the index by reducing the rows per page to only 16.

To be fair, I re-tested with higher cardinality, and it will use an index for a index lookup in that case with the first query. I also retested with a smaller row size, about 130 bytes. In that case it would not use the index until I raised cardinality to 256. When I reduced it to about 60 bytes per row, I had to raise the cardinality to 512 to get it to use the index.

I could never get it to use the index for the lookup when using the second query with the bitmask, even when I made the cardinality so high that all rows were unique, so it doesn't look like it will ever use the index in that case.




CODO ERGO SUM
Go to Top of Page

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 01/05/2006 :  23:53:33  Show Profile  Reply with Quote
Thanks for doing more testing. Interesting results with changing the rowsize. Not surprised that the second query always scans.

By the way, 2^5 is 32, as you said.
2^4 is 16
2^3 is 8
2^2 is 4
and of course, 2^1 is 2
The bitmask must be able to store ALL combinations of these values. To store all the bits as true results in a bitmask of 2+4+8+16+32=64. And that's why you need to use 64 random numbers in your test to cover five attributes.
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

USA
7020 Posts

Posted - 01/06/2006 :  01:04:33  Show Profile  Reply with Quote
quote:
Originally posted by blindman

Thanks for doing more testing. Interesting results with changing the rowsize. Not surprised that the second query always scans.

By the way, 2^5 is 32, as you said.
2^4 is 16
2^3 is 8
2^2 is 4
and of course, 2^1 is 2
The bitmask must be able to store ALL combinations of these values. To store all the bits as true results in a bitmask of 2+4+8+16+32=64. And that's why you need to use 64 random numbers in your test to cover five attributes.


Actually, 2+4+8+16+32 = 62

In any case, with 5 bits, you can have only 32 possible values, 0 to 31.
1+2+4+8+16=31

Binary 00000 = Decimal 0
Binary 00001 = Decimal 1
Binary 00010 = Decimal 2
Binary 00100 = Decimal 4
Binary 01000 = Decimal 8
Binary 10000 = Decimal 16
Binary 11111 = Decimal 31

With the Windows Calculator: Enter 31, click the Bin radio button, and it will display 11111



"There are only 10 kinds of people in the world, those that understand binary and those that don't."








CODO ERGO SUM
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 01/06/2006 :  04:35:37  Show Profile  Reply with Quote
"You can't use aggregate functions on bit datatypes, so maybe that was what you were thinking about"

That was it - TaDa!.

"No argument on the "bits 4 and 5 and all others don't care" case. I think a scan would be unavoidable"

Just thinking on that a bit more (I expect I was being dense!!) two BIT columns are going to do a TABLE SCAN too ... so no worse off, and in the case of "bits 4 and 5 set to True and ALL OTHER BITS ZERO" an index may well be used, and thus perform better than BIT columns which are only ever going to SCAN.

MVJ: Very useful test script, ta.

Kristen
Go to Top of Page
Page: of 2 Previous Topic Topic Next Topic  
Previous Page
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.12 seconds. Powered By: Snitz Forums 2000