Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-09-27 : 11:01:04
|
| I have checkboxes in an application. They are used in indexes. I started adding them as bit-type columns, but they aren't available in indexes. Should I use tinyint instead? |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2005-09-27 : 11:24:29
|
don't put indexes on. bit types and only 1,0 int aren't selective enough to use indexes effciently.Go with the flow & have fun! Else fight the flow |
 |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-09-27 : 11:58:18
|
| Here's an example: I have checkboxes Active, Open, and Urgent. In the DB I use now, I have an index on:Open descendingActive descendingUrgent descendingDate/Time descendingI want to get the most recent active, open, and urgent records. I use the index both for filtering and sorting. How would you structure this? From what I can see, I'd do best to have these checkbox values stored as tinyints. |
 |
|
|
DustinMichaels
Constraint Violating Yak Guru
464 Posts |
Posted - 2005-09-27 : 12:13:51
|
| Are you trying to create a covering index?If you you will have to create it in query anaylzer.CREATE INDEX <SomeIndexName> ON <SomeTable> (<SomeColumn>, <SomeColumn>, ...) |
 |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-09-27 : 12:37:21
|
quote: Originally posted by DustinMichaels Are you trying to create a covering index?
No, I need to get more columns than are in the index. I gave that example because using bit data type won't allow me to create that type of index at all. I am trying to figure out whether I should use the tinyint type instead. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-09-27 : 13:18:01
|
| The reason "bit" cannot be indexed is because it only contains 1 or 0.There is no point having an index for data which is not reasonable selective. An index that has 100% selectivity means that every entry is unique. That makes an index really useful. If an index has 50% = 1 and 50% = 0 it has very low selectivity, and has no real use - it is faster for SQL Server to work from the actual records.Now, if you have an index with SomeBitColumn AND SomeDataColumn then you can do:SELECT SomeDataColumnFROM MyTableWHERE SomeBitColumn = 1and maybe SQL can use the index because it can get the SomeDataColumn from the index instead of having to do to the actual record data. That's called a covering index.(And in the examples I've shown you would have to use a different datatype, as you cannot use BIT in an index, but the fact remains if 50% of your index data is 1 and 50% = 0 and there are no other values for that column then the index will be worthless - unless I'm missing something profound!)Kristen |
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2005-09-27 : 13:25:56
|
don't think he read my post...Go with the flow & have fun! Else fight the flow |
 |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-09-27 : 13:44:11
|
quote: Originally posted by Kristen The reason "bit" cannot be indexed is because it only contains 1 or 0.
My table has several 100 thousand records, but there may be only 10,000 active records. Indexing the Active checkbox will help zoom in to those 10,000 records, won't it? Of course, if I look for inactive records, this index will not help.Taking what you said about selectivity, if each checkbox reduced the data to just 50%, having 3 of them in an index would reduce the data to 12.5% on average. That's significant savings in a large table.I have no experience with SQL Server and perhaps it works differently in this respect. I'll have to play with queries after I populate the data. Thanks for everybody's prompt responses. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-09-27 : 15:02:43
|
Well ... I'll have a go at explaining WHY indexes like this are not usually helpful in performance terms:"Indexing the Active checkbox will help zoom in to those 10,000 records, won't it"No.An index only works when the optimiser decides that it is more efficient than looking through the "raw" records.Consider that the optimiser has a choice:1) Look up the value in the index - estimate how many rows that represents, say 50% of the database. OK. So that's "Do an index search for the first entry in the index" - lets give that cost 50 units. Now you've got the first entry, use the index entry to get the physical record - call that cost 5 units (you are bouncing around the data records in no sequential order, and its several disk accesses to get a single row)Now we've got the record. Perform NextIndexEntry - that's pretty quick - call that cost 2 units. Now use the index entry to get the physical record - the cost of that is 5 units, as before.Estimate there are 10,000 entries:So the first record is cost 50 + 5Each next records is cost 2 + 5 - there are 9,999 of them, that's a total "cost" of 59,994So Grand Total for using the index of 59,994 + 55 = 60,0492) Now lets estimate the cost based on just traversing the records in the database:There are 20,000 rows in the database. Getting the first row is cost = 5 units. Traversing to the next row in the database is cost = 1 unit - its right there in the same Disk Page in memory - when we need to move to another physical page we will do another physical disk read, but its sequential so pretty quick (SQL Server will have forced a disk read before we ACTUALLY need the data, so it will be in memory), so the average still works out around 1 unit.First row cost = 5 units.Traversing 19,999 rows = 19,999 x 1 units = 19,999 units.Total cost of accessing the data direct = 20,004 units.That's 1/3rd of the "cost" of using the index - so the index doesn't get used."Taking what you said about selectivity, if each checkbox reduced the data to just 50%, having 3 of them in an index would reduce the data to 12.5% on average"There are only 2 values a BIT can store - 1 or 0. That's an average of 50% for each of those values.If you combine three BIT columns in one database that is 1/2 * 1/2 * 1/2 = 12.5% - still not great.So we now have an composite index made up of BitColumn1 + BitColumn2 + BitColumn3.If we make a query:SELECT *FROM MyTableWHERE BitColumn1 = 1 AND BitColumn2 = 0 AND BitColumn3 = 1 there is an opportunity to use the composite index. However, selectively is still only 1/2 * 1/2 * 1/2 - i.e. 1-in-8 - not highly unique.But lets assume SQL Server makes use of your index. Given that there are three BIT value columns in the index, of 2 possible values each, then the likelihood is (assuming equal dispersion in the index) that 1/8th of the values will match and 1/2 of the index will have to be read (i.e. the range covered by BitColumn1 - so SQL will read 1/2 of the index to get 1/8th of the values - I can tell you that in SQL's terms that is not a horse it would put a bet on!)10,000 index rows will be read of which 2,500 are actually the ones we need.So we havefirst record is cost 50 + 5Each next records is cost 2, plus 5 if the index entry "matches"So we have 9,999 index entries to process at 2 units each, plus 2,500 "good ones" at 5 units eachTotal = (50) + (9,999 * 2) + (2,500 * 5)That's still more than accessing the data direct = 20,004 units - the index is still not going to be used IMHOThen you do a query of:SELECT *FROM MyTableWHERE BitColumn2 = 0 AND BitColumn3 = 1 now the index is less useful - the first column in the index is BitColumn1, which is not part of the WHERE clause - so to discover whether any index entries are useful the WHOLE index will have to be read (because the start of the index value (BitColumn1) is useless). From the whole index 1/4tr of the index will match, again lousy selectivity - to get 1/4tr of the rows we have to a) read an index entry and b) decode that and look up the physical record. It just is not cost effective.Kristen |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-09-27 : 15:04:11
|
"don't think he read my post..."Maybe .. but maybe its actually "didn't understand the significance of your post" - this is "New to SQL Server" after all Kristen |
 |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-09-27 : 15:53:46
|
| Thank you for such an exhaustive explanation of how SQL Server works with indexes. I know now what I have to study before I can make any assumptions in this area. One more question (pardon my ignorance):In a query, is there a way to force SQL Server to use a particular index? |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-09-28 : 00:04:16
|
| "is there a way to force SQL Server to use a particular index"You can provide hints which will achieve this, but it is very rare to do so because SQL Server makes the right choice (in the main!)It is better to examine which indexes etc. SQL Server is actually using for a queruy, scratch your head as to why! and then change the query, or try creating other indexes etc. and seeing how that effects the queryKristen |
 |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-09-28 : 15:04:45
|
| Thank you Kristen, I'll do that. |
 |
|
|
|
|
|
|
|