| Author |
Topic |
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-09 : 11:41:42
|
| Just wondering how SQL 'sorts' duplicates in a non-unique index. Are the keys entered in the order the records were inserted? |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2004-07-09 : 12:40:45
|
| SQL doesn't inherently sort records.The value of the Primary Key IDENTITY has a starting value and incriment you can specify. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-07-09 : 12:45:52
|
SQL doesn't sort by using an index, it just uses it to make less work when finding things. (Bit of a sweeping statement, I expect it does use the index to retrieve things in a given order IF the index matches the ORDER BY prescribed for a given query, thus avoding the need for a sort. But if you want the answers in a particular order you'll need to add an ORDER BY).We try to remember to add an ORDER BY using the primary key fields - i.e. if we are sorting customers by, say, Last Order Date we add the CustomerID to the end as a tie break; this is just to ensure that the results will always come back in the same order just in case some external process [rightly or wrongly] becomes dependant on that ordering - but I need the message count!Kristen |
 |
|
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-09 : 13:48:22
|
| Sorry, I probably have some misconceptions about indexes.What I'm trying to say is, aren't the Index values (Bookmarks- PK's or RID's) within a Nonclustered index sequenced in such a way that if you run down the list of Bookmarks they will point to corresponding record on the table in alpha/numeric order of the indexed column. If you have duplicates or nulls, are the keys just grouped together in no particular sequence?Say you have a table with 4 str values- PK Fname-- --1 Ernie2 Bert3 Oscar4 KermitWouldn't a nonclustered index on Fname look like this?2143If the column allowed nulls and the Index was not unique the table might look like thisPK Fname-- --1 Ernie2 NULL3 NULL4 Bert5 NULL6 Oscar7 NULL8 Kermit9 NULLWhat would the Index key look like now?-4186 2 ---------- are NULL's grouped at the end? beginning? or...3579 ----------4182--------- ...is NULL treated like a literal?3579---------6(Sorry, visual learner.) |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-07-09 : 14:06:22
|
| > If you have duplicates or nulls, are the keys just grouped together in no particular sequence?Yes, but I need to understand why this is important to you, 'coz I feel like I'm not answering the question!From where I stand! the way I see it is:I can ask SQL to find all customers in the UK. If there is an index on CountryCode then SQL will be fast, if not it will scan through all the individual records.That's it! I can't interogate the index for any additional benefit (well if my query is "covered" by the Index SQL won't bother going to the records at all, but I don't think that's your question either ...)Kristen |
 |
|
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-09 : 14:19:16
|
| I'm a student just trying to understand index structure.I want to know how indexes get built/fragmented- rebuilt and defragged and what are the best ways to maintain them.I'm going blind reading BOL. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-07-09 : 14:30:56
|
| Ah, NOW I've got it. I imagine it a binary-tree type deal. Lets say you can get 50 index entries on a page. So you start adding them and you get to 51, so you split the 51 into two lots and allocate a new page. You link the old & new pages so you can "walk" through them in order.You may split the original page into 26/25 or 50/1 (the first is good for random data, the second for sequential data).If you've got a duplicate key to store it makes sense to store it where it's most efficient to do so. If you've got 51 "AAAAAAAA" keys you'd be better off storing it in the second page, that is mostly empty, rather than the first page. So that would be out of order, from a human persepective. But it would all depend on what the developers decided was the best strategy at that point.Then something comes along and deletes a whole bunch or rows and you've got lots of pages with hardly anything in them. They are ready to fill up again, but would it be better to "compact" the data? SQL is having to read loads of pages and only finding one or two [say] entries on each page, that's wasting loads of time.So attack with the DBCC command to defragment the index. Or tick the "Auto shrink" option in Enterprise Manager.Dunno if that helps, I've probably made it too simplistic and you are needed a more technical brief?Kristen |
 |
|
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-09 : 14:49:28
|
| Thanks Kristen, it's starting to sink in.One last question if If I may. A Fill factor that is a percent of a full page is only applied when the index is created to reserve room for growth- correct? So over time if an index page does becomes full it will split 50/50 because the fill factor has defaulted back to 0?By the way, I just got an email from the author of this articlehttp://www.sql-server-performance.com/rd_index_fragmentation.aspGood article on Idex structure but he will be changing the '7' in the 'New Index structure' example to a '6' :-)The microsoft SQL 2000 D&I Training Kit is so full of errors it's not even funny. How are we supposed to learn... |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-07-10 : 02:05:22
|
| Fill factor is reapplied when the index is rebuilt (DBCC DBREINDEX).I have no idea whether a full page is split 50/50 or something else (but the fill factor is NOT used for that decision).For example, seems to me it would be pretty daft to do a 50/50 split on an index for an IDENTITY value which is an incrementing value.Kristen |
 |
|
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-13 : 12:06:51
|
For what it's worth, I found a few enteries in BOL that helped clarify things for me. quote: Fill Factor: The fill factor is implemented only when the index is created; it is not maintained after the index is created as data is added, deleted, or updated in the table. Trying to maintain the extra space on the data pages would defeat the purpose of originally using the fill factor because SQL Server would have to perform page splits to maintain the percentage of free space, specified by the fill factor, on each page as data is entered. Therefore, if the data in the table is significantly modified and new data added, the empty space in the data pages can fill. In this situation, the index can be re-created and the fill factor specified again to redistribute the data.
quote: Non-clustered Indexes/architecture: If the table does have a clustered index, or the index is on an indexed view, the row locator is the clustered index key for the row. If the clustered index is not a unique index, SQL Server 2000 makes duplicate keys unique by adding an internally generated value. This value is not visible to users; it is used to make the key unique for use in nonclustered indexes.
I also didn't know that BOL was updated this year.http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp |
 |
|
|
Kristen
Test
22859 Posts |
|
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-13 : 14:37:25
|
| Yah but mine is the one on the 120 day Evaluation edition that comes with the MS SQL Server 2000 D&I training kit that is still being used (at least at the school I went to.) |
 |
|
|
RobVG
Starting Member
42 Posts |
Posted - 2004-07-13 : 19:15:52
|
| Sorry Kristen, just noticed your post was a question.The BOL download from that link just says (Updated) in the title bar. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2004-07-14 : 02:43:58
|
| Drat, might be newer then :-(MyC:\Program Files\Microsoft SQL Server\80\Tools\Books\sql80EN.col & .chware dated 05-Jun-2004. There is no version metadata in the file (why did MS introduce that ability and not use it?)Anyone know if that is the latest before I download and rollout a new verison needlessly ...And where are the consistency police when I need them? "(Updated)" and "(Updated 2004)" and even "(Updated SP3)" indeed :-(Kristen |
 |
|
|
|