Entries in Index tree - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Administration (2000)

Entries in Index tree

Author

Topic

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 10:00:06

I have a table with 11.7 million records.
The primary key is composited of four columns

1) CompanyID, nvarchar(10)
2) InvoiceID, nvarchar(20)
3) InvoiceDate, datetime
4) InvoiceLine, int

and there are 12 more columns in the table.

How many entries in the index tree are there?
I am wondering because any search is very slow compared to other machines I have encountered.

Peter Larsson
Helsingborg, Sweden

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-11-22 : 10:07:03

all 4 columns of data are stored in PK's CI.

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 10:35:17

Now I am 10 USD richer!

I had a discussion with some other consultant about this and he persisted that there were only 250 000 entries in the index tree for a 4-column composite primary key with 11.7 million rows.

Peter Larsson
Helsingborg, Sweden

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-11-22 : 10:43:53

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-11-22 : 10:44:54

i must ask though
how did he come up with that number?
what don't we know?

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 10:48:47

The funny part is that he works for another consulting firm than me (I?), and has worked at this client's site for well over 10 years.
I started here in september and everything he does, I have to do all over again one hour later.

The sad part is that no one question him and he makes about 160 USD per hour...
The tragic part is that he has no clue what he does!

Peter Larsson
Helsingborg, Sweden

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-11-22 : 10:50:25

If the index is a clustered index then the index tree does not have an entry for every row, but it does have entries for all four columns of the rows that it does have. In a clustered index the leaf level pages (the pages that contain an index entry for every row) are the data pages of the table.

Having said that - SQL Server uses the clustered index key to look up rows for non-clustered indexes, so the fact that your primary key is so wide (up to 42 bytes per row) means that you are getting a copy of that index key in every row of every non-clustered index on the table and that is very likley causing performance problems. You should consider adding a surrogate PK so that all the non-clustered indexes and joins to this table can use a small efficient integer key. It would probably also make inserts into the table faster.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 10:58:32

Thanks for the update!

However, it is not possible at this stage to change the primary key. All logic he has built over the last ten years relies on this primary key.

There are 25 companies. Dates span from January 1, 2004 to today (almost 36 months).
There are approx 90 invoices per company and day and every invoice has approx 2 to 6 lines.

Peter Larsson
Helsingborg, Sweden

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-11-22 : 11:03:57

so for how many rows does it have entries then, snSQL?

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-11-22 : 11:15:41

quote:
However, it is not possible at this stage to change the primary key.

Does the table have non-clustered indexes? If so, you'd still almost certainly benefit from adding an identity column and making it the clustered index key, and then create a unique index for the PK. So you change the clustered index but not the PK.

quote:
so for how many rows does it have entries then, snSQL?

That depends on the data, but you can get an idea from DBCC SHOWCONTIG and DBCC SHOW_STATISTICS.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 11:20:01

quote:
Originally posted by spirit1

i must ask though
how did he come up with that number?
what don't we know?

He wrote some stuff on paper and made analogies to AS/400 and DB2 the whole time. Very pedantic.

I could not get the picture to fit since the primary key also is the only index in the table. It is clustered.
The database engine is 8.0.818.

Peter Larsson
Helsingborg, Sweden

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 11:28:06

DBCC SHOWCONTIG

Table: 'tblSalesline' (843150049); index ID: 1, database ID: 7
TABLE level scan performed.
- Pages Scanned................................: 669231
- Extents Scanned..............................: 84436
- Extent Switches..............................: 608677
- Avg. Pages per Extent........................: 7.9
- Scan Density [Best Count:Actual Count].......: 13.74% [83654:608678]
- Logical Scan Fragmentation ..................: 45.60%
- Extent Scan Fragmentation ...................: 0.08%
- Avg. Bytes Free per Page.....................: 4056.7
- Avg. Page Density (full).....................: 49.88%

DBCC SHOW_STATISTICS

All density	Average Length	Columns
0,05555556	 4,19707	CompanyID
5,885604E-07	17,19108	CompanyID, InvoiceID
5,885604E-07	25,19108	CompanyID, InvoiceID, InvoiceDate
8,786414E-08	29,19108	CompanyID, InvoiceID, InvoiceDate, InvoiceLine

RANGE_HI_KEY	RANGE_ROWS	EQ_ROWS	DISTINCT_RANGE_ROWS	AVG_RANGE_ROWS
30		0		49073	0			0
31		0		18678	0			0
34		0		346	0			0
44		0		109196	0			0
45		0		250096	0			0
46ADJ		0		132647	0			0
46K		0		379996	0			0
46P		0		210865	0			0
46V		0		132647	0			0
54		0		207948	0			0
56		0		759297	0			0
61		0		1164063	0			0
64		0		734335	0			0
71		0		158839	0			0
74		0		4646342	0			0
78		0		191142	0			0
79		0		1084659	0			0
82		0		1151038	0			0

Peter Larsson
Helsingborg, Sweden

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-11-22 : 11:31:31

Imagine your table rows being entries in an address book, with tabs for the letters of the alphabet and only one page per letter. Now you'd have say 300 names and addresses in your address book, but you could get to all the Ms by using the M tab. So you have 26 tabs for 300 addresses. Similar thing for a SQL Server table. If there are 11.5 million rows and say 80 rows per page (that's about a 100 byte row, because pages are 8K) then all the index "tabs" (entries in the index non-leaf level pages) only have to point to 146250 pages.

I'm over simplifying but that gives the general idea. See Inside SQL Server for much more detail.

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-11-22 : 11:34:28

Looking at your DBCC output, I can tell you right away that rebuilding the index will greatly improve performance. Right now a table scan is having to perform 608678 reads instead of only 83654, purely because the index is so fragmented.
And you should rebuild that index fairly regularly, probably using a fillfactor of about 75%, but you can vary that to see what works best.

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-11-22 : 11:36:49

Peso:
wow!
now that's what i call a fragmented table.

snSQL:
hey thanx for that explanation. learned something new.

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 11:41:27

I started to rebuild the index. That's what started the whole discussion. I was forced to abort by the other consultant since there are imports made every 5 minutes to this table from a number of sources.

There will never be time to rebuild the index, because doing so makes the SQL Server so busy that the imports fail.

Peter Larsson
Helsingborg, Sweden

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-11-22 : 11:46:11

Well, what do they want, poor overall performance, or good overall performance and the need to create a maintenance window every couple of weeks?
You need SQL Server 2005 - full online index rebuilds.

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-11-22 : 11:48:18

run
DBCC INDEXDEFRAG

this will be beneficial

as it is now you might as well drop that index....

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-22 : 11:55:57

Anyhow, I get paid by the hour

as database developer, not administrator...

Peter Larsson
Helsingborg, Sweden

Kristen
Test

22859 Posts

Posted - 2006-11-22 : 13:09:34

Bit worrying that the DB don't have scheduled maintenance that would keep that (and other) indexes Defragged.

INDEXDEFRAG and not REINDEX - except on very small tables (up to around 1,000 rows IME)

After INDEXDEFRAG you need Update Statistics (REINDEX does that by itself, INDEXDEFRAG doesn't)

On a large table (probably more so with the multi-part key) you probably need FULLSCAN on the Update Stats.

"1) CompanyID, nvarchar(10)
2) InvoiceID, nvarchar(20)"

Are there any rows that actually use (or could use) UNICODE characters? Changing those to straight VARCHAR would increase the number of index entries per page.

Kristen

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-11-22 : 14:06:08

Seems like a very poorly designed index/table.

The use of nvarchar, instead of integer for CompanyID,
InvoiceID, and InvoiceLine seems like a very bad choice.

Another issue is the physical order of the table. With CompanyID at the highest level of the index; it forces the table to grow at many points as data is added in the middle, and results in the extreme fragmentation and low scan density. As Kristen suggested, it needs a regular INDEXDEFRAG and Update Statistics. The first one may take quite some time. After that, it may need it quite often, weekly or more often. I would increase the page density to about 90% if possible. It is going it fragment rapidly anyway.

Is InvoiceDate even needed in the PK? Shouldn't an invoice have only one date? Shouldn't InvoiceDate be a non-identifying attribute of the Invoice?

Is this some kind of table used for reporting? It appears to be at least partially denormalized. What is it actually used for?

Sounds like the other consultant is an idiot. Can’t offer any help there.

CODO ERGO SUM

Next Page

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources