Author |
Topic |
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 10:00:06
|
I have a table with 11.7 million records.The primary key is composited of four columns1) CompanyID, nvarchar(10)2) InvoiceID, nvarchar(20)3) InvoiceDate, datetime4) InvoiceLine, intand there are 12 more columns in the table.How many entries in the index tree are there?I am wondering because any search is very slow compared to other machines I have encountered.Peter LarssonHelsingborg, Sweden |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-22 : 10:07:03
|
all 4 columns of data are stored in PK's CI.Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 10:35:17
|
Now I am 10 USD richer!I had a discussion with some other consultant about this and he persisted that there were only 250 000 entries in the index tree for a 4-column composite primary key with 11.7 million rows.Peter LarssonHelsingborg, Sweden |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-22 : 10:43:53
|
  Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-22 : 10:44:54
|
i must ask thoughhow did he come up with that number?what don't we know?Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 10:48:47
|
The funny part is that he works for another consulting firm than me (I?), and has worked at this client's site for well over 10 years.I started here in september and everything he does, I have to do all over again one hour later. The sad part is that no one question him and he makes about 160 USD per hour...The tragic part is that he has no clue what he does!Peter LarssonHelsingborg, Sweden |
 |
|
snSQL
Master Smack Fu Yak Hacker
1837 Posts |
Posted - 2006-11-22 : 10:50:25
|
If the index is a clustered index then the index tree does not have an entry for every row, but it does have entries for all four columns of the rows that it does have. In a clustered index the leaf level pages (the pages that contain an index entry for every row) are the data pages of the table.Having said that - SQL Server uses the clustered index key to look up rows for non-clustered indexes, so the fact that your primary key is so wide (up to 42 bytes per row) means that you are getting a copy of that index key in every row of every non-clustered index on the table and that is very likley causing performance problems. You should consider adding a surrogate PK so that all the non-clustered indexes and joins to this table can use a small efficient integer key. It would probably also make inserts into the table faster. |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 10:58:32
|
Thanks for the update!However, it is not possible at this stage to change the primary key. All logic he has built over the last ten years relies on this primary key.There are 25 companies. Dates span from January 1, 2004 to today (almost 36 months).There are approx 90 invoices per company and day and every invoice has approx 2 to 6 lines.Peter LarssonHelsingborg, Sweden |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-22 : 11:03:57
|
so for how many rows does it have entries then, snSQL?Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
snSQL
Master Smack Fu Yak Hacker
1837 Posts |
Posted - 2006-11-22 : 11:15:41
|
quote: However, it is not possible at this stage to change the primary key.
Does the table have non-clustered indexes? If so, you'd still almost certainly benefit from adding an identity column and making it the clustered index key, and then create a unique index for the PK. So you change the clustered index but not the PK.quote: so for how many rows does it have entries then, snSQL?
That depends on the data, but you can get an idea from DBCC SHOWCONTIG and DBCC SHOW_STATISTICS. |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 11:20:01
|
quote: Originally posted by spirit1 i must ask thoughhow did he come up with that number?what don't we know?
He wrote some stuff on paper and made analogies to AS/400 and DB2 the whole time. Very pedantic.I could not get the picture to fit since the primary key also is the only index in the table. It is clustered.The database engine is 8.0.818.Peter LarssonHelsingborg, Sweden |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 11:28:06
|
DBCC SHOWCONTIG Table: 'tblSalesline' (843150049); index ID: 1, database ID: 7TABLE level scan performed.- Pages Scanned................................: 669231- Extents Scanned..............................: 84436- Extent Switches..............................: 608677- Avg. Pages per Extent........................: 7.9- Scan Density [Best Count:Actual Count].......: 13.74% [83654:608678]- Logical Scan Fragmentation ..................: 45.60%- Extent Scan Fragmentation ...................: 0.08%- Avg. Bytes Free per Page.....................: 4056.7- Avg. Page Density (full).....................: 49.88% DBCC SHOW_STATISTICSAll density Average Length Columns0,05555556 4,19707 CompanyID5,885604E-07 17,19108 CompanyID, InvoiceID5,885604E-07 25,19108 CompanyID, InvoiceID, InvoiceDate8,786414E-08 29,19108 CompanyID, InvoiceID, InvoiceDate, InvoiceLine RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS30 0 49073 0 031 0 18678 0 034 0 346 0 044 0 109196 0 045 0 250096 0 046ADJ 0 132647 0 046K 0 379996 0 046P 0 210865 0 046V 0 132647 0 054 0 207948 0 056 0 759297 0 061 0 1164063 0 064 0 734335 0 071 0 158839 0 074 0 4646342 0 078 0 191142 0 079 0 1084659 0 082 0 1151038 0 0 Peter LarssonHelsingborg, Sweden |
 |
|
snSQL
Master Smack Fu Yak Hacker
1837 Posts |
Posted - 2006-11-22 : 11:31:31
|
Imagine your table rows being entries in an address book, with tabs for the letters of the alphabet and only one page per letter. Now you'd have say 300 names and addresses in your address book, but you could get to all the Ms by using the M tab. So you have 26 tabs for 300 addresses. Similar thing for a SQL Server table. If there are 11.5 million rows and say 80 rows per page (that's about a 100 byte row, because pages are 8K) then all the index "tabs" (entries in the index non-leaf level pages) only have to point to 146250 pages.I'm over simplifying but that gives the general idea. See Inside SQL Server for much more detail. |
 |
|
snSQL
Master Smack Fu Yak Hacker
1837 Posts |
Posted - 2006-11-22 : 11:34:28
|
Looking at your DBCC output, I can tell you right away that rebuilding the index will greatly improve performance. Right now a table scan is having to perform 608678 reads instead of only 83654, purely because the index is so fragmented.And you should rebuild that index fairly regularly, probably using a fillfactor of about 75%, but you can vary that to see what works best. |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-22 : 11:36:49
|
Peso:wow!now that's what i call a fragmented table.snSQL:hey thanx for that explanation. learned something new.Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 11:41:27
|
I started to rebuild the index. That's what started the whole discussion. I was forced to abort by the other consultant since there are imports made every 5 minutes to this table from a number of sources.There will never be time to rebuild the index, because doing so makes the SQL Server so busy that the imports fail.Peter LarssonHelsingborg, Sweden |
 |
|
snSQL
Master Smack Fu Yak Hacker
1837 Posts |
Posted - 2006-11-22 : 11:46:11
|
Well, what do they want, poor overall performance, or good overall performance and the need to create a maintenance window every couple of weeks?You need SQL Server 2005 - full online index rebuilds. |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-22 : 11:48:18
|
run DBCC INDEXDEFRAG this will be beneficial as it is now you might as well drop that index....Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-22 : 11:55:57
|
Anyhow, I get paid by the hour as database developer, not administrator...Peter LarssonHelsingborg, Sweden |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-22 : 13:09:34
|
Bit worrying that the DB don't have scheduled maintenance that would keep that (and other) indexes Defragged.INDEXDEFRAG and not REINDEX - except on very small tables (up to around 1,000 rows IME)After INDEXDEFRAG you need Update Statistics (REINDEX does that by itself, INDEXDEFRAG doesn't)On a large table (probably more so with the multi-part key) you probably need FULLSCAN on the Update Stats."1) CompanyID, nvarchar(10)2) InvoiceID, nvarchar(20)"Are there any rows that actually use (or could use) UNICODE characters? Changing those to straight VARCHAR would increase the number of index entries per page.Kristen |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-11-22 : 14:06:08
|
Seems like a very poorly designed index/table.The use of nvarchar, instead of integer for CompanyID,InvoiceID, and InvoiceLine seems like a very bad choice.Another issue is the physical order of the table. With CompanyID at the highest level of the index; it forces the table to grow at many points as data is added in the middle, and results in the extreme fragmentation and low scan density. As Kristen suggested, it needs a regular INDEXDEFRAG and Update Statistics. The first one may take quite some time. After that, it may need it quite often, weekly or more often. I would increase the page density to about 90% if possible. It is going it fragment rapidly anyway.Is InvoiceDate even needed in the PK? Shouldn't an invoice have only one date? Shouldn't InvoiceDate be a non-identifying attribute of the Invoice?Is this some kind of table used for reporting? It appears to be at least partially denormalized. What is it actually used for?Sounds like the other consultant is an idiot. Can’t offer any help there.CODO ERGO SUM |
 |
|
Next Page
|