Author |
Topic |
TRACEYSQL
Aged Yak Warrior
594 Posts |
Posted - 2006-09-24 : 10:00:35
|
We have an application database ERP system thats is in use by the vendorHowever each of the tables does not have any Primary Keys...........Most of the tables have indexes.In order for us to replicate this data to our reporting database we have to put on Primary keys.One of the consultants from the vendor....done this part.....As we are upgrading i have been looking at what he did.............and see that Primary key does not have clustered index when viewing in EM.When i create a table with primary key in EM it puts clustered index = yes.The script that the consultant used has put primary key clustered index = no.Questions:1. Why not have clustered set to Yes on primary keys2. Why would this be no.3.. What the benefits of having clustered index vs no clustered index with the primary key.4. The primary key has been decided becaue he read the indexes and then create the primary key...so if there was an index then he set PK. So we have primary key .....same columns set and index the same set TABLE A COLUMN 1 COLUMN 2 Primary key and INDEX COLUMN1 COLUMN2 5. Should the sp_recompile 'tablename' be run ..........what happens if this is not done.....Any advice................... |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-09-24 : 13:02:47
|
1. Why would you want to? A clustrered index has no relation to a PK - may be good or not but that depends on the columns indexed not the PK attribute.2. see 1.3. See 1. There's no connection.4. No point. The PK has an index to support it so you shouldn't inex the same columns with another index.5. Only if needed. It will age all query plans using that table (or it should).==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2006-09-24 : 13:13:03
|
quote: Originally posted by TRACEYSQL4. The primary key has been decided becaue he read the indexes and then create the primary key...so if there was an index then he set PK. So we have primary key .....same columns set and index the same set TABLE A COLUMN 1 COLUMN 2 Primary key and INDEX COLUMN1 COLUMN2
And that worked without having to delete any rows violating the new PK constraints?I'm moderately impressed, if a little suspicious.If none of the original indexes were clustered, I'd say the advantage of creating the PKs non-clustered was that he didn't have to hang around and wait for SQL Server to completely rebuild the tables from being heaps to b-trees. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-09-24 : 15:47:42
|
Do NOT duplicate an INDEX and a PRIMARY KEY. (SQL will have to insert new entries into both, wasting CPU time and disk space)I think you should have a Clustered Index on most, and maybe all, tables. However, deciding which index should be the clustered index does need some thought; oftentimes the PK is a good candidate, but not always.If you have NO Clustered Index all the indexes will have a direct pointer to the data. If you DO have a clustered index then all the OTHER indexes will include the PK instead of the data pointer.This is good and bad:The Secondary Index will also "cover" the PK columns - which may help some queries.The Secondary Indexes will be wider (unless the PK is the same width (or narrower, but that's hard in practice) as a datapointer)Secondary indexes need two lookups before you get to the record - but the second lookup is on the clustered index, and the data is right there ... so may not be too significant.Updates that "move" the record elsewhere on the disk only need to update the clustered index - the secondary indexes are pointing to it using the PK columns; without a clustered index ALL the Indexes need to be updated to reflect the new disk address - relevant if you have a high loevel of Updates v. Insert/Select/If the Clustered Index is NOT Unique then an extra tie-break number is added automatically - which creates wider keys and possible some inefficiency.So there could be significant performance implications (either way!) of adding clustered indexes, and that might be a reason why they choose to NOT add them - however duplicating an existing Index and a Primary key suggests that its more likely that they don't know Up from Down ...If you have an easy-ish way of testing performance across the whole of your application (e.g. an automated test suite) I would strong recommend trying Clustered Indexes on the most likely candidate in each table and seeing what impact that makes; it could be a significant improvement.Kristen |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-09-24 : 17:32:08
|
Is it a universal rule that all vendor supplied databases have such a poor design?I have yet to see a vendor database that appears to have been designed by someone with even a minimum level of competence. No primary keys, no foreign keys, no constraints, badly chosen data types, etc. seems to be more the rule than the exception,CODO ERGO SUM |
 |
|
TRACEYSQL
Aged Yak Warrior
594 Posts |
Posted - 2006-09-24 : 18:40:16
|
There was no pk at all....we had to put them on for sql replication.I have to read up.....i still i little confused on clustered and non clustered.Thanks all for replying ---yes this worked without having to delete any rows violating the new PK constraints?I have a test environment .....i can do some measuring....i just not sure why the differnce on clustered and non clustered ....Is it because like you say pk have nothing to do with clustered or non clustered..I know what primary key are and indexes ...just the clustered and not clustered part. |
 |
|
TRACEYSQL
Aged Yak Warrior
594 Posts |
Posted - 2006-09-24 : 20:24:56
|
i just been reading alot today....I think i understand the non clustered and clustered.Clustered (sort) ...and only one clustered index on a table.Can have up to 249 indexesClustered Index (B-Tree) where as non clustered uses pointers.So i look tomorrow and see if there are clustered indexes already thats probably why he did primary key with non clustered.Primary keys have nothing to do with Indexes...but make unique data.Clustered index if this is unique (is this the same as primary key) |
 |
|
snSQL
Master Smack Fu Yak Hacker
1837 Posts |
Posted - 2006-09-24 : 22:28:12
|
quote: Clustered index if this is unique (is this the same as primary key)
No. A primary key is the specific key that you are defining as the way to uniquely access each row on the table and is the basis for foreign key relationships from other tables.A primary key must have a unique index associated with it to maintain the uniqueness of the rows, but that index does not have to be the clustered index. Also, the index for the primary key does not necessarily have to be the only unique index on a table. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-09-25 : 01:32:52
|
"I think i understand the non clustered and clustered.Clustered (sort) ...and only one clustered index on a table."In case it helps, the clustered index organises the physical order that the data is stored on the disk.So if you want all records between 'A' and 'B', sorted in order, they are going to be in the right order on the disk, and therefore very fast to locate.Its a pity not to have one on each table - unless there really is no good candidate.Kristen |
 |
|
TRACEYSQL
Aged Yak Warrior
594 Posts |
Posted - 2006-09-25 : 10:59:03
|
Ok.....im looking at the databases we have database4 and database5database4 is our old one and database5 new converted..Both have primary keys.Now on 4 he has nonclustered for primary keyNow on 5 he has clustered for primary keyExampleEMPL PK_0041 Clustered Primary Constraint EMP_ID PI_0041 Non Clustered unique Index EMPL_IDThere are no other primary keys on any tables only what we added.So the question is why now using Clustered on Primary Key....There are no other clustered indexid of 1 except for the PK)So ....now there be B-TREES...How to build the B-TREES is there something i have to run...The PI_0041 (is what vendor given us) |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-09-25 : 11:49:44
|
Odd.Today I was doing a join between two tables on an id.The id is unique and has a clustered unique index and a non-clustered non-unique index on it (hadn't dropped the non-clustered index after de-duping yet).This table is very wide the other very narrow - both about 2 million rows.Doing a select * from the join the estimated plan had two nested loops - the first index seeking on the non-clustered index and the second getting the row using the clustered index.Using a hint gave a single nested loop using the clustered index but the cost of that was 86% compared to the 14% with two nested loops.I didn't try to run it as it would take too long but I probably will need to at some point.According to this it should be much faster by having a clustered and non-clustered index on the same column - can sort of see why it might happen but not really convinced.(statistics were up to date)==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-09-25 : 14:12:02
|
Nigel, you could post the DDL and DML.It could well be that the original indexes in the system were UNIQUE.Which could explain why there were no violations / errors.And as for the clustering, changing the clustering of a table that already has lots of data in it,is not something that one wants to do regularily.It will change the data retrieval/insert patterns in the system (hopefully for the better/but no guarantee),And will also lock up the table, fill up the log, take lots of time while you are performing the operation.I'm just saying that it's an operation that should be planned,and the clustering of a table should be chosen with care.rockmoose |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-09-25 : 14:29:53
|
quote: Originally posted by rockmoose...I'm just saying that it's an operation that should be planned,and the clustering of a table should be chosen with care...
Ideally, before going live, too. When you are doing a retrofit like this system, it gives you a lot fewer option to experiment to get it right.CODO ERGO SUM |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-09-25 : 14:42:55
|
quote: Originally posted by Michael Valentine Jones
quote: Originally posted by rockmoose...I'm just saying that it's an operation that should be planned,and the clustering of a table should be chosen with care...
Ideally, before going live, too. When you are doing a retrofit like this system, it gives you a lot fewer option to experiment to get it right.CODO ERGO SUM
Yes, we've retrofitted tables with 10's million rows.But first rather extensive testing had to be done to ensure that the new clustering and indexes would really improve the performance.And a careful data migration plan had to be followed to actually perform the re-clustering + re-indexing in production with no downtime.It worked very well, but without the testing there is no way we would have dared to go ahead and place our bets. |
 |
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2006-09-25 : 16:54:19
|
TRACEYSQL,Did you verify with the vendor that they would support your product if you modified their schema? Most 3rd party products specifically state in the contract that you are not to modify their schema without permission, else you'll forfeit support from them.Tara Kizer |
 |
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2006-09-25 : 17:14:50
|
quote: Did you verify with the vendor that they would support your product if you modified their schema? Most 3rd party products specifically state in the contract that you are not to modify their schema without permission, else you'll forfeit support from them.
Good point Tara!I am at the end of my project and it will become one of these "3rd party products".And by <INSERT DEITY>, anyone who changes the structure of the database faces a fate worse than death...THE LAWYERSDavidMProduction is just another testing cycle |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-09-26 : 04:52:31
|
quote: Originally posted by rockmoose Nigel, you could post the DDL and DML.It could well be that the original indexes in the system were UNIQUE.Which could explain why there were no violations / errors.And as for the clustering, changing the clustering of a table that already has lots of data in it,is not something that one wants to do regularily.It will change the data retrieval/insert patterns in the system (hopefully for the better/but no guarantee),And will also lock up the table, fill up the log, take lots of time while you are performing the operation.I'm just saying that it's an operation that should be planned,and the clustering of a table should be chosen with care.rockmoose
Lol - did you read my post?>> I'm just saying that it's an operation that should be planned,and the clustering of a table should be chosen with care.nah - I just did it at random.>> And as for the clustering, changing the clustering of a table that already has lots of data in it,is not something that one wants to do regularily.I did it once as stated.>> It could well be that the original indexes in the system were UNIQUE.Which could explain why there were no violations / errors.Guess you aren't responding to my post.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-09-26 : 05:54:55
|
Hi Nigel,No I was not responding to your post The thing that you encountered:that a clustered + non-clustered index on the same columnscould improve performance in some situations caught my attention.That's why I asked for the DDL, just out of curiosity. |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-09-26 : 06:58:34
|
It's just that the table being accessed was very wide and the index (and access) was via an integer id column.That had a clustered unique index and non-clustered non-unique index.Can't post the ddl but the query was just a select * with a join to the other narrow table.The other table had a similar number of rows (a bit less) but with a few duplicates.I'm not convinced using the non-clustered index would be faster - I suspect it's an estimated plan problem.Not sure I'll get round to comparing the execution.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
TRACEYSQL
Aged Yak Warrior
594 Posts |
Posted - 2006-09-26 : 08:59:53
|
Yes the vendor does not support any changes to the data schema...so we don't let them know...they know we have replication and they just say we do not support it but we need pk for dataI just asked our consultant he said he had non clustered before now we have clustered as he did not understand it....thats good.I am going to do some queries today in our test environment on our database4 (non clustered) and database5 (clustered) and measure it just for fun lol.Thanks |
 |
|
|