| Author | Topic |
CrazyT
Yak Posting Veteran
73 Posts |
Posted - 2010-04-14 : 15:34:11
|
| I am creating a simple table with a lot of records. Basic design: [rowID] [int] IDENTITY(1,1) NOT NULL, [anotherUniqueID] [int] NULL; the other columns are all unique. Is this the best design, or should I do something different? I'm not used to dealing with tables with this many records and want to make sure it can be queried properly, etc. |
|
|
DBA in the making
Aged Yak Warrior
638 Posts |
Posted - 2010-04-14 : 15:48:55
|
| It's indexes that you need to worry about. That will depend on the queries you're running against the data.
------------------------------------------------------------------------------------Any and all code contained within this post comes with a 100% money back guarantee. |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
CrazyT
Yak Posting Veteran
73 Posts |
Posted - 2010-04-14 : 16:26:14
|
| If I were writing a CREATE TABLE statement, how would I include the indexes in it? |
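As a sketch (the table and column names here are hypothetical, loosely based on the design in the first post), the primary key and unique constraints can be declared inline in CREATE TABLE, while ordinary nonclustered indexes are usually added with separate CREATE INDEX statements afterwards:

```sql
-- Hypothetical table for illustration; adapt names and types to your schema.
CREATE TABLE dbo.BigTable
(
    rowID           int IDENTITY(1,1) NOT NULL,
    anotherUniqueID int NULL,
    someValue       varchar(50) NULL,
    -- Constraint-backed indexes declared inline:
    CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED (rowID),
    CONSTRAINT UQ_BigTable_anotherUniqueID UNIQUE (anotherUniqueID)
);

-- Ordinary nonclustered indexes are created separately:
CREATE NONCLUSTERED INDEX IX_BigTable_someValue
    ON dbo.BigTable (someValue);
```

Note that a UNIQUE constraint on a nullable column in SQL Server permits only a single NULL row, which matters if [anotherUniqueID] can be NULL more than once.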
 |
|
|
hanbingl
Aged Yak Warrior
652 Posts |
Posted - 2010-04-14 : 16:47:41
|
At a minimum, define the primary key and foreign keys; then index commonly searched columns like names and dates. Who knows what developers can come up with?
quote: Originally posted by CrazyT if i were making a create table statement how would i include the indexes in it.
|
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-04-15 : 02:27:10
|
| " the other columns are all unique"Every column is unique? Kinda unusual ... but maybe it is just a "conversion" from IDs for various parties which refer to the same objects.You need a Unique Constraint (which in SQL will create an index) on all columns.If some columns also allow NULLs then use a Filtered Index (i.e. to only index the NOT NULL values)That will give you best performance on any column that is involved in a JOIN (to another table) |
 |
|
|
DBA in the making
Aged Yak Warrior
638 Posts |
Posted - 2010-04-15 : 04:47:07
|
quote: Originally posted by Kristen You need a Unique Constraint (which in SQL will create an index) on all columns.
I wouldn't create an index on every column.
------------------------------------------------------------------------------------Any and all code contained within this post comes with a 100% money back guarantee. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-04-15 : 04:57:12
|
If every column is UNIQUE, and every column is used as a JOIN to another table, then I would. Seems an improbable scenario though. I would have second thoughts if there are UPDATEs: a UNIQUE value which is an FKey to another table is unlikely to change (although some may change from NULL to a value, and maybe the other direction too), but if they change like the wind then indexes may not be a good idea. As ever ... "it depends" ... and if you ask two SQL gurus you'll get at least 3 answers!
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-04-15 : 04:58:08
|
"I wouldn't create an index on every column"Just in case open to interpretation then I meant a separate index on each column, not one index with keys for every column! |
 |
|
|
DBA in the making
Aged Yak Warrior
638 Posts |
Posted - 2010-04-15 : 05:20:02
|
quote: Originally posted by Kristen Just in case open to interpretation then I meant a separate index on each column, not one index with keys for every column!
Just to be 100% clear, I wouldn't do either of those things. That's going to add quite a bit of overhead for each insert, update, and delete.
------------------------------------------------------------------------------------Any and all code contained within this post comes with a 100% money back guarantee. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-04-15 : 05:37:01
|
| As I assume it (by way of an example): the table has, say, 10 columns, all UNIQUE, and JOINs to other tables which use each of those 10 (unique) column values. I'm assuming that:
1a) SELECTs are far more common than INSERTs
1b) All columns participate in a WHERE or a JOIN reasonably often
2) DELETEs are very rare
3a) UPDATEs may occur which change NULL columns to a value (so will cause an add to that index, but no changes to other indexes)
3b) UPDATEs to other columns (to change an existing value) will be very rare
I don't see the problem with an index on each column in such circumstances. I agree that if there are significant numbers of updates to existing (indexed) values, or significant numbers of deletes, or lots of column values that are never/rarely selected and never/rarely participate in JOIN or WHERE clauses, then an index would be a waste of time. But would you still not index each column given my assumptions? (A Unique Constraint is going to be required on each column, but that could be enforced by Trigger to avoid using an index, e.g. if a column is never used in a JOIN or a WHERE.) |
 |
|
|
DBA in the making
Aged Yak Warrior
638 Posts |
Posted - 2010-04-15 : 06:26:54
|
quote: Originally posted by Kristen As I assume it (by way of an example)The table has, say, 10 columns, all UNIQUE, and JOINs to other tables which use each of those 10 (unique) column valuesI'm assuming that 1a) SELECTs are far more common than INSERTS1b) All columns participate in a WHERE or a JOIN reasonably often2) DELETEs are very rare3a) UPDATES may occur which change NULL columns to a VALUE (so will cause an ADD to that index, but no changes to other indexes)3b) UPDATES to other columns (to change an existing value) will be very rare
That's a lot of assumptions. There's nothing from the OP that indicates any of them are true. quote: But would you still not index each column given my assumptions?
It's the assumptions I have issue with. They very well may be correct. But until we know that, the suggestion of 8 additional indexes is a little premature. quote: (Unique Constraint is going to be required on each column, but that could be enforced by Trigger to avoid using an index - e.g. if a column is never used in a JOIN or a WHERE)
A trigger would be worse than an index. If there's no index on the column, then the trigger would have to do a table scan for every record in the table. The other thing to consider is that "unique" isn't always "unique". If I'd put a unique constraint on every column where the client has told me "Yeah, we'll never have 2 of those that are the same", I'd have a lot of broken code.
------------------------------------------------------------------------------------Any and all code contained within this post comes with a 100% money back guarantee. |
 |
|
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2010-04-15 : 06:30:54
|
Testing is king. Do you have a UAT environment or similar where you can test out index changes? If so then just create indexes as Kristen suggests and then you can monitor them. Here's a bit of SQL that you can use to get the reads and writes for each index. If an index is written to far more often than read from then it's probably not a great index (except of course if those few reads are vitally important).

/*
** Index Usage Statistics
** Reports the reads and writes to the indices of the current database.
** Reads indicate that the index has been used in queries;
** writes indicate the effort of maintaining the index. Therefore the indices
** with the highest read-to-write ratio are the indices used the most by the
** system.
** Charlie (2009-Dec-18)
*/
-- Set this to limit the results to tables / indices above a certain no of rows
DECLARE @rowThreshold INT
SET @rowThreshold = 1000

SET NOCOUNT ON

SELECT  o.[name] AS [Table Name]
      , i.[name] AS [Index Name]
      , i.[index_id] AS [Index Id]
      , s.[user_seeks] + s.[user_scans] + s.[user_lookups] AS [Total Reads]
      , s.[user_updates] AS [Writes]
      , rws.[rows] AS [Rows]
      , CASE WHEN s.[user_updates] < 1 THEN 100
             ELSE 1.00 * (s.[user_seeks] + s.[user_scans] + s.[user_lookups]) / s.[user_updates]
        END AS [Reads Per Write]
      , 'DROP INDEX ' + QUOTENAME(i.[name]) + ' ON ' + QUOTENAME(sc.[name]) + '.'
        + QUOTENAME(OBJECT_NAME(s.[object_id])) AS [Drop Statement]
FROM    sys.dm_db_index_usage_stats s
        JOIN sys.indexes i ON i.[index_id] = s.[index_id] AND s.[object_id] = i.[object_id]
        JOIN sys.objects o ON s.[object_id] = o.[object_id]
        JOIN sys.schemas sc ON o.[schema_id] = sc.[schema_id]
        -- Rows for the object / index
        JOIN (
            SELECT  p.[index_id]
                  , p.[object_id]
                  , SUM(p.[rows]) AS [rows]
            FROM    sys.partitions p
            GROUP BY p.[index_id], p.[object_id]
        ) rws ON rws.[index_id] = s.[index_id] AND rws.[object_id] = s.[object_id]
WHERE   OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
        AND s.[database_id] = DB_ID()
        AND i.[type_desc] = 'nonclustered'
        AND i.[is_primary_key] = 0
        AND i.[is_unique_constraint] = 0
        AND rws.[rows] > @rowThreshold
ORDER BY [Reads Per Write] ASC, [Rows] DESC

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-04-15 : 07:35:15
|
| "That's a lot of assumptions"Indeed. But if it is a table full of unique IDs (**IF**!!) then I think they are a reasonable bet.No substitute for the O/P telling us the actual SP though. (Probably a poor choice of acronym in this forum! SP=Starting Price in Racing parlance ...) |
 |
|
|
CrazyT
Yak Posting Veteran
73 Posts |
Posted - 2010-04-15 : 14:58:15
|
| All the other columns are not unique; typo on my part. As for what is being done: I will retrieve 100 rows at a time (primary key, unique key for the data, and another column), passing a start/end row with ROW_NUMBER(). Then all I need to do is update two columns in the database on each row. So it's a simple select, but with a lot of records. Only 3 columns will have any data to index, with the primary key being one. |
 |
|
|
DBA in the making
Aged Yak Warrior
638 Posts |
Posted - 2010-04-15 : 16:57:53
|
quote: Originally posted by DBA in the making The other thing to consider is that "unique" isn't always "unique".
quote: Originally posted by CrazyT all the other columns are not unique. typo on my part.
LOL. ;) quote: as for what is being done. I will retrieve a 100 rows at a time (primary key, unique key for the data, and a another column -- passing a start/end row with row_number(). Then all I need to do is update two columns in the database on each rows.So its a simple select but with alot of records. Only 3 columns will have any data to index, with the primary key being one.
I don't think using ROW_NUMBER() is going to be an efficient way of querying such a table. I just tried it on a table with 1.2m rows, and when selecting high row numbers the number of reads was huge. The execution plan showed a clustered index scan, which means it's reading each record sequentially from the clustered index.
------------------------------------------------------------------------------------Any and all code contained within this post comes with a 100% money back guarantee. |
 |
|
|
hanbingl
Aged Yak Warrior
652 Posts |
Posted - 2010-04-15 : 17:03:29
|
Agreed, leave the pagination to the programmers! quote: Originally posted by DBA in the making
quote: Originally posted by DBA in the makingThe other thing to consider is that "unique" isn't always "unique".
quote: Originally posted by CrazyT all the other columns are not unique. typo on my part.
LOL. ;)quote: as for what is being done. I will retrieve a 100 rows at a time (primary key, unique key for the data, and a another column -- passing a start/end row with row_number(). Then all I need to do is update two columns in the database on each rows.So its a simple select but with alot of records. Only 3 columns will have any data to index, with the primary key being one.
I don't think using row_number is going to be an efficient way of querying such a table. I just tried it on a table with 1.2m rows, and when selecting high row numbers, the number of reads was huge. The execution plan said it used a clustered index scan, which means it's reading each record sequentially from the clustered index.------------------------------------------------------------------------------------Any and all code contained within this post comes with a 100% money back guarantee. |
 |
|
|
CrazyT
Yak Posting Veteran
73 Posts |
Posted - 2010-04-16 : 11:29:38
|
| DBA in the making -- I need to retrieve 100 records at a time to pass off to threads to work on. ROW_NUMBER() was the only way I could think of. How would you propose doing that? |
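One common alternative, sketched here under the assumption that [rowID] is the clustered primary key (column names follow the first post; the variable and batch logic are hypothetical), is "keyset" or "seek" paging: each thread remembers the last key it was handed and the next batch seeks past it, avoiding the clustered index scan that high ROW_NUMBER() values can cause:

```sql
-- @lastRowID is the highest rowID handed out so far (0 for the first batch).
DECLARE @lastRowID int
SET @lastRowID = 0

SELECT TOP (100)
       rowID, anotherUniqueID   -- plus whatever columns the threads need
FROM   dbo.BigTable
WHERE  rowID > @lastRowID       -- seek on the clustered key ...
ORDER BY rowID;                 -- ... then read just the next 100 rows
```

The caller records the last rowID returned and passes it back as @lastRowID for the next batch, so every batch costs roughly the same regardless of how deep into the table it is.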
 |
|
|
|