Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SQL Server Administration (2005)
 Understanding Index Better

Author  Topic 

dewacorp.alliances

452 Posts

Posted - 2009-03-23 : 15:37:41
Hi there

I have a SourceBatchItems table that is used for pumping data from csv text file. The data is load per SourceBatchID with potentially 1.5 million records and this table will used for transformation at later stage.

The SourceBatchItems table schemas as follow:
RowID BIGINT NOT NULL IDENTITY
Column1 varchar(4) NULL
Column2 varchar(8) NULL
Column3 varchar(64) NULL
Column4 varchar(16) NULL
Column5 varchar(16) NULL
Column6 varchar(16) NULL
Column7 varchar(32) NULL
Column8 varchar(64) NULL
Column9 varchar(16) NULL
Column10 varchar(16) NULL
Column11 varchar(128) NULL
Column12 varchar(16) NULL
Column13 varchar(16) NULL
Column14 varchar(16) NULL
Column15 varchar(32) NULL
Column16 varchar(16) NULL
Column17 varchar(16) NULL
Column18 varchar(64) NULL
SourceBatchID INT
CreatedUTCTimeStamp DATETIME NULL

The query in stored procedure that I run to manipulate for the transformation is:

SELECT CAST(RTRIM(LTRIM(Column4) AS VARCHAR(32) AS AccountNo,
CAST(RTRIM(LTRIM(COlumn6)) AS VARCHAR(32)) AS ProductDescription2
FROM SourceBatchItems (NOLOCK)
WHERE RTRIM(LTRIM(COlumn7)) = '1' AND RTRIM(LTRIM(Column8)) = 'A' AND RTRIM(LTRIM(COlumn9)) = '' AND RTRIM(LTRIM(Column9)) = 'D'
AND (LTRIM(RTRIM(Column11)) <> 'SUMMARY OF CHARGES'
AND (LTRIM(RTRIM(Column11)) <> 'EXC GST'
AND SourceBatchID = @SourceBatchID


Basically, I get from the SOurceBatchItems per SourceBatchID and Column7, Column8, Column9, Column10 considered as Code while the Column11 is kinda description. All queries for manipulation/transformation is similar but more to STRING manipulation on Column11.

Now for the indexing, originally this table has PRIMARY KEY so I drop this cause I don't see the usefull of this. Instead of I am thinking do this:

CREATE CLUSTERED INDEX IX_SourceBatchItems_1
ON SourceBatchItems(SourceBatchID, Column7, Column8, Column9, Column10)
WITH (FILL FACTOR = 100, PAD_INDEX = ON)

What do you think? Do I need to put Column11 as well? BTW this index is below 900 bytes so I can include this COlumn11 potentially.

How do I test this though to see this?

I am apprecited your comment.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2009-03-23 : 15:53:20
I'd compare the execution plans and query times for each potential index by viewing the execution plans in Management Studio. View the actual execution plan which can be selected in the Query menu. Alternatively you can view the estimated execution plan if the query takes a long time to run. Pick the index that gives the best performance and doesn't hinder writes too much.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

"Let's begin with the premise that everything you've done up until this point is wrong."
Go to Top of Page

dewacorp.alliances

452 Posts

Posted - 2009-03-23 : 16:05:02
quote:
Originally posted by tkizer

I'd compare the execution plans and query times for each potential index by viewing the execution plans in Management Studio. View the actual execution plan which can be selected in the Query menu. Alternatively you can view the estimated execution plan if the query takes a long time to run. Pick the index that gives the best performance and doesn't hinder writes too much.



Hi Tara

Does this mean I have test the index one first and drop/change it and see which one is faster .. I believe. The thing is I am not sure what do I need to have a look in the Execution Plans though.

Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2009-03-23 : 16:10:56
Well you could add both indexes and then use an index hint in your query. If you use index hints, you could then run them in the same batch which would tell you exactly which one is better just by looking at the cost.

You'll need to do some reading on execution plans as the topic is too complex to discuss in a forum post.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

"Let's begin with the premise that everything you've done up until this point is wrong."
Go to Top of Page

dewacorp.alliances

452 Posts

Posted - 2009-03-23 : 21:49:12
quote:
Well you could add both indexes and then use an index hint in your query. If you use index hints, you could then run them in the same batch which would tell you exactly which one is better just by looking at the cost.

You'll need to do some reading on execution plans as the topic is too complex to discuss in a forum post.



Hi Tara

The index that i have is Clustered Index and I can't have more than 1 though. I guess drop and recreate and putting the index hints that is why to go.


Go to Top of Page
   

- Advertisement -