Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 t-sql query optimisation

Author  Topic 

DLTaylor
Posting Yak Master

136 Posts

Posted - 2013-01-31 : 07:07:25
Hi All,

I had some great help a few weeks back from Sunitabek & Nigelrivett and i'm now hoping this forum could help me again…;-)

I have a query that I’m trying to tune but not getting the change in speed I’d expect.

--t-sql
SELECT t1.VisitID
,t2.OrderID
FROM TABLE t1
LEFT JOIN t2 ON t1.VisitID = t2.VisitID

t1 has 3000 records
t2 has 10,000,000 records

t1 has a RowID (autonumber) that is the PK
t2 has a PK on SourceID, OrderID – his is also the clustered index (clustered, unique, primary key located on PRIMARY)

I only have the VisitID in t1 and t2 and I need to join on these tables to get the OrderID from t2 (to use un subsequent queries)

I have added a nonclustered located on PRIMARY to the VisitID on t2 – and this hasn’t improved the runtime.
In test both with & without the NCI the execution plan shows the queries running at 50% cost relative to batch. (around 1 min per resultset)

A caveat, I can’t change the pk or clustered index on t2 (as its not my table) but can add indexes – but I can do what I wish to t1.

Any advice very welcomed
Thanks

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2013-01-31 : 08:33:01
A few pointers:
- You are going to get 3000-10M rows out of this no matter what.
- because of the above, no index will help you. Using an index will make it slower, not faster so unlikely to be used.
- How are you using order ID in "subsequent queries". You will need to incorporate this query into your subsequent queries so you can narrow down the results.
- Don't try to program your query - ask the right question. I suspect you are thinking procedurally and imagining that a first step is to "help" subsequent queries by getting some initial results. You are wrong!

Post the actual data and actual problem and I bet we can help out.

Secondary:
- You would not change the PK for a specific query - the PK exists in the data model no matter what you query
- CI is similar. SQL Server guys seem to worship the CI, but in actual fact it's by no means the first thing you consider. Correct data first, use cases second, CI & indexes pretty much fall out of the second.
Go to Top of Page

DLTaylor
Posting Yak Master

136 Posts

Posted - 2013-01-31 : 11:16:53
Thanks LoztInSpace.
I’ve been reading over your pointers. I see what you are saying about how I’m thinking procedurally. (despite me being wrong, it still sounds good ;-))

I think you have said that I can’t avoid the long initial query, but with some re-work I could improve. Let me try and tighten up my question.
If I have a table (t1) with 3000 rows that is to join to a table (t2) with 10 million rows. The join is on values that are not PK and theer is no relation between the tables.

Is it the case that it will take as long as my server takes to scan the 10 million rows running a full table scan? Or is there a way to speed this up.

My subsequent queries where my 3000 rows link to a tables with 30 million rows runs in 3 seconds as the ‘joins and where clauses involve the PK which is on a CI’

Sorry if my reply is rushed – I will pick up on any questions tomorrow.

Thanks again for your help.
Go to Top of Page
   

- Advertisement -