Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Subselect on large table

Author  Topic 

scullee
Posting Yak Master

103 Posts

Posted - 2004-02-03 : 21:50:41
I have been doing some optimising of queries and one of them has been causing me lots of problems, the performance is dodgy and i know its because of the way its written. After some analysis i have pulled apart the query and simplified it. I need some advice on the best way to fix it.

I have 2 tables
Service_log - 40,000 records
Action_log - 2,500,000 records

Its a simple parent child relationship (as i said, this is simplified) where each service_log has lots of action_log entries.

Here are the tables

CREATE TABLE SERVICE_LOG (
ID int IDENTITY (1, 1) NOT NULL ,
DEVICE_ID int NULL
);

CREATE TABLE ACTION_LOG (
ID int IDENTITY (1, 1) NOT NULL ,
SERVICE_LOG_ID int NOT NULL,
USER_ID VARCHAR(20) NULL,
START_TIME DATETIME NOT NULL
);

Now the problem starts, i need to get the last USER_ID (record with the highest date) for each SERVICE_LOG_ID for a given device.

My current query looks something like this and the performance sux. When you get to about 200 service entries, the system runs the subquery 200 times.
SELECT SERVICE_LOG.ID AS SERVICE_LOG_ID,
(SELECT TOP 1 USER_ID FROM ACTION_LOG WHERE ACTION_LOG.SERVICE_LOG_ID = SERVICE_LOG.ID ORDER BY START_TIME DESC) AS USER_ID
FROM SERVICE_LOG
WHERE SERVICE_LOG.DEVICE_ID = 4266

I have tried other indexes but i just cant get the execute cost down. The more data i get in this table the worse its going to get :(

Does anyone have any ideas on how to fix this, remember that this query is part of a larger query involving about 6 other tables but this is the part that is killing performance






ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2004-02-03 : 22:54:26
How about something like the following:

select id as service_log_id, d2.user_id
from service_log sl
join
(
select al.user_id, d.service_log_id
from action_log al
join
(
select service_log_id, max(al.start_time) last_start_time
from action_log al
group by service_log_id
) d on d.service_log_id = al.service_log_id and al.start_time = d.last_start_time
) d2 on d2.service_log_id = sl.device_id
where sl.device_id = 4266
nonclustered index on service_log(device_id)
clustered index on action_log(service_log_id)
Go to Top of Page

scullee
Posting Yak Master

103 Posts

Posted - 2004-02-03 : 23:08:27
I will try it, i dont like using the time to link the tables because 2 things can (and do) happen at the same time. But i can change it to do that.
Go to Top of Page

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2004-02-04 : 11:31:59
The key to solving your problem is the use of indices on the tables....
As your DDL doesn't show any indices in place...and thus all databases access will be tablescans....thats why your db performance sux.

You say "I have tried other indexes"...but from the above ddl, there doesn't seem to be ANY indices...or else you were too brief in your posting.
Go to Top of Page
   

- Advertisement -