Subselect on large table - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Subselect on large table

Author

Topic

scullee
Posting Yak Master

103 Posts

Posted - 2004-02-03 : 21:50:41

I have been doing some optimising of queries and one of them has been causing me lots of problems, the performance is dodgy and i know its because of the way its written. After some analysis i have pulled apart the query and simplified it. I need some advice on the best way to fix it.

I have 2 tables
Service_log - 40,000 records
Action_log - 2,500,000 records

Its a simple parent child relationship (as i said, this is simplified) where each service_log has lots of action_log entries.

Here are the tables

CREATE TABLE SERVICE_LOG (
ID int IDENTITY (1, 1) NOT NULL ,
DEVICE_ID int NULL
);

CREATE TABLE ACTION_LOG (
ID int IDENTITY (1, 1) NOT NULL ,
SERVICE_LOG_ID int NOT NULL,
USER_ID VARCHAR(20) NULL,
START_TIME DATETIME NOT NULL
);

Now the problem starts, i need to get the last USER_ID (record with the highest date) for each SERVICE_LOG_ID for a given device.

My current query looks something like this and the performance sux. When you get to about 200 service entries, the system runs the subquery 200 times.
SELECT SERVICE_LOG.ID AS SERVICE_LOG_ID,
(SELECT TOP 1 USER_ID FROM ACTION_LOG WHERE ACTION_LOG.SERVICE_LOG_ID = SERVICE_LOG.ID ORDER BY START_TIME DESC) AS USER_ID
FROM SERVICE_LOG
WHERE SERVICE_LOG.DEVICE_ID = 4266

I have tried other indexes but i just cant get the execute cost down. The more data i get in this table the worse its going to get :(

Does anyone have any ideas on how to fix this, remember that this query is part of a larger query involving about 6 other tables but this is the part that is killing performance

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2004-02-03 : 22:54:26

How about something like the following:


select id as service_log_id, d2.user_id
from service_log sl
join
(
	select al.user_id, d.service_log_id
	from action_log al
	join
	(
		select service_log_id, max(al.start_time) last_start_time
		from action_log al
		group by service_log_id
	) d on d.service_log_id = al.service_log_id and al.start_time = d.last_start_time
) d2 on d2.service_log_id = sl.device_id
where sl.device_id = 4266

nonclustered index on service_log(device_id)
clustered index on action_log(service_log_id)

scullee
Posting Yak Master

103 Posts

Posted - 2004-02-03 : 23:08:27

I will try it, i dont like using the time to link the tables because 2 things can (and do) happen at the same time. But i can change it to do that.

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2004-02-04 : 11:31:59

The key to solving your problem is the use of indices on the tables....
As your DDL doesn't show any indices in place...and thus all databases access will be tablescans....thats why your db performance sux.

You say "I have tried other indexes"...but from the above ddl, there doesn't seem to be ANY indices...or else you were too brief in your posting.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources