SELECT COUNT(DISTINCT) ridiculously slow

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

SELECT COUNT(DISTINCT) ridiculously slow

Author

Topic

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 13:39:37

I need to count the number of distinct rows returned by a query. The execution plan for this SQL statement includes a "table spool" (temporary table) with the following attributes:

Estimated number of rows: 5262
Actual number of rows: 347,744,315
Estimated number of executions: 282
Actual number of executions: 49,988

This query takes 100 seconds to count 111 rows. Clearly, the optimizer has chosen a poor plan. If I fetch the actual rows (instead of the count), the query returns in less than a second (and uses an entirely different execution plan, with no temp table).

I have analyzed the query in the tuning advisor and created the recommended statistics and indexes. This has no effect on the query plan or performance.

The query itself is:


select count(distinct record0_.PK)
from Record record0_
join Record record1_ on record1_.EntityFK=record0_.EntityFK 
join RecordTextValue textvalues2_ on record1_.PK=textvalues2_.RecordFK 
where textvalues2_.FieldFK = 'c08a9473-1308-11e0-b39d-0021869a7edf' 
and record0_.RecordTypeFK='c2a0ffa5-d23b-11db-9ea3-000e7f30d6a2' 
and (textvalues2_.Value like 'o%' escape '~')

What can I do to get reasonable performance from this query? I would be happy to supply the table definitions if that would help.

Kristen
Test

22859 Posts

Posted - 2010-12-30 : 13:45:58

If you change the JOINs to EXISTS and just use a COUNT(*) does that help?

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 14:00:55

quote:
Originally posted by Kristen

If you change the JOINs to EXISTS and just use a COUNT(*) does that help?

That does not help. The execution plan appears to be identical.

Here's the modified query I tried:

select count(*)
from Record record0_
where record0_.RecordTypeFK='c2a0ffa5-d23b-11db-9ea3-000e7f30d6a2' 
and exists (
	select *
	from Record record1_
	join RecordTextValue textvalues2_ on record1_.PK=textvalues2_.RecordFK 
	where textvalues2_.FieldFK = 'c08a9473-1308-11e0-b39d-0021869a7edf' 
	and (textvalues2_.Value like 'o%' escape '~')
	and record1_.EntityFK = record0_.EntityFK
)

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 14:15:40

I also tried this as a subquery with no improvement. Again the execution plan is nearly identical.

select count(*)
from Record record0_
where record0_.EntityFK in (
	select record1_.EntityFK
	from Record record1_
	join RecordTextValue textvalues2_ on record1_.PK=textvalues2_.RecordFK
	where textvalues2_.FieldFK = 'c08a9473-1308-11e0-b39d-0021869a7edf' 
	and (textvalues2_.Value like 'o%' escape '~')
)
and record0_.RecordTypeFK='c2a0ffa5-d23b-11db-9ea3-000e7f30d6a2'

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-12-30 : 14:18:51

How many rows are in each table? Also, is there a reason to self join the Record table?

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 15:02:31

Record table contains 800,000 rows. RecordTextValue table contains 90,000 rows. The semantics of the query require a self-join. I'm using an EAV pattern, if that is useful to know. In English, the query is something like: "Find all Records of type X where the record's entity has a name that starts with the letter O".

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources