Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 SELECT COUNT(DISTINCT) ridiculously slow

Author  Topic 

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 13:39:37
I need to count the number of distinct rows returned by a query. The execution plan for this SQL statement includes a "table spool" (temporary table) with the following attributes:
  • Estimated number of rows: 5262
  • Actual number of rows: 347,744,315
  • Estimated number of executions: 282
  • Actual number of executions: 49,988
This query takes 100 seconds to count 111 rows. Clearly, the optimizer has chosen a poor plan. If I fetch the actual rows (instead of the count), the query returns in less than a second (and uses an entirely different execution plan, with no temp table).

I have analyzed the query in the tuning advisor and created the recommended statistics and indexes. This has no effect on the query plan or performance.

The query itself is:

select count(distinct record0_.PK)
from Record record0_
join Record record1_ on record1_.EntityFK=record0_.EntityFK
join RecordTextValue textvalues2_ on record1_.PK=textvalues2_.RecordFK
where textvalues2_.FieldFK = 'c08a9473-1308-11e0-b39d-0021869a7edf'
and record0_.RecordTypeFK='c2a0ffa5-d23b-11db-9ea3-000e7f30d6a2'
and (textvalues2_.Value like 'o%' escape '~')

What can I do to get reasonable performance from this query? I would be happy to supply the table definitions if that would help.

Kristen
Test

22859 Posts

Posted - 2010-12-30 : 13:45:58
If you change the JOINs to EXISTS and just use a COUNT(*) does that help?
Go to Top of Page

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 14:00:55
quote:
Originally posted by Kristen

If you change the JOINs to EXISTS and just use a COUNT(*) does that help?



That does not help. The execution plan appears to be identical.

Here's the modified query I tried:
select count(*)
from Record record0_
where record0_.RecordTypeFK='c2a0ffa5-d23b-11db-9ea3-000e7f30d6a2'
and exists (
select *
from Record record1_
join RecordTextValue textvalues2_ on record1_.PK=textvalues2_.RecordFK
where textvalues2_.FieldFK = 'c08a9473-1308-11e0-b39d-0021869a7edf'
and (textvalues2_.Value like 'o%' escape '~')
and record1_.EntityFK = record0_.EntityFK
)
Go to Top of Page

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 14:15:40
I also tried this as a subquery with no improvement. Again the execution plan is nearly identical.
select count(*)
from Record record0_
where record0_.EntityFK in (
select record1_.EntityFK
from Record record1_
join RecordTextValue textvalues2_ on record1_.PK=textvalues2_.RecordFK
where textvalues2_.FieldFK = 'c08a9473-1308-11e0-b39d-0021869a7edf'
and (textvalues2_.Value like 'o%' escape '~')
)
and record0_.RecordTypeFK='c2a0ffa5-d23b-11db-9ea3-000e7f30d6a2'
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-12-30 : 14:18:51
How many rows are in each table? Also, is there a reason to self join the Record table?
Go to Top of Page

brianberns
Starting Member

10 Posts

Posted - 2010-12-30 : 15:02:31
Record table contains 800,000 rows. RecordTextValue table contains 90,000 rows. The semantics of the query require a self-join. I'm using an EAV pattern, if that is useful to know. In English, the query is something like: "Find all Records of type X where the record's entity has a name that starts with the letter O".
Go to Top of Page
   

- Advertisement -