Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 DISTINCT for only part of the results?

Author  Topic 

aiken
Aged Yak Warrior

525 Posts

Posted - 2003-02-22 : 13:33:20
I've got a piece of my app that shows the 3 most recent snitz forum topics (much like the fromt page of sqlteam.com). What I'd like, though, is to ensure that it only picks one topic from any particular forum. So I want the three most recent topics which have distinct forum_id's.

The query now is simple:
  select top 3 t_subject,topic_id,t_replies
from snitz_topics WITH(NOLOCK)
where cast(t_subject as varbinary(200))<>cast(upper(t_subject) as varbinary(200))
and (forum_id<>1) and (forum_id<>35) and (forum_id<>36) and (forum_id<>40)
order by t_last_post desc


(It screens out posts with topics in all caps, and from a couple of admin type forums that shouldn'r be displayed).

Is there any way to get this to do what I want without using a temp table or table variable and doing a select top 50 / delete / select top 3 kind of operation?

Thanks in advance!
-b

Bambola
Posting Yak Master

103 Posts

Posted - 2003-02-22 : 16:22:56
SELECT s.t_subject
, s.topic_id
, s.t_replies
, s.forum_id
FROM (
SELECT DISTINCT TOP 3 forum_id
FROM snitz_topics WITH (NOLOCK)
WHERE cast(t_subject as varbinary(200)) <> cast(upper(t_subject) as varbinary(200))
AND forum_id NOT IN (1, 35, 36, 40)
) subqry
INNER JOIN snitz_topics s with (NOLOCK)
ON s.forum_id = subqry.forum_id
ORDER BY t_last_post desc

Bambola.

Go to Top of Page

aiken
Aged Yak Warrior

525 Posts

Posted - 2003-02-23 : 02:05:06
Thanks for the idea, but it's not working for me. The way I see it, this is finding the top 3 forums, probably based on the clustered index since the subquery doesn't specify an ORDER BY, and then finding the most recent topics in those forums. If a single forum has the 10 most recent topics, this query returns those first.

Cheers
-b

Go to Top of Page

Bambola
Posting Yak Master

103 Posts

Posted - 2003-02-23 : 03:14:10
You are right. I was not paying attention.
Try this.

SELECT s.t_subject
, s.topic_id
, s.t_replies
, s.forum_id
FROM (
SELECT DISTINCT TOP 3 forum_id, t_last_post
FROM snitz_topics WITH (NOLOCK)
WHERE cast(t_subject as varbinary(200)) <> cast(upper(t_subject) as varbinary(200))
AND forum_id NOT IN (1, 35, 36, 40)
order by t_last_post
) subqry
INNER JOIN snitz_topics s with (NOLOCK)
ON s.forum_id = subqry.forum_id

Bambola.


Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-02-23 : 11:07:11
-- Something to play with.

declare @t table
(topic_id int primary key,
forum_id int,
subject varchar(50),
postDate datetime)

insert into @t (forum_id, topic_id, subject, postdate)
select 1,1,'Forum1: First Topic', '1/1/2000' union
select 1,2,'Forum1: Second Topic','2/1/2000' union
select 1,3,'Forum1: Third topic','3/1/2000' union
select 2,4,'Forum2: First Topic', '4/1/2000' union
select 2,5,'Forum2: Second Topic','5/1/2000' union
select 2,6,'Forum2: Third topic','6/1/2000' union
select 3,7,'Forum3: First Topic', '6/1/2000' union
select 3,8,'Forum3: Second Topic','7/1/2000' union
select 3,9,'Forum3: Third topic','8/1/2000' union
select 4,10,'Forum4: First Topic', '4/1/2000' union
select 4,11,'Forum4: Second Topic','5/1/2000' union
select 4,12,'Forum4: Third topic','6/1/2000'

-- We want the latest topic from each forum, based on postdate.
-- From that, we only want to return the top 3 forums, based on
-- the latest post in each of those forums.

-- Only problem: if two topics are both posted at the exact same
-- latest posttime in the same forum, this would return 2 records
-- from that same forum.

select top 3 a.* FROM
@t a
inner join
(
select forum_id, max(postdate) as lastpost
from
@t
group by forum_id
) b
on
a.forum_id = b.forum_id and
a.postdate = b.lastpost
order by a.postdate DESC


-- demo of the potential problem with this:

delete from @t

insert into @t (forum_id, topic_id, subject, postdate)
select 1,1,'Forum1: First Topic', '1/1/2000' union
select 1,2,'Forum1: Second Topic','2/1/2000' union
select 1,3,'Forum1: Third topic','3/1/2000' union
select 2,4,'Forum2: First Topic', '4/1/2000' union
select 2,5,'Forum2: Second Topic','5/1/2000' union
select 2,6,'Forum2: Third topic','6/1/2000' union
select 3,7,'Forum3: First Topic', '6/1/2000' union
select 3,8,'Forum3: Second Topic','8/1/2000' union
select 3,9,'Forum3: Third topic','8/1/2000' union
select 4,10,'Forum4: First Topic', '4/1/2000' union
select 4,11,'Forum4: Second Topic','5/1/2000' union
select 4,12,'Forum4: Third topic','6/1/2000'

-- note that in the above, forum 3 has two latest topics posted
-- at the exact same date/time.

-- now, run the same SQL on the above data to see the potential
-- problem.

select top 3 a.* FROM
@t a
inner join
(
select forum_id, max(postdate) as lastpost
from
@t
group by forum_id
) b
on
a.forum_id = b.forum_id and
a.postdate = b.lastpost
order by a.postdate DESC

-- Not sure if this is anything to worry about; especially
-- if we are dealing with datetime datatypes that store
-- the millisecond of each post.

- Jeff
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2003-02-23 : 12:02:45
You could use the TOP subquery in that other thread you were commenting on and do something like this:

SELECT TOP 3 *
FROM @t AS A
WHERE topic_id = (
SELECT TOP 1 topic_id
FROM @t
WHERE forum_id = A.forum_id
ORDER BY postdate DESC
)
ORDER BY postdate DESC

 
However, when I tried this in a somewhat similar case, the performance was poor since SQL Server ended up doing a loop join with a Sort/TopN on the inner loop. In that case, it was much faster to bite the bullet and put the extra layer of subquery in:


SELECT TOP 3 *
FROM @t AS A
WHERE topic_id = (
SELECT MIN(topic_id)
FROM @t AS B
WHERE postdate = (
SELECT MAX(postdate)
FROM @t AS C
WHERE B.forum_id = C.forum_id
)
AND A.forum_id = B.forum_id
)
ORDER BY postdate DESC



Go to Top of Page

Bambola
Posting Yak Master

103 Posts

Posted - 2003-02-23 : 12:28:08
I knew I should have gone to sleep!

Bambola.

Go to Top of Page

aiken
Aged Yak Warrior

525 Posts

Posted - 2003-02-24 : 17:23:48
Thanks, everyone! I think I follow it, and will try the implementation and report back. This would be a whole lot easier if SQL allowed DISTINCT to apply to a column rather than the whole query; something like:
 select top 3 _subject,topic_id,t_replies, DISTINCT forum_id
from snitz_topics WITH(NOLOCK)
where cast(t_subject as varbinary(200))<>cast(upper(t_subject) as varbinary(200))
and (forum_id<>1) and (forum_id<>35) and (forum_id<>36) and (forum_id<>40)
order by t_last_post desc
Cheers
-b

Go to Top of Page
   

- Advertisement -