Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Selecting Top 300 Distinct Records

Author  Topic 

d473566
Starting Member

23 Posts

Posted - 2003-07-11 : 09:37:31
This is the query that is created (created dynamically from a web page). All works fine, but I only want to select the top 300 DISTINCT package.ID, from a result where there are multiple rows returned for each package.id, because there are multiple categories (sometimes 2, sometimes 3) associated with each package.

See the link below for the results and query.

Thanks in advance, for any help/insight...

[url]http://209.47.167.96/search.rpt[/url]

Amethystium
Aged Yak Warrior

701 Posts

Posted - 2003-07-11 : 09:52:28
Have a GROUP BY clause then join on the Category table joining on Package ID and CreationDate?


---------------
Shadow to Light

Edited by - Amethystium on 07/11/2003 09:55:09
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-11 : 10:00:12
Well, I had to responde to a numerological brethern..

quote:

I only want to select the top 300 DISTINCT package.ID



SELECT TOP 300 PackageId

quote:

from a result where there are multiple rows returned for each package.id



FROM myTable
GROUP BY PackageId
HAVING COUNT(*) > 1



Is that close?




Brett

8-)

Edited by - x002548 on 07/11/2003 10:00:58
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-14 : 10:16:08
Thanks, but it is not exactly what I am after.

The number of rows to select is variable- what I want is to select is 300 packages. Each package may have 1-4 rows associated with it, depending upon how many categories are linked to the package. Therefore, selecting 300 rows will not work, because it assumes 1 row per package.

I have tried your suggestion, and posted the results here:
http://209.47.167.96/search2.rpt

What it yields is 150 distinct packages- with each package having 2 rows...

My guess is that I will need to use a temp table to dump the results to, and then pull from that. Of course, if I can do it in one query, I would prefer that.

Any insight?

Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-07-14 : 11:02:06
how about:


select distinct packageID
from
(select top 300 packageID
from
Sometable
Order by .... something ....
) a

that will return the distinct packages that make up the top 300 totals rows in your table. be sure to include an ORDER BY or the TOP clause makes no sense.


- Jeff
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-14 : 11:15:26
Ok,

What I think I'm hearing is

1. You want the first 300 distinct packages..

SELECT TOP 300 DISTINCT PackageId FROM myTable ORDER BY something

2. Then you want all the Details For thos packages:

SELECT * FROM myTable WHERE PackageId IN (
SELECT TOP 300 DISTINCT PackageId FROM myTable ORDER BY something
)

No?



Brett

8-)
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-07-14 : 12:37:11
ohh ,, i think brett's right, i think i had it backwards...

- Jeff
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-14 : 13:21:07
Jeff,

That's what it sounds like what he wants...yes?

Now the real question?

Why do you want only the "TOP" 300 packages

What about the rest?

What will you ever do with them?

Are you creating "batches"? If so, are you removingf the first 300 you get?

If not, you'll aways get the same 300.

Let us know what you're doing...



Brett

8-)
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-15 : 10:49:46
It is a dynamic search, and it is returning to a Flash front end. To save time (we dont want to process 2000 packages to only show 300...), we are going to display a max of 300 packages, and push a message saying the max has been returned, and to refine your search.

I am going to try the approach that was suggested, and post the outcome.

Thanks for the help thus far...

Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-15 : 11:24:32
Understand though, that you're distinct 300 is really 300*number of details per package...

If the average is 3, that's 900 rows.




Brett

8-)
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-15 : 11:50:58
That's right. It may be 900 rows, or 1000, or whatever, depending upon how many rows are tied to each package.

I have tried the code you suggested, but am getting the following error:
Server: Msg 156, Level 15, State 1, Line 35
Incorrect syntax near the keyword 'DISTINCT'.

which refers to this piece of code:

SELECT * FROM @results WHERE ID IN (
SELECT TOP 300 DISTINCT ID FROM @results ORDER BY ID)

Am I missing something really obvious here (been looking at this query far too long...)?

Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-15 : 12:16:13
Whats @results? you need a table name there...

(why does my dynamic sql alarm keep going off?)



Brett

8-)
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-15 : 12:37:24
Sorry--should have explained what I am doing. I put the results of the query into a temp table (@results), and then select the 300 distinct records from there.


Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-15 : 12:52:07
Is it possible to post the whole script (and nothing but the script, so help you god?)



Brett

8-)
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-15 : 17:02:12
OK- you asked for it...here is the dynamic part.




ALTER PROCEDURE PackageSearchListNEW(
@AreaID INT,
@listCategoryID VARCHAR(1000),
@packageName VARCHAR(500),
@packageNameIsExact BIT,
@CreatedStartDate smallDateTime,
@CreatedEndDate smallDateTime,
@CreatedByUserID INT,
@ForwardedStartDate smallDateTime,
@ForwardedEndDate smallDateTime,
@ForwardedByUserID INT,
@StatusPending INT,
@StatusByUserID INT,
@StatusClaimed INT,
@StatusFinalized INT,
@DueDate smallDateTime
)
AS
--these variables are used to create the SQL
DECLARE @part1 VARCHAR(2000)
DECLARE @part2 VARCHAR(2000)
DECLARE @part3 VARCHAR(150)
DECLARE @part4 VARCHAR(500)
DECLARE @part4b VARCHAR(500)
DECLARE @part4c VARCHAR(50)
DECLARE @part5 VARCHAR(150)
DECLARE @part6 VARCHAR(150)
DECLARE @part7 VARCHAR(300)
DECLARE @part7a VARCHAR(100)
DECLARE @part8 VARCHAR(100)
DECLARE @part9 VARCHAR(100)
DECLARE @SQL VARCHAR(8000)
SET @part1 = ''
SET @part2 = ''
SET @part3 = ''
SET @part4 = ''
SET @part4b = ''
SET @part4c = ''
SET @part5 = ''
SET @part6 = ''
SET @part7 = ''
SET @part7a = ''
SET @part8 = ''
SET @part9 = ''
-----------------------------------------------------
--Main Part of the search query
--gets loaded into @part1
-----------------------------------------------------
SET @part1 = '
DECLARE @results TABLE(
ID INT,
packageName varchar(250),
CreatedDateTime smalldatetime,
DueDate smalldatetime,
isUrgent INT,
CategoryName VARCHAR(250),
CategoryID INT,
[Sequence] INT,
Status INT,
StatusBy VARCHAR(250)) '
SET @part1 =@part1 +
'INSERT INTO @results
SELECT dbo.Package.ID,
dbo.Package.Name AS PackageName,
dbo.Package.createdDateTime,
dbo.Package.DueDate,
dbo.Package.isUrgent,
dbo.Category.Name AS CategoryName,
dbo.Category.ID AS CategoryID,
dbo.AreaTier.Sequence,
(dbo.getPackageStatus(dbo.Package.ID)) AS Status,
(dbo.getPackageStatusBy(dbo.Package.ID)) AS statusBy
FROM dbo.Package
INNER JOIN dbo.PackageCategory ON dbo.Package.ID = dbo.PackageCategory.PackageID
INNER JOIN dbo.Category ON dbo.PackageCategory.CategoryID = dbo.Category.ID
INNER JOIN dbo.AreaTier ON dbo.Package.AreaID = dbo.AreaTier.AreaID AND dbo.Package.AreaID = dbo.AreaTier.AreaID
INNER JOIN dbo.Tier ON dbo.Category.TierID = dbo.Tier.ID AND dbo.AreaTier.TierID = dbo.Tier.ID
WHERE (dbo.Package.AreaID = ' + CAST(@AreaID AS VARCHAR) + ')
AND (dbo.Package.UnderConstruction = 0)
AND (dbo.Package.Template = 0) '
-----------------------------------------------------
--Criteria for dynamic search on category ID's
--gets loaded into @part2
-----------------------------------------------------
IF LEN(@listCategoryID) > 0
BEGIN
SET @part2 =' AND ('
DECLARE @array VARCHAR(2001) -- @Array is the array we wish to parse
SET @array = @listCategoryID
DECLARE @separator VARCHAR(1) -- @Separator is the separator charactor such as a comma
SET @separator = ','
DECLARE @separator_position int -- This is used to locate each separator character
DECLARE @array_value VARCHAR(10) -- this holds each array value as it is returned
-- For my loop to work I need an extra separator at the end. I always look to the
-- left of the separator character for each array value
SET @array = @array + @separator

-- Loop through the string searching for separtor characters
WHILE patindex('%' + @separator + '%' , @array) <> 0
BEGIN
-- patindex matches the a pattern against a string
SELECT @separator_position = patindex('%' + @separator + '%' , @array)
SELECT @array_value = left(@array, @separator_position - 1)
-- This is where you process the values passed.
-- Replace this select statement with your processing
-- @array_value holds the value of this element of the array
SET @part2 = @part2 + '(dbo.isPackageAssigned(dbo.Package.ID,' + CAST(@array_value AS VARCHAR) + ') = 1) AND '
-- This replaces what we just processed with and empty string
SELECT @array = stuff(@array, 1, @separator_position, '')
END
SET @part2 = @part2 + '0=0)' --this here because the loop will always have an additional "and" on the end
--the barcket closes out this subquery section
END
-----------------------------------------------------
--Criteria for dynamic search on package name
--gets loaded into @part3
-----------------------------------------------------
IF LEN(@packageName) > 0
SET @packageName = lower(@packageName)
BEGIN
IF @packageNameIsExact=1
BEGIN
SET @part3 = ' AND lower(dbo.Package.Name) = ' + CHAR(39) + @packageName + CHAR(39)
END
ELSE
BEGIN
SET @part3 = ' AND lower(dbo.Package.Name) LIKE ' + CHAR(39) + '%' + @packageName + '%' + CHAR(39)
END
END
-----------------------------------------------------
--Criteria for dynamic search on creation dates
--gets loaded into @part4
-----------------------------------------------------
IF @CreatedStartDate IS NOT NULL
BEGIN
SET @part4 = 'AND dbo.Package.ID IN (SELECT DISTINCT PackageID AS ID
FROM dbo.PackageHistory
WHERE ((EventType = ' + CHAR(39) + 'Created' + CHAR(39) + ')
)
AND (EventDateTime >= ' + CHAR(39) + CAST(@CreatedStartDate AS VARCHAR) + CHAR(39) + ' AND EventDateTime <=' + CHAR(39) + CAST(@CreatedEndDate AS VARCHAR) + CHAR(39) + ')'
IF @CreatedByUserID <> 0
BEGIN
SET @part4 = @part4 + ' AND (EventByUserID = ' + CAST(@CreatedByUserID AS VARCHAR) + ')'
END
SET @part4 = @part4 + ')' --this bracket ends the select subquery for the createddate section
END
-----------------------------------------------------
--Criteria for dynamic search on forwarded dates
--gets loaded into @part4b
-----------------------------------------------------
IF @ForwardedStartDate IS NOT NULL
BEGIN
SET @part4b = 'AND dbo.Package.ID IN (SELECT DISTINCT PackageID AS ID
FROM dbo.PackageHistory
WHERE ((EventType = ' + CHAR(39) + 'Forwarded' + CHAR(39) + ')
)
AND (EventDateTime >= ' + CHAR(39) + CAST(@ForwardedStartDate AS VARCHAR) + CHAR(39) + ' AND EventDateTime <=' + CHAR(39) + CAST(@ForwardedEndDate AS VARCHAR) + CHAR(39) + ')'
IF @ForwardedByUserID <> 0
BEGIN
SET @part4b = @part4b + ' AND (EventByUserID = ' + CAST(@ForwardedByUserID AS VARCHAR) + ')'
END
SET @part4b = @part4b + ')' --this bracket ends the select subquery for the createddate section
END
--if the user has selected pending AND claimed AND finalized (or any combination)
--we need to put it all inside a clause

IF @StatusPending + @StatusClaimed + @StatusFinalized > 0
SET @part4c = ' AND ('
-----------------------------------------------------
--Criteria for dynamic search on status = pending
--gets loaded into @part5
-----------------------------------------------------
IF @StatusPending = 1
BEGIN
SET @part5= ' (dbo.getPackageStatus(dbo.Package.ID) = 0'
IF @StatusByUserID <> 0
BEGIN
SET @part5= @part5 + ' AND (dbo.isPackageRecipient(' + CAST(@StatusByUserID AS VARCHAR) + ', dbo.Package.ID) > 0)'
END
SET @part5= @part5 + ')'

IF @StatusClaimed + @StatusFinalized > 0
SET @part5= @part5 + ' OR '
END
-----------------------------------------------------
--Criteria for dynamic search on status = claimed
--gets loaded into @part6
-----------------------------------------------------
IF @StatusClaimed = 1
BEGIN
SET @part6 = '(dbo.getPackageStatus(dbo.Package.ID) = 1'
IF @StatusByUserID <> 0
BEGIN
SET @part6=@part6 + ' AND (dbo.getPackageStatusByUserID(dbo.Package.ID) = ' + CAST(@StatusByUserID AS VARCHAR) + ')'
END
SET @part6=@part6 + ')'
IF @StatusFinalized = 1
SET @part6= @part6 + ' OR '
END
-----------------------------------------------------
--Criteria for dynamic search on status = finalized
--gets loaded into @part7
-----------------------------------------------------
IF @StatusFinalized = 1
BEGIN
SET @part7 = ' (dbo.getPackageStatus(dbo.Package.ID) = 2'
IF @StatusByUserID <> 0
BEGIN
SET @part7=@part7 + ' AND dbo.Package.ID IN
(SELECT DISTINCT PackageID AS ID FROM dbo.PackageHistory
WHERE EventType = ' + char(39) + 'Finalized' + char(39) + '
AND EventByUserID = ' + CAST(@StatusByUserID AS VARCHAR) + ')'
END
IF @StatusPending + @StatusClaimed + @StatusFinalized > 0
SET @part7=@part7 + ')'
END
IF @StatusPending + @StatusClaimed + @StatusFinalized > 0
SET @part7a=@part7a + ')'
-----------------------------------------------------
--Criteria for dynamic search on due date
--gets loaded into @part8
-----------------------------------------------------
IF @DueDate IS NOT NULL --<> ''
BEGIN
--when we cast the date as a string, it adds a time of 12:00AM to it.
--We want to change the time to 11:59PM, so it includes the entire day in the search
DECLARE @DueDateString VARCHAR(50)
SET @DueDateString= SUBSTRING(CAST(@DueDate AS VARCHAR),1,(LEN(@DueDate)-7)) + '11:59:00PM'

SET @part8 = ' AND dbo.Package.DueDate <= ' + CHAR(39) + @DueDateString + CHAR(39)
END
-----------------------------------------------------
--Final Clause
--gets loaded into @part9
-----------------------------------------------------
SET @part9 = ' ORDER BY dbo.Package.createdDateTime,dbo.Package.Name'
SET @SQL = (@part1 + @part2 +@part3 + @part4 + @part4b + @part4c + @part5 + @part6 + @part7 + @part7a + @part8 + @part9)
SET @SQL = @SQL + ' SELECT * FROM @results WHERE Id IN (
SELECT TOP 300 DISTINCT ID FROM @results ORDER BY Id)'

EXEC (@SQL)

PRINT(@SQL)



and here is what it boils down to when I pass in values from the search form:
EXEC packageSearchListNEW
@AreaID = 4,
@listCategoryID='',
@packageName = '',
@PackageNameIsExact = 0,
@CreatedStartDate = NULL,
@CreatedEndDate = NULL,
@CreatedByUserID = 100000406,
@ForwardedStartDate = NULL,
@ForwardedEndDate = NULL,
@ForwardedByUserID = 100000406,
@StatusPending = 1,
@StatusClaimed = 1,
@StatusFinalized = 1,
@StatusByUserID = 100000406,
@DueDate = NULL

results in the following:
Server: Msg 156, Level 15, State 1, Line 34
Incorrect syntax near the keyword 'DISTINCT'.

DECLARE @results TABLE(
ID INT,
packageName varchar(250),
CreatedDateTime smalldatetime,
DueDate smalldatetime,
isUrgent INT,
CategoryName VARCHAR(250),
CategoryID INT,
[Sequence] INT,
Status INT,
StatusBy VARCHAR(250)) INSERT INTO @results
SELECT dbo.Package.ID,
dbo.Package.Name AS PackageName,
dbo.Package.createdDateTime,
dbo.Package.DueDate,
dbo.Package.isUrgent,
dbo.Category.Name AS CategoryName,
dbo.Category.ID AS CategoryID,
dbo.AreaTier.Sequence,
(dbo.getPackageStatus(dbo.Package.ID)) AS Status,
(dbo.getPackageStatusBy(dbo.Package.ID)) AS statusBy
FROM dbo.Package
INNER JOIN dbo.PackageCategory ON dbo.Package.ID = dbo.PackageCategory.PackageID
INNER JOIN dbo.Category ON dbo.PackageCategory.CategoryID = dbo.Category.ID
INNER JOIN dbo.AreaTier ON dbo.Package.AreaID = dbo.AreaTier.AreaID AND dbo.Package.AreaID = dbo.AreaTier.AreaID
INNER JOIN dbo.Tier ON dbo.Category.TierID = dbo.Tier.ID AND dbo.AreaTier.TierID = dbo.Tier.ID
WHERE (dbo.Package.AreaID = 4)
AND (dbo.Package.UnderConstruction = 0)
AND (dbo.Package.Template = 0) AND lower(dbo.Package.Name) LIKE '%%' AND ( (dbo.getPackageStatus(dbo.Package.ID) = 0 AND (dbo.isPackageRecipient(100000406, dbo.Package.ID) > 0)) OR (dbo.getPackageStatus(dbo.Package.ID) = 1 AND (dbo.getPackageStatusByUserID(dbo.Package.ID) = 100000406)) OR (dbo.getPackageStatus(dbo.Package.ID) = 2 AND dbo.Package.ID IN
(SELECT DISTINCT PackageID AS ID FROM dbo.PackageHistory
WHERE EventType = 'Finalized'
AND EventByUserID = 100000406))) ORDER BY dbo.Package.createdDateTime,dbo.Package.Name SELECT * FROM @results WHERE Id IN (
SELECT TOP 300 DISTINCT ID FROM @results ORDER BY ID)













Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-16 : 09:25:09
Wow...

Be careful what you wish for..

OK,

Let's take this in a slightly differnet direction.

Since your creating a local table variable, why not just make a table variable with the distinct list in it already?

If it has to be a new one, then so be it. Tiy can use the existing work that was done...

INSERT INTO @NewTable([ID]) SELECT DISCTINCT [ID] FROM @Results

Then either Join to the data, or keep the IN Clause...you can now remove the DISTINCT Clause..

But my GUESS is that it's not the DISTINCT syntax.



Brett

8-)
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-16 : 11:08:36
Right on, that worked perfectly. I create the first table with all the rows, and then create a second table with just the distinct ID's, and a counter field which is auto-incremented. Lastly, I select all the rows where the ID's exist in the second table and the counter is less than/equal to 300:

INSERT INTO @results
blah blah blah...

INSERT INTO @newTable(ID)
SELECT DISTINCT(ID) FROM @results

SELECT *
FROM @results
WHERE ID IN(SELECT ID FROM @newTable WHERE counter <=300)


Thanks for your help Brett, much appreciated. I had a feeling that it was going to have to be done this way, but needed a bit more direction. I thought speed would be an issue (isn't it always?) but the few tests I ran seem to be OK, all things considered.

Thanks again.
Derek

Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-16 : 11:26:40
Best money I ever spent

quote:

(why does my dynamic sql alarm keep going off?)



I gotta say that I'm not a fan of dynamic sql...it'll be a poor performer usually..

But hey, whatever floats your boat.

I know you've invested a lot of time in to this, but it coyld be written differently.

For example, you're looping to add values to an IN clause (I think).

Why not create a local table variable, populate it wit those values and join to it?

There's got to be a better way.

Bet hey, glad I could help and Good luck.



Brett

8-)
Go to Top of Page

d473566
Starting Member

23 Posts

Posted - 2003-07-16 : 11:54:44
Yeah, I am sure there is a better way. I inherited this code and was tasked with 'making it work'. Now that it works, I am going to try and tune it (now that I have bought myself some time...)


Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-07-16 : 15:26:21
quote:

Yeah, I am sure there is a better way. I inherited this code and was tasked with 'making it work'. Now that it works, I am going to



RUN AWAY

Don't touch it. It's working, right?

Close the lid quietly, and hope noone is looking

move on to something else

You open that code up, you'll have to regression test the shit out of it, to prove your changes produce IDENTICAL results..

and since you said you inherited it, I can now say

That some ugly code...



Brett

8-)
Go to Top of Page
   

- Advertisement -