Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Convert Function to Set Based

Author  Topic 

midavis
Starting Member

23 Posts

Posted - 2010-07-30 : 11:33:00
I have a function that returns whether an item is the latest one in the system. I will use CheckItemId as the item id we want to check on.

Table1 stores all items.
Table2 stores a relationship between the revision of an item and when it was done. This table will include the ItemId for the revision that ties back to Table1 as well as the original ItemId (As OrigItemId) this was based from.
Table3 is similar to Table3 but only occurs for certain types of items and stores different types of data.

The logic is this.
1. Does this ItemId exist in Table2 as either the ItemId or OrigItemId? If not look in Table3
2. Does this ItemId exist in Table3 as either the ItemId or OrigItemId? If not then the ItemId is the original Item
3. If the item exists in either Table2 or Table3 we need to grab all occurrences from each table and order them by date desc. We will then grab the latest ItemId from this result. If this ItemId matches the CheckItemId then this is the latest item.

Below is the solution I have currently, but would like to try and use a set based approach. Currently when I do a query I have to call this function for each ItemId to grab the latest Item and a query for 70,000+ rows takes 23 seconds. No more query optimizations need to be done.

As a side note, it would be great to be able to put this in a view and index it because a lot of reports will use this information. Also, I know I can update this query a bit to get rid of the table variable, but I want to focus on how to get this as a set based solution.

Thanks in advance.



DECLARE @CheckItemId int

-- Declare the return variable here
DECLARE @Result bit
DECLARE @OrigItemId int
DECLARE @ItemId int

SET @Result = 0 -- Set the default return value to 0

-- try to find the orig ItemId from Table2
Select @OrigItemId = OrigItemId FROM Table2 Where ItemId = @CheckItemId or OrigItemId = @CheckItemId

-- this does not exist in Table2 so try Table3
if @OrigItemId is null
Select @OrigItemId = OrigItemId FROM Table3 Where ItemId = @CheckItemId or OrigItemId = @CheckItemId

-- this does not exist in either table so this is the original item and the current version
if @OrigItemId IS NULL
set @Result = 1
else
BEGIN

-- create a table to store the items from each table
DECLARE @tmpData TABLE (ItemId int , Dt datetime)

Insert Into @tmpData
SELECT ItemId, EffDt
FROM Table2
WHERE OrigItemId = @OrigItemId

Insert Into @tmpData
SELECT ItemId, EffDt
FROM Table3
WHERE OrigItemId = @OrigItemId

-- grab the latest item
SELECT TOP 1 @ItemId = ItemId From @tmpData order by Dt DESC

if @ItemId = @CheckItemId
set @Result = 1
END

print @Result

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-07-30 : 15:04:58
SELECT TOP(1) *
FROM (
SELECT 1 AS TableNum, ItemID, EffDt FROM Table1 UNION ALL
SELECT 2 AS TableNum, ItemID, EffDt FROM Table2 UNION ALL
SELECT 3 AS TableNum, ItemID, EffDt FROM Table3
) AS d
ORDER BY TableNum, EffDt DESC



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

vijayisonly
Master Smack Fu Yak Hacker

1836 Posts

Posted - 2010-07-30 : 15:43:45
Since you alo need to check against OrigItemId...maybe even this?
SELECT TOP(1) *
FROM (
SELECT 1 AS TableNum, ItemID, EffDt FROM Table1 UNION ALL
SELECT 2 AS TableNum, ItemID, EffDt FROM Table2 UNION ALL
SELECT 3 AS TableNum, OrigItemId as [ItemID], EffDt FROM Table2 UNION ALL
SELECT 4 AS TableNum, ItemID, EffDt FROM Table3
SELECT 5 AS TableNum, OrigItemId as [ItemID], EffDt FROM Table3
) AS d
ORDER BY TableNum, EffDt DESC

Go to Top of Page

midavis
Starting Member

23 Posts

Posted - 2010-07-30 : 16:18:10
Where does my CheckItemId fit into the query? Also, if I was wanting to get the full list of all ItemIds from Table1 that were the most recent?
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-07-30 : 17:04:50
Please post some relevant sample data to fully demonstrate all possible outcomes.



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

midavis
Starting Member

23 Posts

Posted - 2010-07-30 : 17:12:00
Table1
ItemID
1
2
3
4
5
6
7
8
9
10

Table 2
ItemID OrigPolID EffDt
3 2 2/1/2010
4 2 3/1/2010
7 5 2/2/2010
8 6 6/1/2010

Table 3
ItemID OrigPolID EffDt
9 2 2/15/2010
10 2 2/22/2010



If i want to run this query against each of these item id to check if it is the latest this is what i get

ItemID IsLatest
1 Y
2 N
3 N
4 Y
5 N
6 N
7 Y
8 Y
9 N
10 N
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-07-30 : 17:30:16
[code]SELECT TOP(1) CAST(COUNT(1) + CASE @CheckItemID WHEN ItemID THEN 1 ELSE -1 END AS BIT)
FROM (
SELECT ItemID,
EffDT,
1 AS Priority
FROM Table2
WHERE @CheckItemID IN (OrigItemID, ItemID)

UNION ALL

SELECT ItemID,
EffDT,
2 AS Priority
FROM Table3
WHERE @CheckItemID IN (OrigItemID, ItemID)
) AS d
ORDER BY Priority,
EffDT DESC[/code]

N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

midavis
Starting Member

23 Posts

Posted - 2010-07-30 : 17:34:03
Sorry, this is giving an error.

Column 'd.ItemID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Also, I don't see anything tying back to Table1. How would I do this for every item in the Table1 table?
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-07-30 : 17:56:32
[code]DECLARE @Table1 TABLE
(
ItemID INT
)

INSERT @Table1
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9 UNION ALL
SELECT 10

DECLARE @Table2 TABLE
(
ItemID INT,
OrigPolID INT,
EffDt DATETIME
)

INSERT @Table2
SELECT 3, 2, '2/1/2010' UNION ALL
SELECT 4, 2, '3/1/2010' UNION ALL
SELECT 7, 5, '2/2/2010' UNION ALL
SELECT 8, 6, '6/1/2010'

DECLARE @Table3 TABLE
(
ItemID INT,
OrigPolID INT,
EffDt DATETIME
)

INSERT @Table3
SELECT 9, 2, '2/15/2010' UNION ALL
SELECT 10, 2, '2/22/2010'

-- CREATE VIEW dbo.vwThisIsMyView
-- AS

SELECT t1.ItemID,
CASE
WHEN f.OrigPolID IS NULL THEN 'Y'
WHEN g.ItemID = t1.ItemID THEN 'Y'
ELSE 'N'
END AS IsLatest
FROM @Table1 AS t1
OUTER APPLY (
SELECT TOP(1) OrigPolID
FROM (
SELECT t2.OrigPolID
FROM @Table2 AS t2
WHERE t1.ItemID IN (t2.ItemID, t2.OrigPolID)

UNION ALL

SELECT t3.OrigPolID
FROM @Table3 AS t3
WHERE t1.ItemID IN (t3.ItemID, t3.OrigPolID)
) AS d
) AS f(OrigPolID)
OUTER APPLY (
SELECT TOP(1) ItemID
FROM (
SELECT t2.ItemID,
t2.EffDt
FROM @Table2 AS t2
WHERE t2.OrigPolID = f.OrigPolID

UNION ALL

SELECT t3.ItemID,
t3.EffDt
FROM @Table3 AS t3
WHERE t3.OrigPolID = f.OrigPolID
) AS d
ORDER BY EffDt DESC
) AS g(ItemID)[/code]


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Celko
Starting Member

23 Posts

Posted - 2010-07-30 : 18:10:01
quote:
Originally posted by midavis

I have a function that returns whether an item is the latest one in the system. I will use CheckItemId as the item id we want to check on.

Table1 stores all items.
Table2 stores a relationship between the revision of an item and when it was done. This table will include the ItemId for the revision that ties back to Table1 as well as the original ItemId (As OrigItemId) this was based from.
Table3 is similar to Table3 but only occurs for certain types of items and stores different types of data.

The logic is this.
1. Does this ItemId exist in Table2 as either the ItemId or OrigItemId? If not look in Table3
2. Does this ItemId exist in Table3 as either the ItemId or OrigItemId? If not then the ItemId is the original Item
3. If the item exists in either Table2 or Table3 we need to grab all occurrences from each table and order them by date desc. We will then grab the latest ItemId from this result. If this ItemId matches the CheckItemId then this is the latest item.

Below is the solution I have currently, but would like to try and use a set based approach. Currently when I do a query I have to call this function for each ItemId to grab the latest Item and a query for 70,000+ rows takes 23 seconds. No more query optimizations need to be done.

As a side note, it would be great to be able to put this in a view and index it because a lot of reports will use this information. Also, I know I can update this query a bit to get rid of the table variable, but I want to focus on how to get this as a set based solution.

Thanks in advance.



DECLARE @CheckItemId int

-- Declare the return variable here
DECLARE @Result bit
DECLARE @OrigItemId int
DECLARE @ItemId int

SET @Result = 0 -- Set the default return value to 0

-- try to find the orig ItemId from Table2
Select @OrigItemId = OrigItemId FROM Table2 Where ItemId = @CheckItemId or OrigItemId = @CheckItemId

-- this does not exist in Table2 so try Table3
if @OrigItemId is null
Select @OrigItemId = OrigItemId FROM Table3 Where ItemId = @CheckItemId or OrigItemId = @CheckItemId

-- this does not exist in either table so this is the original item and the current version
if @OrigItemId IS NULL
set @Result = 1
else
BEGIN

-- create a table to store the items from each table
DECLARE @tmpData TABLE (ItemId int , Dt datetime)

Insert Into @tmpData
SELECT ItemId, EffDt
FROM Table2
WHERE OrigItemId = @OrigItemId

Insert Into @tmpData
SELECT ItemId, EffDt
FROM Table3
WHERE OrigItemId = @OrigItemId

-- grab the latest item
SELECT TOP 1 @ItemId = ItemId From @tmpData order by Dt DESC

if @ItemId = @CheckItemId
set @Result = 1
END

print @Result




Author of
SQL for Smarties
Thinking in Sets
SQL Puzzles and Answers
Analytics and OLAP in SQL
SQL Programming Style
Standards in SQL
Data & Databases
Trees & Hierarchies in SQL
Go to Top of Page

Celko
Starting Member

23 Posts

Posted - 2010-07-30 : 18:26:08
>> Table1 stores all items.
Table2 stores a relationship between the revision of an item and when it was done.
Table3 is similar to Table3 [sic] but only occurs for certain types of items and stores different types of data.<<

Have you ever heard of "Attribute Splitting" and Codd's Information Principle? This means that ONE AND ONLY ONE table models a set of entities. You do not have a "MalePersonnel" and "FemalePersonnel"; you have a "Personnel" table instead.

From you narrative (No DDL!) you should have an Items history table and not separate revisions table. Your narrative is full of "look for it here, then look for it there, etc.) which is pure, evil PROCEDURAL control flow logic when RDBMS is based on declarative programming. See the splitting? Hear Dr. Codd screaming at you?





Author of
SQL for Smarties
Thinking in Sets
SQL Puzzles and Answers
Analytics and OLAP in SQL
SQL Programming Style
Standards in SQL
Data & Databases
Trees & Hierarchies in SQL
Go to Top of Page

midavis
Starting Member

23 Posts

Posted - 2010-08-05 : 14:19:25
I am still checking if this matches every ItemId I have, but I do know that if I run this against every item id using my function it takes 22 seconds. If I do it with this query it takes 3 minutes.

If I look at running the function for just one ItemId i get the statistics below.

Table 'Table2'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table3'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If I run it with the query above for all records (74300 rows) I get the below results

Table 'Table3'. Scan count 1, logical reads 3696112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table2'. Scan count 64956, logical reads 4881419, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Table'. Scan count 1, logical reads 1670, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

There are only 4971 rows in Table3 and 4058 in Table2.
Go to Top of Page
   

- Advertisement -