Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Select from child table with distinct rows

Author  Topic 

Kristen
Test

22859 Posts

Posted - 2015-03-26 : 23:33:50
I have to pull some data from another application. That has a Parent table and a Child table of Event Dates.

Some of the Events have multiple dates, for a specific Event Type, that's fine ... I'd use a child-table for that of course.

But most of the dates are distinct - only [zero or] one date allowed for a particular Event Type. Personally I would have created Columns in the Parent table for those.

I will create a VIEW with multiple OUTER JOINs to the Child Table so that I can use that in any general purpose queries, but I also have to build a Record Card application - it will display a single record at a time, and will be the majority-use by users.

Is there any merit in having a (more efficient??) single-row query which doesn't have all the OUTER JOINs?

Or even warehousing the data (overnight) to flatten the Event table's distinct values into columns in the Parent table?

Here's some code in case it helps illustrate the scenario

CREATE TABLE TEMP1 -- Parent Table
(
T1_ID int,
T1_NonDate1 int,
T1_NonDate2 varchar(10)
PRIMARY KEY
(
T1_ID
)
)

CREATE TABLE TEMP2 -- Child Table of Event Types and Dates
(
T2_ID int,
T2_T1_ID int,
T2_Type varchar(4),
T2_Date datetime,
PRIMARY KEY
(
T2_ID
)
)

INSERT INTO TEMP1
SELECT 1, 1234, 'FOO' UNION ALL
SELECT 2, 5678, 'BAR'

INSERT INTO TEMP2
SELECT 1001, 1, 'AAA', '20000101' UNION ALL -- AAA is distinct
SELECT 1002, 1, 'BBB', '20000102' UNION ALL -- BBB is distinct
SELECT 1003, 1, 'CCC', '20000201' UNION ALL -- CCC can have multiple rows for each Parent record
SELECT 1004, 1, 'CCC', '20000202' UNION ALL
SELECT 1005, 1, 'CCC', '20000203' UNION ALL
--
SELECT 1006, 2, 'AAA', '20010101' UNION ALL
SELECT 1008, 2, 'CCC', '20010201' UNION ALL
SELECT 1009, 2, 'CCC', '20010202'
GO

-- Sample Query
SELECT T1_ID, T1_NonDate1, T1_NonDate2, [AAA]=A.T2_Date, [BBB]=B.T2_Date
FROM TEMP1 AS T
LEFT OUTER JOIN TEMP2 AS A
ON A.T2_T1_ID = T.T1_ID
AND A.T2_Type = 'AAA'
LEFT OUTER JOIN TEMP2 AS B
ON B.T2_T1_ID = T.T1_ID
AND B.T2_Type = 'BBB'
GO

-- Method 1 - Create View
CREATE VIEW TEMP_V
AS
SELECT [V_T1] = T1_ID,
[V_AAA]=A.T2_Date,
[V_BBB]=B.T2_Date
FROM TEMP1 AS T
LEFT OUTER JOIN TEMP2 AS A
ON A.T2_T1_ID = T.T1_ID
AND A.T2_Type = 'AAA'
LEFT OUTER JOIN TEMP2 AS B
ON B.T2_T1_ID = T.T1_ID
AND B.T2_Type = 'BBB'
GO

-- Method 1 - Retrieve Data from View
SELECT T1_ID, T1_NonDate1, T1_NonDate2, V_AAA, V_BBB
FROM TEMP1
JOIN TEMP_V
ON V_T1 = T1_ID

-- Method 2 - Select data for a single row
DECLARE @AAA datetime,
@BBB datetime,
@MyID int

SELECT @MyID = 1 -- The record I want - e.g. SProc parameter / Value provided by User

-- Retrive data from associated table
SELECT @AAA = CASE WHEN T2_Type = 'AAA' THEN T2_Date ELSE @AAA END,
@BBB = CASE WHEN T2_Type = 'BBB' THEN T2_Date ELSE @BBB END
FROM TEMP2
WHERE T2_T1_ID = @MyID
AND T2_Type IN ('AAA', 'BBB')

-- Select data from appropriate row
SELECT T1_ID, T1_NonDate1, T1_NonDate2, [AAA]=@AAA, [BBB]=@BBB
FROM TEMP1
WHERE T1_ID = @MyID


-- Drop Tables/View
DROP VIEW TEMP_V
GO
DROP TABLE TEMP1
GO
DROP TABLE TEMP2
GO

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2015-03-27 : 17:27:55
So you'll never select data for T2_Type = 'CCC'?



I would rather be the man who bought the Brooklyn Bridge than the one who sold it. -Will Rogers
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-03-28 : 11:09:29
Yeah. There are multiple occurrences of some TYPEs of data, and other TYPEs are guaranteed to only have one entry. In this instance I'm interested in getting all the one-only types and display them in a "record card".

The TYPEs that have multiple entries, like the "CCC" one in my example, will have to be displayed in a separate TAB / List / something-like-that.

Personally I would have only used this sort of "child table" for TYPEs that could have multiple rows in the child table ...
Go to Top of Page

huangchen
Starting Member

37 Posts

Posted - 2015-04-02 : 05:51:51
unspammed
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2015-04-02 : 12:07:49
I wouldn't try to selectively denormalize the parents that currently can have only one child. Personally I like the more flexible, and in my view more complete, design as it is vs. making that change.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-08 : 04:36:28
Interesting. Can you clarify why?

Just to be clear the Child Table has multiple rows, but only one row for specific "attributes", so in order to display a Record Card type page I will have to do multiple joins to the Child Table to get maybe a dozen values.

Perhaps I could PIVOT the child table to get multiple columns that I could then JOIN to? (Never had the need to try that before, no idea if it would work)

This is the sort of output I'm getting from SET STATISTICS IO ON - I can see performance being dire for this!


Table 'Worktable'. Scan count 0, logical reads 0
Table 'CHILD1a'. Scan count 0, logical reads 10
Table 'CHILD1'. Scan count 7, logical reads 21
Table 'Worktable'. Scan count 0, logical reads 0
Table 'CHILD2'. Scan count 5, logical reads 10
Table 'CHILD3'. Scan count 10, logical reads 30
Table 'PARENT_TABLE'. Scan count 0, logical reads 3
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-08 : 04:45:04
In case it helps to try to describe a real-world example consider an order. It has Date Order Placed, date Credit Check completed, various shipment dates, and Date Order Closed.

User would like to see the dates for Order Placed, Credit Checked and Order Closed on a "record card" style display, and a separate grid/tab for the (part) shipment dates
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2015-04-08 : 10:30:06
If it's possible to have multiple values, you really can't use columns in the parent table, you need a child table.

You're quite correct to be concerned about potential performance issues, however. The key to performance is to cluster the child table on the parent key first, then, optionally, the child key(s). Never cluster the child table on, say, an identity column first.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-08 : 10:41:00
The issue is that the developers have put every date relevant to the Parent Table into the Child Table. Many of these only have one instance (which is enforced by rules in the application, so can't have more than one). Hence I wanted to flatten the ones that can only occur once.

I started off thinking I could query the OLTP database itself, using a View, so users would be viewing current data. Performance is not good enough, so I'm going to have to warehouse the data.

We build warehouse tables overnight using a script that only UpSerts rows that have changed ... but if any date in the child table has changed that constitutes a change, so I'm basically having to compare the "view" against the whole of my flat-table ... and it runs like a dog. I'm in the process of trying my one-pass multi-column update, I'll report back if I find that that is faster.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-08 : 10:50:11
I've checked the source databases clustered indexes and, thankfully!, the CHILD table shares the same leading key fields of the PARENT table. Thanks for pointing that out.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-08 : 11:07:33


Some figures, so far:

Using a VIEW that JOINs the child table multiple times to find the unique date columns:

Select all records from remote database into a temporary table. 7,911 records, INSERT INTO ... SELECT elapsed time = 506 seconds!

Select just the columns available in the Parent table (into #TEMP table) = < 1 second, then UPDATE the temporary table against the Child Date table (a single pass, but columns updated multiple times) 7,911 rows processed = 1 second

Seems like that is going to be a much more efficient way of "pulling" the data. Sadly none of my existing code than mechanically generates the scripts for this includes such wizardry!
Go to Top of Page
   

- Advertisement -