Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
rkhanna
Starting Member
3 Posts |
Posted - 2007-11-01 : 00:09:46
|
I have the following table structure:tableA (~85,000 rows) primary key = [colA,colB]tableB (~850,000 rows) primary key = [colA,colC]tableC (~120,000,000 rows) primary key = [colA,colB,colC]IMPORTANT: colC is DATETIMEFor a SET of rows in tableA (about 50,000) I need to pull the MOST RECENT (given a date) corresponding values from tables B and C. The only way I can think of doing this is the following:SELECT tableA.colA,(SELECT TOP 1 colX FROM tableB WHERE colA = tableA.colA AND colC <= @INPUTDATE ORDER BY colC desc),(SELECT TOP 1 colY FROM tableB WHERE colA = tableA.colA AND colC <= @INPUTDATE ORDER BY colC desc),... --some more columns from tableB,(SELECT TOP 1 colX FROM tableC WHERE colA = tableA.colA AND colB = tableA.colB AND colC <= @INPUTDATE ORDER BY colC desc),(SELECT TOP 1 colY FROM tableC WHERE colA = tableA.colA AND colB = tableA.colB AND colC <= @INPUTDATE ORDER BY colC desc),... --some more columns from tableCFROM tableAWHERE tableA.colX = 'some criteria' Is there any other way anyone can suggest? Unfortunately, because tableC is so large, the disk IO (I think) causes this query to take over an hour. (If I had monster RAM and super fast disk this wouldn't be as big an issue, but that's not an option right now )Thanks in advance! |
|
madhivanan
Premature Yak Congratulator
22864 Posts |
Posted - 2007-11-01 : 02:25:55
|
Replace top 1 subqueries with the logic like thisSELECT Max(colX) FROM tableB WHERE colA = tableA.colA AND colC <= @INPUTDATEMadhivananFailing to plan is Planning to fail |
 |
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2007-11-01 : 02:46:28
|
don't use sub-query. Change your query to using JOINS. KH[spoiler]Time is always against us[/spoiler] |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-11-01 : 06:23:57
|
[code]SELECT A.colA, B.colX, ..., C.ColZ, ...FROM ( SELECT C.colA, C.ColB, MAX(C.ColC) AS MAX_C FROM TableC AS C JOIN TableA AS A WHERE A.colX = 'some criteria' GROUP BY C.colA, C.ColB) AS X JOIN tableA AS A ON A.ColA = X.ColA JOIN tableB AS B ON B.ColA = X.ColA AND B.ColB = X.ColB JOIN tableC AS C ON C.ColA = X.ColA AND C.ColB = X.ColB AND C.ColC = X.MAX_CWHERE A.colX = 'some criteria'[/code]For this to run efficiently you will need an index on TableA for ColX, ColA (if the Primary Key on TableA is clustered then ColA is [I think!!] automatically covered, in which case you won't need to include that)Kristen |
 |
|
rkhanna
Starting Member
3 Posts |
Posted - 2007-11-03 : 17:35:30
|
That worked great, thanks Kristen! I got it down to just a couple of minutes. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-11-04 : 02:12:01
|
Over an hour to a couple of minutes? That's good news indeed! |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-11-05 : 02:50:21
|
Are you using SQL Server 2005? E 12°55'05.25"N 56°04'39.16" |
 |
|
rkhanna
Starting Member
3 Posts |
Posted - 2007-12-04 : 21:11:24
|
Yes, I am using SQL Server 2005. |
 |
|
|
|
|
|
|