Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
ken218
Starting Member
4 Posts |
Posted - 2007-08-27 : 08:04:24
|
Hi all, Recently saw some posts about how bad it is using subqueries (nested queries) and things like all the queries can be written with JOIN instead. However, for a query like the followings, I cannot figure out how to do it properly. The data looks like the following, it contains 3 columns, id, which is the primary key of the table (unique), group, as which group this entry belongs to, and the article type, where 0 is not an article.id grp article1 1 02 1 03 1 04 1 35 1 26 1 07 2 08 2 39 2 110 2 011 2 012 2 013 3 514 3 215 3 016 3 017 3 118 3 1What the query is trying to achieve is for each group, get the latest (maximum) id, and the article type (bear in mind "0" is not one of the allowed type). so for data above, the result will look something like this...id grp article5 1 29 2 118 3 1The following is the query I come up with, the problem is, I don't know how to get the same result without using subquery. I don't even see how it is possible. Guys, please share some light with me, and educate me on this.select a.id, a.grp, b.articlefrom (SELECT max(id) as id, grpfrom list awhere article != 0group by grp) ajoin list bon a.id = b.idthanks in advance!!! |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-08-27 : 08:14:37
|
in 2000 this is perfectly ok.and who told you that subqueries are bad??_______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp |
 |
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2007-08-27 : 08:16:34
|
is there a sub-query in there ? I only see derived table KH[spoiler]Time is always against us[/spoiler] |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-08-27 : 08:27:18
|
acctually it is a subquery._______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp |
 |
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2007-08-27 : 08:38:29
|
quote: Originally posted by spirit1 acctually it is a subquery._______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp
BOL defines Subquery as "A subquery is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery"http://msdn2.microsoft.com/en-us/library/ms189575.aspx KH[spoiler]Time is always against us[/spoiler] |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-08-27 : 09:03:12
|
so a correlated subquery is just a version of a subquery.i always distinguished the two as:subquery can be run standalone, correlated can't becaues it has a binding condition to an outer table in its where clause._______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-08-27 : 09:30:31
|
Lab test-- Ken (41% of batch)SELECT b.ID, b.Grp, b.ArticleFROM ( SELECT MAX(ID) AS ID, Grp FROM #Sample WHERE Article <> 0 GROUP BY Grp ) AS aINNER JOIN #Sample AS b ON a.ID = b.ID-- Barbie (42% of batch)SELECT b.ID, b.Grp, b.ArticleFROM #Sample AS bWHERE b.ID = (SELECT MAX(a.ID) FROM #Sample AS a WHERE a.Article <> 0 AND a.Grp = b.Grp)-- Peso (17% of batch)SELECT ID, Grp, ArticleFROM ( SELECT ID, Grp, Article, ROW_NUMBER() OVER (PARTITION BY Grp ORDER BY ID DESC) AS RecID FROM #Sample WHERE Article > 0 ) AS dWHERE RecID = 1 Execution plansSELECT b.ID, b.Grp, b.Article FROM ( SELECT MAX(ID) AS ID, Grp FROM #Sample WHERE Article <> 0 GROUP BY Grp ) AS a INNER JOIN #Sample AS b ON a.ID = b.ID -- Barbie |--Hash Match(Inner Join, HASH:([Expr1004])=([b].[ID]), RESIDUAL:([tempdb].[dbo].[#Sample].[ID] as [b].[ID]=[Expr1004])) |--Stream Aggregate(GROUP BY:([tempdb].[dbo].[#Sample].[Grp]) DEFINE:([Expr1004]=MAX([tempdb].[dbo].[#Sample].[ID]))) | |--Sort(ORDER BY:([tempdb].[dbo].[#Sample].[Grp] ASC)) | |--Table Scan(OBJECT:([tempdb].[dbo].[#Sample]), WHERE:([tempdb].[dbo].[#Sample].[Article]<>(0))) |--Table Scan(OBJECT:([tempdb].[dbo].[#Sample] AS [b]))SELECT b.ID, b.Grp, b.Article FROM #Sample AS b WHERE b.ID = (SELECT MAX(a.ID) FROM #Sample AS a WHERE a.Article <> 0 AND a.Grp = b.Grp) -- Peso |--Hash Match(Inner Join, HASH:([Expr1006], [a].[Grp])=([b].[ID], [b].[Grp]), RESIDUAL:([Expr1006]=[tempdb].[dbo].[#Sample].[ID] as [b].[ID] AND [tempdb].[dbo].[#Sample].[Grp] as [a].[Grp]=[tempdb].[dbo].[#Sample].[Grp] as [b].[Grp])) |--Stream Aggregate(GROUP BY:([a].[Grp]) DEFINE:([Expr1006]=MAX([tempdb].[dbo].[#Sample].[ID] as [a].[ID]))) | |--Sort(ORDER BY:([a].[Grp] ASC)) | |--Table Scan(OBJECT:([tempdb].[dbo].[#Sample] AS [a]), WHERE:([tempdb].[dbo].[#Sample].[Article] as [a].[Article]<>(0))) |--Table Scan(OBJECT:([tempdb].[dbo].[#Sample] AS [b]))SELECT ID, Grp, Article FROM ( SELECT ID, Grp, Article, ROW_NUMBER() OVER (PARTITION BY Grp ORDER BY ID DESC) AS RecID FROM #Sample WHERE Article > 0 ) AS d WHERE RecID = 1 |--Filter(WHERE:([Expr1004]=(1))) |--Sequence Project(DEFINE:([Expr1004]=row_number)) |--Compute Scalar(DEFINE:([Expr1006]=(1))) |--Segment |--Sort(ORDER BY:([tempdb].[dbo].[#Sample].[Grp] ASC, [tempdb].[dbo].[#Sample].[ID] DESC)) |--Table Scan(OBJECT:([tempdb].[dbo].[#Sample]), WHERE:([tempdb].[dbo].[#Sample].[Article]>(0))) Profiler CPU Duration Reads WritesKen 0 1 6 0Barbie 0 1 6 0Peso 0 0 3 0 E 12°55'05.25"N 56°04'39.16" |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-08-27 : 09:41:52
|
well i'm sorry to say that yours won't run Peter especially since we're dealing with sql server 2000 _______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp |
 |
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2007-08-27 : 09:48:29
|
So's Ken's solution is the best for SQL Server 2000By the way, who is Ken ? KH[spoiler]Time is always against us[/spoiler] |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-08-27 : 09:49:25
|
I know. I just put it there for comparison.The real piece of information is that there is no difference between Ken and Barbie queries. E 12°55'05.25"N 56°04'39.16" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-08-27 : 09:50:47
|
quote: Originally posted by khtan So's Ken's solution is the best for SQL Server 2000
No difference, I think the percentage is rounded.If you check the execution plans, there is no difference. E 12°55'05.25"N 56°04'39.16" |
 |
|
ken218
Starting Member
4 Posts |
Posted - 2007-08-27 : 12:56:50
|
I am Ken :p thanks guys for all the information. Well, if you google "subquery join" then you see a lot of posts saying subqueries are evil, and shoule be avoided. But the only examples I can find are things like--change fromselect * from table where id in (select id from another_table)--toselect table.* from table join another_table on table.id = another_table.idwhich I think some SQL implementations already do before your queries are executed, and also, are easy to get around with. However, for this exercise that I have, I couldn't find a way to do it without using nested/sub query.for the barbie query... I never tried to do something like that, it looks more complicated to me than the ken query ;) maybe I should practice that sometime :Dso, back to the performance side, what is the real performance killer in SQL? I know cursor is a big NO NO, but what else is bad?again, thanks for all the information! |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-08-27 : 13:01:47
|
- correlated queries in the select list: select id, col1, (select col2 from t2 where t1.id = t2.id) as col2 from t1- anything that runs reads through the roof.- no indexes- etc..._______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp |
 |
|
ken218
Starting Member
4 Posts |
Posted - 2007-08-27 : 13:18:29
|
I was checking the runtime for barbie query and ken query, ken seem to run a fraction of second (under 1ms) faster than barbie. I don't know if the difference will increase as the table grows bigger. However, I thought in the barbie query, for every row pulled from the table, "SELECT MAX(a.ID) FROM #Sample AS a WHERE a.Article <> 0 AND a.Grp = b.Grp" query will be executed, which means it actually ran for 18 times during the execution time. guess I was very wrong >"< |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-08-27 : 13:20:03
|
quote: Originally posted by ken218 --change fromselect * from table where id in (select id from another_table)--toselect table.* from table join another_table on table.id = another_table.id
It is not the subquery that is bad here, it is the IN part. E 12°55'05.25"N 56°04'39.16" |
 |
|
ken218
Starting Member
4 Posts |
Posted - 2007-08-27 : 13:26:39
|
I just realised the forum system doesn't do word wrap... eish... sorry about needing to scroll to the right all the time!!I'll watch it next time I post >"< |
 |
|
madhivanan
Premature Yak Congratulator
22864 Posts |
Posted - 2007-08-28 : 02:00:25
|
quote: Originally posted by spirit1 well i'm sorry to say that yours won't run Peter especially since we're dealing with sql server 2000 _______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp
It seems Peso is too much addicted to 2005 MadhivananFailing to plan is Planning to fail |
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2007-08-28 : 07:13:59
|
quote: Well, if you google "subquery join" then you see a lot of posts saying subqueries are evil, and shoule be avoided.
Avoid correlated subqueries? sometimes.Avoid derived table subqueries? Don't even bother using a database at all, then ... they are crucial to use when writing correct and efficient SQL.I talk a bit about these topics here:http://weblogs.sqlteam.com/jeffs/archive/2007/04/30/60192.aspxhttp://weblogs.sqlteam.com/jeffs/archive/2007/07/12/60254.aspx- Jeffhttp://weblogs.sqlteam.com/JeffS |
 |
|
hankswart
Starting Member
5 Posts |
Posted - 2007-08-28 : 11:18:24
|
Hi, I might be missing the point here but isn't the query as simple as:DECLARE @TEMP TABLE ([ID] INT, [GRP] INT, [ARTICLE] INT)INSERT INTO @TEMP VALUES (1, 1, 0)INSERT INTO @TEMP VALUES (2, 1, 0)INSERT INTO @TEMP VALUES (3, 1, 0)INSERT INTO @TEMP VALUES (4, 1, 3)INSERT INTO @TEMP VALUES (5, 1, 2)INSERT INTO @TEMP VALUES (6, 1, 0)INSERT INTO @TEMP VALUES (7, 2, 0)INSERT INTO @TEMP VALUES (8, 2, 3)INSERT INTO @TEMP VALUES (9, 2, 1)INSERT INTO @TEMP VALUES (10, 2, 0)INSERT INTO @TEMP VALUES (11, 2, 0)INSERT INTO @TEMP VALUES (12, 2, 0)INSERT INTO @TEMP VALUES (13, 3, 5)INSERT INTO @TEMP VALUES (14, 3, 2)INSERT INTO @TEMP VALUES (15, 3, 0)INSERT INTO @TEMP VALUES (16, 3, 0)INSERT INTO @TEMP VALUES (17, 3, 1)INSERT INTO @TEMP VALUES (18, 3, 1)SELECT MAX(ID) AS [ID], [GRP], [ARTICLE] FROM @TEMPGROUP BY [GRP], [ARTICLE]HAVING ([ARTICLE] != 0)ORDER BY [ID], [GRP], [ARTICLE] |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-08-28 : 11:20:17
|
quote: Originally posted by hankswart SELECT MAX(ID) AS [ID], [GRP], [ARTICLE] FROM @TEMPGROUP BY [GRP], [ARTICLE]HAVING ([ARTICLE] != 0)ORDER BY [ID], [GRP], [ARTICLE]
When you wrote and executed the query, you must have noticed the output?ID GRP ARTICLE4 1 35 1 28 2 39 2 113 3 514 3 218 3 1 E 12°55'05.25"N 56°04'39.16" |
 |
|
hankswart
Starting Member
5 Posts |
Posted - 2007-08-28 : 12:08:06
|
quote: Originally posted by Peso [quote]Originally posted by hankswart SELECT MAX(ID) AS [ID], [GRP], [ARTICLE] FROM @TEMPGROUP BY [GRP], [ARTICLE]HAVING ([ARTICLE] != 0)ORDER BY [ID], [GRP], [ARTICLE]
When you wrote and executed the query, you must have noticed the output?ID GRP ARTICLE4 1 35 1 28 2 39 2 113 3 514 3 218 3 1 Yes, I did and the problem with this is??? 4 is the maximum for the group 1 and article 3, 5 is the maximum for group 1 and article 2, etc... 0 is excluded...If you are implying that the retured result should be 5,1,2 and 5,1,3 what's the use of using the id column? |
 |
|
Next Page
|
|
|
|
|