# Reader Challenge #1 Solutions (Part I)

By Bill Graziano on 28 May 2001 | 3 Comments | Tags: Reader Challenges

Thanks to the 18 people that submitted 27 possible solutions to our first Reader Challenge. In this article I've broken down the first correct solution which was submitted by Michael Price. I'll detail the cleanest, fastest and most obtuse solutions in the next article.

First I'd like to thank everyone for entering. I'm glad that 18 of you found this challenging enough to take the time to submit solutions. I'm also amazed at the variety of solutions that were submitted. This article details the first correct solution I received. I know there are other solutions that are faster and simpler but this solution was first, it was correct and I liked it. I think Michael may also have submitted the fastest solution but you'll have to wait until the next article.

#### The Solution

Michael wrote in his email:

Hello,

I figured You Are Kidding (YAK!) in trying to get the results in one select statement, but here is what I came up with.

I went for the Single Select/speed challenge part, it might also win the obtuse award <g>.

Nice yak reference but no bonus points since it's not in the solution. Michael actually submitted two solutions in the same email. And he submitted BOTH of them within 1 hour and 25 minutes from when I posted the challenge.

Here is Michael's solution:

```Select Avg(Above) As Above, Count(*) As Below
From Samples, (
Select Count(*) As Above
From Samples
Where Ind_Ret > ( Select Avg(Ind_Ret) + (Stdev(Ind_Ret) * 3) From Samples )
) As AboveData
Where Ind_Ret < ( Select Avg(Ind_Ret) - (Stdev(Ind_Ret) * 3) From Samples )```

which returns this result set

```Above       Below
----------- -----------
7           4

(1 row(s) affected))```

which is the correct result. I've modified Michael's code slightly to use the AS keyword to make column naming more explicit. It doesn't change the functionality but makes it easier to explain. I liked this solution because it uses a subquery, a derived table and a "psuedo-cartesion join" in a way I've never seen before. It also gives me three topics for future articles. I'm going to break this down starting at the inside SELECT and working my way out.

#### The Subquery

The deepest SELECT statement is

`Select Avg(Ind_Ret) + (Stdev(Ind_Ret) * 3) From Samples `

If you hightlight this section of code and run it in Query Analyzer you get 582.485 back. This adds three times the standard deviation to the average and is the upper bound of the result set. Any value above this we want to count. Expanding out from this we get this piece of code

```Select Count(*) As Above
From Samples
Where Ind_Ret > ( Select Avg(Ind_Ret) + (Stdev(Ind_Ret) * 3) From Samples )```

Our first SELECT statement is now the subquery in the WHERE clause. Books Online says in Transact-SQL, a subquery can be substituted anywhere an expression can be used in SELECT, UPDATE, INSERT, and DELETE statements, except in an ORDER BY list. The result of this query is

```Above
-----------
7

(1 row(s) affected)```

#### The Derived Table

Adding a little more of the full query in and we get this statement

```Select Avg(Above) As Above
From  (
Select Count(*) As Above
From Samples
Where Ind_Ret > ( Select Avg(Ind_Ret) + (Stdev(Ind_Ret) * 3) From Samples )
) As AboveData```

Our SELECT statement from above is now a derived table. We are replacing a table name in a query with a SELECT statement that returns a "virtual" table. You can run this with either Avg(Above) As Above or Above in the outer SELECT statement. You can change the code and see that it runs either way and they return the same result set. I'll tell you why in a second.

#### The Other Half

The other half of the result comes from the remainder of the query. If I strip out the part we've already discussed it leaves us with

```Select Count(*) As Below
From Samples
Where Ind_Ret < ( Select Avg(Ind_Ret) - (Stdev(Ind_Ret) * 3) From Samples )```

and that returns this result set

```Below
-----------
4

(1 row(s) affected)```

That looks an awful lot like what I just covered so I won't go over it again.

#### The Cartesian Join

To get both values in the same result set, Michael just "stuck" the two tables (one actual table and one derived table) together in a cartesian join. This joins each row in the first table to each row in the second table. In this case there's only one row in each table so it works out perfectly. Since one field is an aggregate (COUNT), the other also has to be an aggregate (or he has to use a group by). So he takes the average of a single value (seven) which is the value itself. Pretty clever. In cases where I've needed to do this I've always had each field be a subquery (which another reader submitted).

#### Summary

This solution generates four table scans. That probably won't win any performance awards. For the record, his second solution (in the same email) reduced this to three table scans. And he reduced it to two tables scans the following morning. You'll have to wait for the next article to see if anyone did better.

Thanks for reading Michael and thanks for taking the challenge. If I had any prizes I'd gladly pass one your way. You'll have to settle for the fame and glory of submitting the first correct answer to the first SQLTeam.com Reader Challenge.

## Related Articles

Another German Yak ... with a suprise (RC #3) (16 July 2002)

Das Yak ist Deutsch (RC #3) (14 July 2002)

Reader Challenge #3: Find the Yak! (22 March 2002)

Reader Challenge #2 Solutions (29 October 2001)

Reader Challenge #2 (CLOSED) (10 October 2001)

Reader Challenge #1 Solutions (Part II) (4 June 2001)

Reader Challenge #1 (16 May 2001)

## Other Recent Forum Posts

Variables in Where Clause (20 Replies)

how to update 2 records on 1 field (20 Replies)

Reading XML in SQL (5 Replies)

Return a zero value in place of null (8 Replies)

## Subscribe to SQLTeam.com

Weekly SQL Server newsletter with articles, forum posts, and blog posts via email. Subscribers receive our white paper with performance tips for developers.