Find Nth maximum value in SQL Server

By Guest Authors on 11 April 2004 | 14 Comments | Tags: SELECT


This aritlce is written by Hariharan Velayuthan. He writes "There are several methods to find out the Nth maximum/minimum value using SQL. This article discusses on such method to find Nth maximum value from a desired table. This article is aimed at users who are in the beginner or intermediate level in SQL Server."

[Note: This article assumes that the reader is familiar with the T-SQL, joins and queries]

I have taken utmost effort to make this article easy to understand, but incase you are not clear with the concept, please raise up your concern and I’ll be more than happy to attend your doubts. All the examples discussed in this article uses Employee table. If you do not have this table, please use the following script to create it.

Use Pubs
Go

Create table Employee
(
Eid int,
Name varchar(10),
Salary money
)
Go

Insert into Employee values (1,'harry',3500)
Insert into Employee values (2,'jack',2500)
Insert into Employee values (3,'john',2500)
Insert into Employee values (4,'xavier',5500) 
Insert into Employee values (5,'steven',7500) 
Insert into Employee values (6,'susana',2400) 
Go

A simple query that can find the employee with the maximum salary, would be:

Select * from Employee where salary = (Select max(Salary) from Employee)

How does this query work?

The SQL Engine evaluates the inner most query and then moves to the next level (outer query). So, in the above example inner query i.e. Select max(Salary) from Employee is evaluated first. This query will return a value of 7500 (based on the sample data shown as above). This value is substituted in the outer query and it is evaluated as:

Select * from Employee where salary = (7500)

Returns:

Eid	Name	Salary
5	steven	7500

If the same syntax is applied to find out the 2nd or 3rd or 4th level of salary, the query would become bit complex to understand. See the example below:

Select * from Employee where salary = 
	 (Select max(Salary) from Employee where salary 
	< (Select max(Salary) from Employee where 
		Salary < (Select max(Salary) from Employee where 
			   Salary <…………………………………………… N

The above query would go on and on, depending on the level of salary that is to be determined. As mentioned earlier, the SQL Engine evaluates the inner most query first and moves the next outer level. One wouldn’t want to write such a big query just to find out this simple information.

The same result can be achieved with a simple syntax and easily understandable logic, by using a CORRELATED SUBQUERY. This article doesn’t explain about correlated sub-query as it is out of scope of this article. (You may want to take a quick look on CORRELATED SUBQUERY.) As a "Rule of Thumb" keep these points in mind, when you use a correlated sub-query

  1. Correlated sub-query is a performance overhead to the database server and so, you have to use it only if it is required
  2. Avoid using Correlated subquery on large tables, as the inner query is evaluated for each row of the outer query

Having said that, let’s look at the query that captures the Nth maximum value:

Select * From Employee E1 Where
    (N-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
		E2.Salary > E1.Salary)

(Where N is the level of Salary to be determined)

In the above example, the inner query uses a value of the outer query in its filter condition meaning; the inner query cannot be evaluated before evaluating the outer query. So each row in the outer query is evaluated first and the inner query is run for that row. Let’s look into the background process of this query, by substituting a value for N i.e. 4,(Idea is to find the 4th maximum salary):

Select * From Employee E1 Where
    (4-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
		E2.Salary > E1.Salary)

Since the outer query’s value is referred in the inner query, the operation is done row-by-row. Based on the sample data as shown above, the process starts with the following record:

	Employee E1
----------------------------------
Eid	Name	Salary
1	harry	3500

The salary of this record is substituted in the inner query and evaluated as:

Select Count(Distinct(E2.Salary)) From Employee E2 
	Where E2.Salary > 3500

Above query returns 2 (as there are only 2 salaries greater than 3500). This value is substituted in the outer query and will be evaluated as:

Select * From Employee E1 Where (4-1) = (2)

The "where" condition evaluates to FALSE and so, this record is NOT fetched in the result.

Next the SQL Engine processes the 2nd record which is:

	Employee E1
----------------------------------
Eid	Name	Salary
2	jack	2500

Now the inner query is evaluated as:

Select Count(Distinct(E2.Salary)) From Employee E2 
	Where E2.Salary > 2500

This query returns a value of 3 (as there are 3 salaries greater than 2500). The value is substituted in the outer query and evaluated as:

Select * From Employee E1 Where (4-1) = (3)

The "where" condition evaluates to TRUE and so, this record IS fetched in the result. This operation continues for all the remaining records. Finally the result shows these 2 records:

Eid	Name	Salary
2	jack	2500
3	john 	2500

The above query works in the same manner in Oracle and Sybase as well. Applying the same logic, to find out the first maximum salary the query would be:

Select * From Employee E1 Where
    (1-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
		E2.Salary > E1.Salary)

If you are able to understand this functionality, you can workout various other queries in the same manner. The bottom line is, the query should be efficient and NOT resource hungry.

Conclusion

This example is the simplest representation of Correlated sub-query. In the real-time database manipulation, correlated sub-queries will much more extensive. I would appreciate if you can share your comments and suggestions on this article. You can email me at harivhn@hotmail.com

Discuss this article: 14 Comments so far. Print this Article. This page has been read 122,923 times.

If you like this article you can sign up for our newsletter. We send it out each week that we post a new article. There's an opt-out link at the bottom of each newsletter so it's easy to unsubscribe at any time.

Email Address:

Email ThisSubscribe to this feedKick itSave to del.icio.usView blog reactions

Related Articles

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

SQL Server 2005: Using OVER() with Aggregate Functions (21 May 2007)

Server Side Paging using SQL Server 2005 (4 January 2007)

Using XQuery, New Large DataTypes, and More (9 May 2006)

Counting Parents and Children with Count Distinct (10 January 2006)

Other Recent Forum Posts

Internal Messaging System Structure (1 Reply)

Convert text to date (3 Replies)

for every value in a table.. (3 Replies)

GROUP BY Question (1 Reply)

Default settings (4 Replies)

Customized SqlFool script - Index Reorg/Rebuild (1 Reply)

SSIS Package Hangs (2 Replies)

Rollback information without KILL (3 Replies)

Subscribe to SQLTeam.com

Weekly SQL Server newsletter with articles, forum posts, and blog posts via email:

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

- Sponsor's Message -

SQLShare.com Videos

Writing to the Registry from TSQL Using XP_Instance_RegWrite

Reading from the registry to figure out a configuration issue or two is common, but writing to the registry isn't something you'll need to do very often. Still, it's a useful trick to know and we've put it to work showing you how to alter the default folder used for database backups.

Using XP_Instance_RegRead To Get Default Database Path

SQL saves more than a few things to the registry and we can leverage that by reading those settings from TSQL using xp_instance_regread. Easy to use and only one minor quirk to learn, and you'll reading from the registry. Look for the follow up video that demonstrates how to write changes to the registry too.

How to Change the Default Path for New Databases

Typing CREATE DATABASE MyDB gets the job done, but it puts the files in whatever location has been set as the default - which means you should make sure the default is set to a good place, or specify them as part of the CREATE. We'll show you how to adjust the setting (and we've got a video coming up that shows you how to change it from code too!).

Overview of Data Driven Subscriptions in Reporting Services

Subscriptions are a powerful feature of Reporting Services because they allow users to decide which reports to receive. But what do you do when the boss wants a report sent to a group of users, perhaps based on some additional criteria that frequently changes? That's where data driven subscriptions make sense and in our lesson today Devin does a nice overview of the options and requirements.

Using DatePart and DateName

Ever want to get the month out of a date as a number, or as literal text? Many people will do it by parsing the date as a string, but we've got some built in functions that will do it cleanly and consistently.