ROW_NUMBER vs GROUP BY - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

ROW_NUMBER vs GROUP BY

Author

Topic

johnconstraint
Starting Member

23 Posts

Posted - 2013-01-16 : 13:50:25

I have a table with sample data shown below. I want to select only one row from the first two rows having identical Office_ID, similarly one row from 3rd and 4th rows. It doesn't matter which one is selected, all I need is just one row. All the five columns in the table constitute the primary key.


Date         ID           Office_ID          Office_ID2        Source
---------------------------------------------------------------------
20120930     12           123456789          7                 abd
20120930     12           123456789          7                 efg
20120930     12           901123456          7                 hik
20120930     12           901123456          7                 lmn

I was able to get the desired result using CTE as shown below. But the performance seems to be slow.


; WITH cte_tble AS (
SELECT Date,
ID,
Office_ID,
Office_ID2,
Source,
ROW_NUMBER () OVER (PARTITION BY Date, ID, Office_ID ORDER BY Source) AS Seq
FROM Table
)
select * from cte_tble 
where Seq =1 
and date_id = 20120930

So I am wondering if there is a better way of doing this, may be using Group by? Any help is appreciated.
I want to join the result set from the above selection to another table to get the final result.

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-01-16 : 14:04:34

Here are my thoughts - but it is done with no data or testing to back them up, so.....

One possibility is to use an aggregate function as shown below. But, this has the problem that Office_ID2 may be picked from one row and Source might be picked from another row. If that is a problem, this would not work

SELECT Date,
	ID,
	Office_ID,
	MAX(Office_ID2) AS Office_ID2,
	MAX(Source) AS Source
FROM
	Tbl
GROUP BY
	Date,
	ID,
	Office_ID

Another possiblity which may speed up the subsequent queries that use this result set is to create a temp table as shown below and insert the data into that table (using your approach, or the one above) and then join with that temp table.


CREATE TABLE #data(Date DATETIME,
	ID INT NOT NULL,
	Office_ID INT NOT NULL,
	Office_ID2 INT NOT NULL ,
	Source VARCHAR(32)
	PRIMARY KEY CLUSTERED (Id, Office_ID, Office_ID2)
);

johnconstraint
Starting Member

23 Posts

Posted - 2013-01-16 : 15:09:28

Thanks so much James K! I was able to make use of the first query that you provided.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources