Fetching duplicate rows - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Fetching duplicate rows

Author

Topic

mrm23
Posting Yak Master

198 Posts

Posted - 2008-11-27 : 01:59:16

Hi,
I have a resource_allocation table which has both employee and project list.
I want to list only those employees who are allocated to more than one project.
here is the query i wrote:

select e.emp_seq_no,e.emp_name
from resource_allocation ra
inner join employee e
on e.emp_seq_no = ra.emp_seq_no
inner join project p
on p.prj_seq_no = ra.prj_seq_no
group by e.emp_seq_no having count(ra.prj_seq_no) > 1

But this gives list of all the employees. how to filter for only duplicate entries?

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2008-11-27 : 02:13:24

The query you gave won't even run because emp_name isn't listed in the GROUP BY, also there is no need to join the project table if you're only counting employee allocation to projects.

So this will do it

SELECT employee.emp_seq_no, employee.emp_name
FROM resource_allocation 
INNER JOIN employee 
    ON employee.emp_seq_no = resource_allocation.emp_seq_no
GROUP BY employee.emp_seq_no, employee.emp_name
HAVING count(*) > 1

lionofdezert
Aged Yak Warrior

885 Posts

Posted - 2008-11-27 : 02:18:08

Is it a good practice to use count(*), as per performance point of view ???

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-27 : 02:36:50

If COUNT is needed due to business rules, you have to use it.

E 12°55'05.63"
N 56°04'39.26"

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2008-11-27 : 03:35:08

COUNT(*) is better for performance than using COUNT(somecolumn) if you want to count all rows because COUNT(*) lets the query optimizer use whatever plan will be the fastest, whereas if you specify a column then it has to use that column and the plan may not be optimal. Also when you use a column name instead of * you run the risk of not counting all rows if the column you specify has NULLs in it.

Of course if you specifically only want to count the non-NULL values in a specific column then you must use the column name.

Note that using COUNT(*) is very different to using SELECT *, which you must NOT do because you return all columns and probably return some that you do not need.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-11-27 : 04:38:02

you can use window functions of 2005 also for this

select emp_seq_no,emp_name
from
(
select e.emp_seq_no,e.emp_name,
count(ra.prj_seq_no) over (partition by e.emp_seq_no) as projcount
from resource_allocation ra 
inner join employee e 
on e.emp_seq_no = ra.emp_seq_no
inner join project p
on p.prj_seq_no = ra.prj_seq_no
)t
where t.projcount>1

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources