Author |
Topic  |
|
taylo
Yak Posting Veteran
USA
82 Posts |
Posted - 06/12/2001 : 14:36:08
|
Is there an SQL command which is the opposite of Distinct that will return all duplicate records?
I want to return all records from the database that have duplicate first names.
Thanks,
Rob
|
|
JustinBigelow
SQL Gigolo
USA
1157 Posts |
Posted - 06/12/2001 : 14:47:51
|
This should work
select c1, count(c1) from table1 group by c1 having count(c1) >= 2
Justin
|
 |
|
graz
Chief SQLTeam Crack Dealer
USA
4149 Posts |
Posted - 06/12/2001 : 18:12:04
|
This article (http://www.sqlteam.com/item.asp?ItemID=3331) talks about Deleting Duplicates. You can use what it says to identify the duplicates.
=============================================== Creating tomorrow's legacy systems today. One crisis at a time. |
 |
|
nr
SQLTeam MVY
United Kingdom
12543 Posts |
Posted - 06/12/2001 : 18:50:44
|
count(*) used to be more efficient than count(col) or count(1) - don't know if it makes a difference now just seems more natural so - and why waste an = sign.
select c1, count(*) from table1 group by c1 having count(*) > 1
|
 |
|
Merkin
Funky Drop Bear Fearing SQL Dude!
Australia
4970 Posts |
Posted - 06/12/2001 : 19:18:10
|
Picky picky picky 
Damian |
 |
|
nr
SQLTeam MVY
United Kingdom
12543 Posts |
Posted - 06/12/2001 : 19:27:29
|
yes, yes, yes (it's my job ).
Must be less work out there - I dont seem to be getting job candidates who don't know that deleting dup recs isn't trivial any more.
Well you just delete them was the best.
And they seem to know what a clustered index is now too (although not when to use them).
|
 |
|
JustinBigelow
SQL Gigolo
USA
1157 Posts |
Posted - 06/12/2001 : 23:04:40
|
quote:
count(*) used to be more efficient than count(col) or count(1) - don't know if it makes a difference now just seems more natural so -
Theres no pleasing some people is there? hehe. But it did get me wondering, count(*) and count(1) on the orders table of the northwind sample database produced identical I/O and CPU costs. But it is a pretty small table relatively speaking, maybe somebody with access to a bigger db can find out.
quote:
and why waste an = sign.
Good point, I just got my newsletter from the "= sign conservation group" and this years crop of = signs is looking pretty slim. I'll try and be more frugal with my ='s in the future. Just joking with you man 
Justin
|
 |
|
nr
SQLTeam MVY
United Kingdom
12543 Posts |
Posted - 06/13/2001 : 08:35:25
|
The reason for count(*) being faster than count(col) was that the server didn't have to resolve the column and also could use any indexes it wanted to without worrying about data pages.
I suspect that count(1) wouldn't be much different but just doesn't look as natural.
If you overuse = it gets tired, slows down and sometimes doesn't even turn up for the operation - then you have to turn the server off to let it rest for a while. A lot of people don't know about that .
|
 |
|
dtong004
Constraint Violating Yak Guru
USA
281 Posts |
Posted - 06/13/2001 : 15:12:59
|
I have different impression that yours.
count(*) need to look at all the columns and count(1) or count(col) just look one column.
In Oracle, count(1) is recommend to use than count(*).
In MS-sql, Microsoft says count(*) and count(1) is same.
Daniel
|
 |
|
JustinBigelow
SQL Gigolo
USA
1157 Posts |
Posted - 06/13/2001 : 18:10:31
|
Did a little research on this. First of all I couldn't find anything comparing the merits of the two methods in BOL. Second if you do a count(ordinal) then sql server will implicitly convert it to a count(*), at least if I read the showplan results correctly. If you do a count(<column name>) it will convert to count(*) only if there is an index on the field; if there is no index then it does a count(<column name>) as expected. The CPU cost for all three appears to be identical; however the I/O cost for a named column was significantly higher than either count(*) or count(ordinal) -those were the same naturally since count(ordinal) is implicitly converted to count(*).
Justin
|
 |
|
robvolk
Most Valuable Yak
USA
15732 Posts |
Posted - 06/13/2001 : 18:23:13
|
All I wanted to know was, if I return duplicate records, can I get a refund or am I stuck with store credit?
We need to have a contest for the best reinterpretation of a question, and the best answer to that reinterpretation 
|
 |
|
nr
SQLTeam MVY
United Kingdom
12543 Posts |
Posted - 06/13/2001 : 19:14:29
|
I want to return all records from the database that have duplicate first names. Not very interesting.
If you do a count(<column name>) it will convert to count(*) only if there is an index on the field; if there is no index then it does a count(<column name>) as expected. Interesting but expected
The CPU cost for all three appears to be identical Bit weird (probably due to timing granularity)
However the I/O cost for a named column was significantly higher than either count(*) or count(ordinal). Expected but good to have confirmed.
count(*) or count(ordinal) -those were the same naturally since count(ordinal) is implicitly converted to count(*). Didn't know that - it's why I'm here, to get other people to do my work for me - thanks JustinBigelow.
contest for the best reinterpretation of a question? Surely this was the question.
How many reads for this post?
Edited by - nr on 06/13/2001 19:18:37 |
 |
|
|
Topic  |
|