Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 very basic SQL question

Author  Topic 

timsmith
Starting Member

11 Posts

Posted - 2008-07-27 : 12:41:20
Hi,

This is probably a very basic newbie question...so apologies in advance.

As part of a text mining project, I have been given an SQL database. I will be taking the data from the tables and running some text mining on it. But before I can do that, I need to remove duplicate names from the table. For example, if I have a table(T) with three columns: S.NO., FirstName & LastName, then what SQL query would return the non duplicate rows i.e, rows where both the first name and last name are not the same. (I'm not concerned about which of the duplicates is still in the result)

I tried on the sql tutorial pages, but couldn't find an appropriate example. Any tips or links would be highly appreciated.

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2008-07-27 : 13:25:33
In SQL Server 2005:

select * from
(select
[S.NO],
FirstName,
LastName,
row_number() over (partition by FirstName, LastName order by FirstName, LastName) as rownum
from yourtable) as t
where rownum = 1

Webfred

There are 10 types of people in the world: Those who understand binary, and those who don't...
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-07-27 : 13:37:44
quote:
Originally posted by timsmith

Hi,

This is probably a very basic newbie question...so apologies in advance.

As part of a text mining project, I have been given an SQL database. I will be taking the data from the tables and running some text mining on it. But before I can do that, I need to remove duplicate names from the table. For example, if I have a table(T) with three columns: S.NO., FirstName & LastName, then what SQL query would return the non duplicate rows i.e, rows where both the first name and last name are not the same. (I'm not concerned about which of the duplicates is still in the result)

I tried on the sql tutorial pages, but couldn't find an appropriate example. Any tips or links would be highly appreciated.


if sql 2000
SELECT t.*
FROM T t
INNER JOIN (SELECT FirstName,LastName,MIN([S.NO]) as MinRec
FROM T
GROUP BY FirstName,LastName)tmp
ON tmp.FirstName=t.FirstName
AND tmp.LastName=t.LastName
AND tmp.MinRec=t.[[S.NO]]
Go to Top of Page

timsmith
Starting Member

11 Posts

Posted - 2008-07-27 : 14:14:59
Thanks, that worked!!
Go to Top of Page
   

- Advertisement -