Grouping Data and Assigning an ID

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

Grouping Data and Assigning an ID

Author

Topic

cstokes91
Yak Posting Veteran

72 Posts

Posted - 2013-04-08 : 17:30:47

Hey guys,

I am currently trying to group some data and assign it an ID... I am basically trying to group the data then add use the column called JRNID as a new ID to assign that shows that all the people with the same fname, lname, dob, sex, and race are assumed to be the same person. I am able to group the data, I just don't know how to create a new ID and update the table once it is grouped.

Here is the table structure:

MNIIDCreate


MNINo varchar(15)
,Booking# varchar(15)
,BookDate datetime
,JRN varchar(10)
,DOB datetime
,race varchar(1)
,sex varchar(1)
,fname varchar(15)
,lname varchar(20)
,BookingNo varchar(15)
,JRNID int

Here is my code for the grouping


SELECT FNAME, SEX, lname, dob, race, COUNT(*) as recordCount
FROM PARKDATA.DBO.MNIIDCREATE
group by lname, FNAME, SEX, dob, race 
order by recordCount desc

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-04-08 : 17:49:54

You can use row_Number function and a cte to do this:

;with cte as
(
	select
		ROW_NUMBER() over (PARTITION by FNAME, SEX, lname, dob, race
			order by FNAME, SEX, lname, dob, race) as Grp,
		JRNID
	from
		PARKDATA.DBO.MNIIDCREATE
)
update cte set JRNID = Grp;

cstokes91
Yak Posting Veteran

72 Posts

Posted - 2013-04-08 : 18:09:51

Hey James,

I looked into that but the problem I am still running into is that it is not assigning a unique ID to each person... I probably wasn't as specific as I could have been in my OP.

I want all the records grouped that have the same fname, lname, sex, dob, and race then given a unique ID that no other record has other than if it has an exact match to all of those criteria.

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-04-08 : 18:16:01

quote:
Originally posted by cstokes91

Hey James,

I looked into that but the problem I am still running into is that it is not assigning a unique ID to each person... I probably wasn't as specific as I could have been in my OP.

I want all the records grouped that have the same fname, lname, sex, dob, and race then given a unique ID that no other record has other than if it has an exact match to all of those criteria.

That is what the query I posted is supposed to do. See the example below. If it is not doing what you want it to do, can you post some sample data in a consumable format like I have done?

create table #tmp(fname varchar(32), lname varchar(32), sex char(1), dob datetime, race char(1), jrnid int);
insert into #tmp values('a','b','m','19900101','c',null),
('x','b','m','19900101','c',null),
('x','b','m','19900101','c',null),
('x','z','f','19900101','a',null),
('x','z','f','19900101','a',null);

select * from #tmp;

;with cte as
(
	select
		ROW_NUMBER() over (PARTITION by FNAME, SEX, lname, dob, race
			order by FNAME, SEX, lname, dob, race) as Grp,
		JRNID
	from
		#tmp
)
update cte set JRNID = Grp;


select * from #tmp;

drop table #tmp;

cstokes91
Yak Posting Veteran

72 Posts

Posted - 2013-04-09 : 09:57:08

On the sample data that you provided it has 3 different records with the JRNID of 1. On this example I would want the record with the first name 'a' to be 1, the first name 'x' with last name 'b' to be 2 and the first name 'x' last name 'z' to be 3.


fname	lname	sex	dob	               race   jrnid
a	b	m	1990-01-01 00:00:00.000	c	1
x	b	m	1990-01-01 00:00:00.000	c	1
x	b	m	1990-01-01 00:00:00.000	c	2
x	z	f	1990-01-01 00:00:00.000	a	1
x	z	f	1990-01-01 00:00:00.000	a	2

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-04-09 : 10:10:26

Sorry I misunderstood what you were trying to do. Can you try the following:

create table #tmp(fname varchar(32), lname varchar(32), sex char(1), dob datetime, race char(1), jrnid int);
insert into #tmp values('a','b','m','19900101','c',null),
('x','b','m','19900101','c',null),
('x','b','m','19900101','c',null),
('x','z','f','19900101','a',null),
('x','z','f','19900101','a',null);

select * from #tmp;
;with cte as
(
	select
		ROW_NUMBER() over (order by FNAME, lname, SEX, dob, race) -
		ROW_NUMBER() over (PARTITION by FNAME, lname, SEX, dob, race
			order by FNAME, SEX, lname, dob, race) as Grp,
		JRNID
	from
		#tmp
),
cte2 as
( select *, DENSE_RANK() Over( order by Grp) as Grp2 from cte)
update cte2 set JRNID = Grp2;

select * from #tmp;
drop table #tmp;

cstokes91
Yak Posting Veteran

72 Posts

Posted - 2013-04-09 : 10:36:31

That is perfect! It kind of blows my mind a bit though... Do you mind explaining the logic behind it? I know what the row_number can be used for and I know how the dense_rank can be used but how did you link it all together?

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-04-09 : 10:57:04

Glad it worked out - I was probably making it more complicated than necessary because I was trying to adapt my original incorrect query to meet your requirements. The following may be simpler and more efficient if you don't have any nulll values in the five fields. Both queries are doing essentially the same thing, i.e., first find a group number and then update the table with that group number

create table #tmp(fname varchar(32), lname varchar(32), sex char(1), dob datetime, race char(1), jrnid int);
insert into #tmp values('a','b','m','19900101','c',null),
('x','b','m','19900101','c',null),
('x','b','m','19900101','c',null),
('x','z','f','19900101','a',null),
('x','z','f','19900101','a',null);

select * from #tmp;

;with cte as
(
	select FNAME, lname, SEX, dob, race,
	DENSE_RANK() over ( order by FNAME, lname, SEX, dob, race) as Grp
	from #tmp
	group by FNAME, lname, SEX, dob, race
)
update t set
	jrnid = Grp
from
	#tmp t inner join cte c on 
		c.fname = t.fname
		and c.lname = t.lname
		and c.sex = t.sex
		and c.dob = t.dob
		and c.race = t.race;
select * from #tmp;

drop table #tmp;

cstokes91
Yak Posting Veteran

72 Posts

Posted - 2013-04-09 : 16:50:59

Unfortunately, the data has a lot of blanks, nulls and other errors in it so the first code works better for what I am looking for.

Thanks for the help as always though... I'll just stick to the first code... I have more or less of an understanding of how it works.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources