Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Random with a twist

Author  Topic 

pixelwiz
Starting Member

25 Posts

Posted - 2011-06-08 : 10:30:16
Hi, I like how you can randomize data with doing an ORDER BY NEWID().

However, I have a little problem. Let's say I have 300 people. I need to randomly group them them into sets of 3, but a set of 3 can't have people from the same state in it.

I can actually place them into sets of 3 as a separate process, but I still need to ensure that no one from the same state is in a particular set somehow.

The only way I can think of doing it right now is with a loop after I get a random result back, but there must be a better and faster way.

Thanks.

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-06-08 : 10:37:25
How random does it have to be?

For example: could person1 be from 'State List 1' and Person 2 from 'State List 2'... etc?

Corey

I Has Returned!!
Go to Top of Page

pixelwiz
Starting Member

25 Posts

Posted - 2011-06-08 : 10:40:49
I suppose it's ok, but it would be good if people in each state list were also randomly ordered.
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-06-08 : 12:32:25
Wow... this was kinda tough. Maybe some of these other gurus can come up with something better...

id = person
st = state


Declare @fgr table (
id int,
fgr int
)

Declare @t table (
id int identity(1,1),
st int,
gr varchar(100)
)

Insert Into @t
Select top 300
st = number%50+1,
gr=''
From master..spt_values
Where number >0
Order By newid()

Declare @rowCount int
set @rowCount = 1

while exists(Select * From @t Where gr = '')
Begin
while (@rowCount > 0)
Begin
Update A
Set
gr = convert(varchar,grstr%3+1)+A.gr
From @t A
Inner Join
(
Select
id,
st,
grstr = Row_Number() Over(Partition By gr, st Order By NewId()),
grstm = (count(*) Over(Partition By gr, st)/3)*3
From @t Z
Where gr not like 'f%'
and 3 < (Select count(*) From @t Where gr = Z.gr)
) B
On A.id = B.id
Where gr not like 'f%'
and B.grstr <= grstm

Set @rowCount= @@RowCount
End

if not exists(Select * From @t Where gr not like 'f%' and gr <> '')
Begin
if (9 > (Select count(*) From @t Where gr not like 'f%'))
Begin
Update A
Set gr = 1
From @t A
Where gr not like 'f%'
End
Else
Begin
Update A
Set
gr = B.gr
From @t A
Inner Join
(
Select
id,
st,
gr = convert(varchar,Row_Number() Over(Order By st, newid())%3+1)
From @t Where gr not like 'f%'
) B
On A.id = B.id
End
End

Insert Into @fgr
Select
id,
fgr = Dense_Rank() Over(Order By case when gr like 'f%' then 0 else 1 end, gr, gr2)
From
(
Select
*,
gr2 = (grstr-1)/3*3
From
(
Select
id,
st,
gr,
grstr = Row_Number() Over(Partition By gr Order By NewId()),
grstm = (count(*) Over(Partition By gr)/3)*3
From @t
where gr <> ''
) A
Where grstr <= grstm
) Z


Update A
Set
gr = isnull('f'+right(replicate('0',5)+convert(varchar,fgr),5),'')
From @t A
Left Join @fgr B
On A.id = B.id
Where A.gr not like 'f%'


--Select * From @t Order By gr
End

Select gr, count(*) From @t Group By gr Order By gr

Select
id1 = B.id,
st1 = B.st,
id2 = A.id,
st2 = A.st,
id3 = C.id,
st3 = C.st,
StateCondition = Case when A.st = B.st or A.st = C.st or B.st = C.st then 'Fail' else '' end
From @t A
Inner Join @t B
On A.gr = B.gr
and A.id > B.id
Inner Join @t C
On A.gr = C.gr
and A.id < C.id


Corey

I Has Returned!!
Go to Top of Page

pixelwiz
Starting Member

25 Posts

Posted - 2011-06-08 : 12:56:32
Wow, now that's impressive. I was really hoping there was a simple way.
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-06-08 : 13:20:50
Not quite there yet... the random sample just happened to pass.

Re-runs are failing.

Corey

I Has Returned!!
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-06-08 : 15:05:06
Ok... fixed I think.

I was overthinking it.


Declare @t table (
id int identity(1,1),
st int,
gr int
)

Insert Into @t
Select top 300
st = number%50+1,
gr=null
From master..spt_values
Where number>0 and number<=1000
Order By newid()

Declare @grpSize int,
@grpCount int

Set @grpSize = 3
Set @grpCount = (Select count(*) From @t)/@grpSize


Update A
Set
gr = B.gr
From @t A
Inner Join
(
Select
Z.id,
Z.st,
gr = Row_Number()Over(Order By Y.o, newid())%@grpCount+1
From @t Z
Inner Join
(
Select
st,
o = Row_Number()Over(Order By count(*) Desc, newid())
From @t
Group By st
) Y
On Z.st = Y.st
) B
On A.id = B.id

Select
A.gr,
id1 = B.id,
st1 = B.st,
id2 = A.id,
st2 = A.st,
id3 = C.id,
st3 = C.st,
StateCondition = Case when A.st = B.st or A.st = C.st or B.st = C.st then 'Fail' else '' end
From @t A
Inner Join @t B
On A.gr = B.gr
and A.id > B.id
Inner Join @t C
On A.gr = C.gr
and A.id < C.id
Where A.gr <> ''
Order By 8 Desc, 1


EDIT: This isn't TRULY random... though it might be random enough. I don't know.

Corey

I Has Returned!!
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-06-10 : 11:09:12
How is "the set of 3" defined? As 3 columns or 3 rows?

a) Group 1 - State 1 - State 2 - State 3
Group 2 - State 6 - State 5 - State 9

b) Group 1 - State 1
Group 1 - State 2
Group 1 - State 3
Group 2 - State 6
Group 2 - State 5
Group 2 - State 9



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-06-10 : 11:38:50
[code]-- Create sample data
DECLARE @Pool TABLE
(
PlayerID INT IDENTITY(1, 1) PRIMARY KEY,
StateCode TINYINT NOT NULL
)

-- Populate sample data
INSERT @Pool
(
StateCode
)
SELECT 1 + ABS(CHECKSUM(NEWID())) % 50 AS Area
FROM master..spt_values
WHERE type = 'p'
AND number BETWEEN 1 AND 300

-- Solution starts here
DECLARE @Factor INT = (SELECT CEILING(COUNT(*) / 3E) FROM @Pool)

;WITH ctePool
AS (
SELECT PlayerID,
StateCode,
COUNT(*) OVER (PARTITION BY StateCode) AS PlayerCount
FROM @Pool
), cteAssigned
AS (
SELECT ROW_NUMBER() OVER (ORDER BY PlayerCount DESC, StateCode, NEWID()) -1 AS SeqID,
PlayerID,
StateCode
FROM ctePool
)
SELECT 1 + SeqID % @Factor AS theGroup,
PlayerID,
StateCode
FROM cteAssigned[/code]


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-06-10 : 13:57:46
Very nice Peso... Kindof the approach I took... just implemented better

Corey

I Has Returned!!
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-06-10 : 17:40:14
This fails most often between iteration 20-25. Anyone can spot why?
SET NOCOUNT ON

-- Create sample data
CREATE TABLE #Pool
(
PlayerID INT IDENTITY(1, 1) PRIMARY KEY,
StateCode TINYINT NOT NULL,
GroupNumber TINYINT
)

-- Populate sample data
INSERT #Pool
(
StateCode
)
SELECT 1 + ABS(CHECKSUM(NEWID())) % 50 AS StateCode
FROM master..spt_values
WHERE type = 'p'
AND number BETWEEN 1 AND 300

CREATE NONCLUSTERED INDEX IX_Pool ON #Pool (StateCode) INCLUDE (GroupNumber)

/*
Solution starts here
*/
-- Mimic user supplied parameter
DECLARE @GroupSize TINYINT = 3

-- Declare working variables
DECLARE @WantedGroups TINYINT,
@CurrCode TINYINT = (SELECT MIN(StateCode) FROM #Pool)

;WITH cteWantedGroups
AS (
SELECT NTILE(@GroupSize) OVER (ORDER BY PlayerID) AS SeqID
FROM #Pool
)
SELECT TOP(1) @WantedGroups = COUNT(*)
FROM cteWantedGroups
GROUP BY SeqID
ORDER BY COUNT(*) DESC

CREATE TABLE #Hat
(
SeqID INT IDENTITY(1, 1) PRIMARY KEY,
Number TINYINT NOT NULL
)

WHILE EXISTS (SELECT * FROM #Pool WHERE GroupNumber IS NULL)
BEGIN
TRUNCATE TABLE #Hat

INSERT #Hat
(
Number
)
SELECT Number
FROM (
SELECT Number
FROM master..spt_values
WHERE Type = 'P'
AND Number BETWEEN 1 AND @WantedGroups

EXCEPT

SELECT GroupNumber
FROM #Pool
WHERE GroupNumber IS NOT NULL
GROUP BY GroupNumber
HAVING COUNT(*) = @GroupSize
) AS d
ORDER BY NEWID()

UPDATE p
SET p.GroupNumber = h.Number
FROM (
SELECT GroupNumber,
ROW_NUMBER() OVER (ORDER BY PlayerID) AS SeqID
FROM #Pool
WHERE StateCode = @CurrCode
AND GroupNumber IS NULL
) AS p
INNER JOIN #Hat AS h ON h.SeqID = p.SeqID

SELECT @CurrCode = MIN(StateCode)
FROM #Pool
WHERE GroupNumber IS NULL
END

SELECT *
FROM #Pool
ORDER BY GroupNumber

DROP TABLE #Pool,
#Hat



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2011-06-10 : 19:43:53
not sure but maybe this?


SELECT DISTINCT 1 + ABS(CHECKSUM(NEWID())) % 50 AS StateCode
FROM master..spt_values
WHERE type = 'p'
AND number BETWEEN 1 AND 300

CREATE UNIQUE NONCLUSTERED INDEX IX_Pool ON #Pool (StateCode) INCLUDE (GroupNumber)


If you don't have the passion to help people, you have no passion
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-06-11 : 00:27:20
No. I want duplicate StateCode values to test the uniqueness in each loop.



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2011-06-13 : 23:16:39
Peso

Is the iteration happening in

WHILE EXISTS (SELECT * FROM #Pool WHERE GroupNumber IS NULL)

I assume?

If you don't have the passion to help people, you have no passion
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-06-14 : 00:07:50
Yes, as long as there are unassigned rows, keep iterating.


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page
   

- Advertisement -