Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Random Results

Author  Topic 

vwilsonjr
Starting Member

45 Posts

Posted - 2002-12-03 : 13:00:24
I have an employee database that I need to run a query against and return random employees. Any suggestions. It is for a program to pick people to go take a drug test. Government requires it to be totally random selections.

nr
SQLTeam MVY

12543 Posts

Posted - 2002-12-03 : 13:28:38
select top 10 name
from tbl
order by newid()

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

baldeep
Starting Member

18 Posts

Posted - 2002-12-03 : 14:53:43
nr,

I don't really understand GUIDs too well. Particularly I'm curious to know how these IDs are guaranteed to be unique. (Do I win a prize if I find two GUIDs that are the same? (-; )

This may have implications to this application (generating a random ordering). How is the distribution guaranteed to be uniform?

Does anyone know of any resources I can read up on to learn more about these?

--Baldeep

Go to Top of Page

Andraax
Aged Yak Warrior

790 Posts

Posted - 2002-12-03 : 15:13:40
Hi baldeep. The GUID is based on the MAC-address of one of the servers network cards, and a timestamp from the CPU which is said to be unique.

Of course, if two servers have the same network card MAC address, then they could generate a duplicate GUID. There's not a very big likelihood for that though, as duplicate MAC addresses are not very common. But they do exist.

quote:
How is the distribution guaranteed to be uniform?


Dunno exactly what you mean by uniform, but ordering by the guids generated by newid() should make it random. Random can of course still mean that all the rows are next to each other in physical storage or in (for example) alphabetical order.



Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-12-03 : 15:24:25
The algorithm used by Windows NT 4.0 (and earlier, I believe) generates sequential GUID values; Windows 2000 and higher generate random GUIDs. They are still unique under either OS however.

FWIW, there are 2^128 possible GUID values. If you generated a quadrillion of these values every second, you would generate them all in about 10,760,801,423,071,600 years.



Don't worry, you're not gonna create a duplicate.


Go to Top of Page

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2002-12-03 : 15:39:15
quote:

Does anyone know of any resources I can read up on to learn more about these?



Yep :) I'd suggest the article right here on SQLTeam at http://www.sqlteam.com/item.asp?ItemID=8747 . Good luck.

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
Go to Top of Page

baldeep
Starting Member

18 Posts

Posted - 2002-12-03 : 15:42:33
Andraax and Rob,

Thanks for the help.

There may be 2^128 total possible GUID values, but if they are based on MAC Address and CPU ID I can see how the possible number for one computer could be limited. I'm just going to give this a simplistic treatment to make my point.

MAC addresses are 6 bytes, so there are 2^48 possible MAC addresses. I believe CPUIDs are 64 bits. That's a lot of potential combinations of CPUIDs and MAC Addresses--2^112. So that means that each computer basically has a GUID-space of size 2^128/2^112 = 2^16.

2^16 seems really small. But really it doesn't matter too much if I've arrived at the correct number or not. Besides it's been a long time since I did math. More important is the idea that the GUID-space for each computer is limited. If the GUID is generated randomly from this space, then the space itself must have a uniform distribution in order for the resultant GUIDs to be random and uniformly distributed.

So, without more information, I wouldn't guarantee that "SELECT TOP * FROM Tbl ORDER BY NEWID()" will generate a random list of employees. There may be a skew in the results... no? Somebody please check my logic.

--Baldeep

Go to Top of Page

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2002-12-03 : 16:05:52
If GUIDs are ordered by MAC address, CPUID, etc I think your comments may be correct. I don't think they are. I think those values are used to generate the GUID. In my testing, they appeared random. You can use the script I wrote to test this randomness and see if it meets your needs. I didn't apply a rigorous statistical definition of randomness. You could certainly do that in your testing. Based on your requirements I think the only way to be sure is to test it yourself.

In any case, I've never heard of one being duplicated. I think the concern is whether or not they generate a random order. I have a number of different articles on the site on randomly selecting a record. I'd suggest you take a look through them. Many of them don't rely on a GUID.

This article(http://www.opennc.org/dce/info/draft-leach-uuids-guids-01.txt) describes a way to create a GUID. There was a post on a message board that indicated MS may have based their GUID generation on this methodology -- that's kind of sketchy though. I did a quick search of the MS site to see if there was an article on how they were generated but didn't find anything. I know Microsoft suggests using GUID's to randomly order rows.

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
Go to Top of Page

baldeep
Starting Member

18 Posts

Posted - 2002-12-03 : 16:35:38
Thanks Graz.

Actually I thought about it and my logic is flawed . The MAC Address and CPU ID don't change based on the record, so barring any strangeness in the GUID-generation algorithm, the ORDER of the records should be just as random as the order of any pseudo-random numbers on which the GUID is based. Based on what I know, the distribution of GUIDs on any given machine isn't necessarily uniform, though it could be....

--Baldeep

Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2002-12-03 : 17:23:31
Just a comment on this.

I have been able to generate a few duplicate GUID's in the past, but I had to generate them in a test app that generated them in an infinite loop. I was only able to create a few duplicates, and they were together (ie created at pretty much the same instant in time).

Bottom line, it's very hard to generate duplicate GUIDs.

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page
   

- Advertisement -