| Author |
Topic |
|
vwilsonjr
Starting Member
45 Posts |
Posted - 2002-12-03 : 13:00:24
|
| I have an employee database that I need to run a query against and return random employees. Any suggestions. It is for a program to pick people to go take a drug test. Government requires it to be totally random selections. |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-12-03 : 13:28:38
|
| select top 10 namefrom tblorder by newid()==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
|
baldeep
Starting Member
18 Posts |
Posted - 2002-12-03 : 14:53:43
|
| nr,I don't really understand GUIDs too well. Particularly I'm curious to know how these IDs are guaranteed to be unique. (Do I win a prize if I find two GUIDs that are the same? (-; )This may have implications to this application (generating a random ordering). How is the distribution guaranteed to be uniform?Does anyone know of any resources I can read up on to learn more about these?--Baldeep |
 |
|
|
Andraax
Aged Yak Warrior
790 Posts |
Posted - 2002-12-03 : 15:13:40
|
Hi baldeep. The GUID is based on the MAC-address of one of the servers network cards, and a timestamp from the CPU which is said to be unique.Of course, if two servers have the same network card MAC address, then they could generate a duplicate GUID. There's not a very big likelihood for that though, as duplicate MAC addresses are not very common. But they do exist.quote: How is the distribution guaranteed to be uniform?
Dunno exactly what you mean by uniform, but ordering by the guids generated by newid() should make it random. Random can of course still mean that all the rows are next to each other in physical storage or in (for example) alphabetical order. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-12-03 : 15:24:25
|
The algorithm used by Windows NT 4.0 (and earlier, I believe) generates sequential GUID values; Windows 2000 and higher generate random GUIDs. They are still unique under either OS however.FWIW, there are 2^128 possible GUID values. If you generated a quadrillion of these values every second, you would generate them all in about 10,760,801,423,071,600 years. Don't worry, you're not gonna create a duplicate. |
 |
|
|
graz
Chief SQLTeam Crack Dealer
4149 Posts |
Posted - 2002-12-03 : 15:39:15
|
quote: Does anyone know of any resources I can read up on to learn more about these?
Yep :) I'd suggest the article right here on SQLTeam at http://www.sqlteam.com/item.asp?ItemID=8747 . Good luck.===============================================Creating tomorrow's legacy systems today.One crisis at a time. |
 |
|
|
baldeep
Starting Member
18 Posts |
Posted - 2002-12-03 : 15:42:33
|
| Andraax and Rob,Thanks for the help.There may be 2^128 total possible GUID values, but if they are based on MAC Address and CPU ID I can see how the possible number for one computer could be limited. I'm just going to give this a simplistic treatment to make my point.MAC addresses are 6 bytes, so there are 2^48 possible MAC addresses. I believe CPUIDs are 64 bits. That's a lot of potential combinations of CPUIDs and MAC Addresses--2^112. So that means that each computer basically has a GUID-space of size 2^128/2^112 = 2^16.2^16 seems really small. But really it doesn't matter too much if I've arrived at the correct number or not. Besides it's been a long time since I did math. More important is the idea that the GUID-space for each computer is limited. If the GUID is generated randomly from this space, then the space itself must have a uniform distribution in order for the resultant GUIDs to be random and uniformly distributed.So, without more information, I wouldn't guarantee that "SELECT TOP * FROM Tbl ORDER BY NEWID()" will generate a random list of employees. There may be a skew in the results... no? Somebody please check my logic.--Baldeep |
 |
|
|
graz
Chief SQLTeam Crack Dealer
4149 Posts |
Posted - 2002-12-03 : 16:05:52
|
| If GUIDs are ordered by MAC address, CPUID, etc I think your comments may be correct. I don't think they are. I think those values are used to generate the GUID. In my testing, they appeared random. You can use the script I wrote to test this randomness and see if it meets your needs. I didn't apply a rigorous statistical definition of randomness. You could certainly do that in your testing. Based on your requirements I think the only way to be sure is to test it yourself.In any case, I've never heard of one being duplicated. I think the concern is whether or not they generate a random order. I have a number of different articles on the site on randomly selecting a record. I'd suggest you take a look through them. Many of them don't rely on a GUID.This article(http://www.opennc.org/dce/info/draft-leach-uuids-guids-01.txt) describes a way to create a GUID. There was a post on a message board that indicated MS may have based their GUID generation on this methodology -- that's kind of sketchy though. I did a quick search of the MS site to see if there was an article on how they were generated but didn't find anything. I know Microsoft suggests using GUID's to randomly order rows.===============================================Creating tomorrow's legacy systems today.One crisis at a time. |
 |
|
|
baldeep
Starting Member
18 Posts |
Posted - 2002-12-03 : 16:35:38
|
Thanks Graz.Actually I thought about it and my logic is flawed . The MAC Address and CPU ID don't change based on the record, so barring any strangeness in the GUID-generation algorithm, the ORDER of the records should be just as random as the order of any pseudo-random numbers on which the GUID is based. Based on what I know, the distribution of GUIDs on any given machine isn't necessarily uniform, though it could be....--Baldeep |
 |
|
|
MichaelP
Jedi Yak
2489 Posts |
Posted - 2002-12-03 : 17:23:31
|
| Just a comment on this.I have been able to generate a few duplicate GUID's in the past, but I had to generate them in a test app that generated them in an infinite loop. I was only able to create a few duplicates, and they were together (ie created at pretty much the same instant in time). Bottom line, it's very hard to generate duplicate GUIDs.Michael<Yoda>Use the Search page you must. Find the answer you will.</Yoda> |
 |
|
|
|