get random samples - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

get random samples

Author

Topic

xpandre
Posting Yak Master

212 Posts

Posted - 2011-05-11 : 12:56:37

Hi,

What would be the most efficient way to get a random sample set of data from a huge set of data from a table.

For eg :
I need to get the no of call made by a subscriber between date 01/01/2011 and 21/04/2011. In my DB, I would have more than a million records. But I want just 2000 of that. But I need the dates of the call to be truly scattered in the time interval given.
like I need to have at least 1 record from Jan, 1 record from Feb, 1 record from March and 1 record from April.

I tried using NEWID() --> But this is very costly.
I tried using BINARY_CHECKSUM --> But this for some reason doesn't get me the sample as i want..like it gets me all 2000 from Jan and nothing from other months.

Thank you guys in advance
Sam

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-11 : 13:08:58

Is the cost for the newid() method in generation of the newid, or in the sorting based on newid? If it is in the sorting, you might try crypt_gen_random function instead of newid. It generates hexadecimal number (of any length you specify), so sorting may be faster.

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-11 : 16:04:23

http://msdn.microsoft.com/en-us/library/cc441928.aspx
http://msdn.microsoft.com/en-us/library/ms189108.aspx

xpandre
Posting Yak Master

212 Posts

Posted - 2011-05-12 : 03:47:23

Thank you :-)..i shall try these and let you know my results..

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources