Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Random Selection

Author  Topic 

nice123ej
Starting Member

48 Posts

Posted - 2007-07-08 : 09:23:15
We can use Top 10 in Select statment
i heared we can select Random 10 in SQL Server 2005
is that true?!

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-08 : 09:27:56
you could do the same in the SS2000

select top 10 ...
from ...
order by newid()

or you can use the new TABLESAMPLE clause

SELECT *
FROM myTable TABLESAMPLE SYSTEM (10 ROWS) ;



_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-07-08 : 11:23:11
1) TABLESAMPLE cannot be applied to derived tables, tables from linked servers, and tables derived from table-valued functions, rowset functions, or OPENXML. TABLESAMPLE cannot be specified in the definition of a view or an inline table-valued function.

2) TABLESAMPLE was introduced in SQL Server 2005. When TABLESAMPLE is used against databases that are upgraded from an earlier version, the compatibility level of the database must be set to at least 90. To set the database compatibility level, see ALTER DATABASE (Transact-SQL).

3) If you really want a random sample of individual rows, modify your query to filter out rows randomly, instead of using TABLESAMPLE.

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-07-08 : 11:25:27
All above can be read in Books Online for "Limiting Result Sets by Using TABLESAMPLE".


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2007-07-08 : 16:00:39
So what you are saying is that ORDER BY NEWID() is still the way to go?

-- If you give someone a program, you will frustrate them for a day; if you teach them how to program, you will frustrate them for a lifetime.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-07-08 : 16:17:40
quote:
B. Selecting a percentage of rows with a seed value
The following statement returns the same set of rows every time that it is executed. The seed value of 205 was chosen arbitrarily.

Copy Code
USE AdventureWorks ;
GO
SELECT FirstName, LastName
FROM Person.Contact
TABLESAMPLE (10 PERCENT)
REPEATABLE (205) ;

quote:
Important:
The TABLESAMPLE SYSTEM clause should be used with some caution, and with some understanding of some of the implications of using sampling. For example, a join of two tables is likely to return a match for each row in both tables; however, if TABLESAMPLE SYSTEM is specified for either of the two tables, some rows returned from the unsampled table are unlikely to have a matching row in the sampled table. This behavior might lead you to suspect that a data consistency problem exists in the underlying tables, when the data is actually valid. Similarly, if TABLESAMPLE SYSTEM is specified for both tables that are joined, the perceived problem could be even worse.


Even Microsoft seems to prefer NEWID()
quote:
Important:
If you really want a random sample of individual rows, modify your query to filter out rows randomly, instead of using TABLESAMPLE. For example, the following query uses the NEWID function to return approximately one percent of the rows of the Sales.SalesOrderDetail table:

SELECT * FROM Sales.SalesOrderDetail

WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)

/ CAST (0x7fffffff AS int)

The SalesOrderID column is included in the CHECKSUM expression so that NEWID() evaluates once per row to achieve sampling on a per-row basis. The expression CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float / CAST (0x7fffffff AS int) evaluates to a random float value between 0 and 1.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2007-07-08 : 16:54:20
quote:
Originally posted by Microsoft

SELECT * FROM Sales.SalesOrderDetail 
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float) / CAST (0x7fffffff AS int)

What on earth is wrong with:[code]SELECT TOP 1 PERCENT * FROM Sales.SalesOrderDetail ORDER BY NEWID()
Wont both have to run through all rows any way, guess it has to do with unnesserey sorting, have to check a query plan, just not tonight ...


-- If you give someone a program, you will frustrate them for a day; if you teach them how to program, you will frustrate them for a lifetime.
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-08 : 17:03:19
it's interesting.... not very surprising though.

SELECT * FROM Sales.SalesOrderDetail
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float) / CAST (0x7fffffff AS int)
/*
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.

(1249 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 1238, physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

---------------------------------------------------------------------------------------------------------
SELECT TOP 1 PERCENT *
FROM Sales.SalesOrderDetail
ORDER BY NEWID()
/*
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 148 ms.

(1214 row(s) affected)
Table 'SalesOrderDetail'. Scan count 3, logical reads 1356, physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 1, logical reads 346929,
physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/


note the difference in affected rows. teh MS prefered way gives a different rowcount every time.
also the first exec plan has a query cost (relative to the batch) of 14% and the second one has 86%

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-07-08 : 17:42:20
Well, the first query produces a random number between 0 and 1. If we are lucky ALL numbers are less than 0.01 (not very likely in theory nor practice). The distribution of generated value is not that high.



Peter Larsson
Helsingborg, Sweden
Go to Top of Page

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2007-07-08 : 18:50:02
What you guys doing this late here on a sunday night, go to sleep.

-- If you give someone a program, you will frustrate them for a day; if you teach them how to program, you will frustrate them for a lifetime.
Go to Top of Page
   

- Advertisement -