| Author |
Topic |
|
shapper
Constraint Violating Yak Guru
450 Posts |
Posted - 2011-04-24 : 17:30:10
|
| Hello,I am planning a SQL table where later I need to:1 - Get a random record. I know I can do this ordering by Guid.2 - Change the probability of each record to be selected. Using a scale with the multiples of 10 would be enough: 10%, 20%, ..., 90%, 100%.Is this possible?Thank You,Miguel |
|
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-04-24 : 18:50:47
|
| How do you want to specify the multiples for each row? Do you have a very small number and can do it individually?==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-04-24 : 21:26:06
|
To follow up on Nigel's question, if I understand your requirement::: when probability reaches 20%, one out of 5 records will be selected.:: when probability is 100% all the records will be selected.If that is true, could you do something simple like this?declare @TotalRecords int; set @TotalRecords = 1000;declare @probability float; set @probability = 0.2; -- 20% probability that a record will be selectedselect top (cast(@TotalRecords*@probability as int)) * from YourTable order by newid(); |
 |
|
|
shapper
Constraint Violating Yak Guru
450 Posts |
Posted - 2011-04-24 : 22:45:43
|
| Hi,Let me try to explain better:Each table's row has a column named weight.That weight can be 10, 20, 30, ..., 90, 100.So I want to randomly select ONE row but where the rows with weight 100 have a higher probability of being selected than the rows with weight 10.Does this make sense?Thank You,Miguel |
 |
|
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2011-04-24 : 23:50:20
|
is this good enough ? Hopefully your table is not too big and the "weight" is within reasonable limit.declare @sample table( row_no int identity(1,1), weight int, cnt int default 0)insert into @sample (weight) select 1insert into @sample (weight) select 2insert into @sample (weight) select 5insert into @sample (weight) select 6insert into @sample (weight) select 8declare @i int, @row_no intselect @i = 1-- runs it 1000 timewhile @i <= 1000begin select top 1 @row_no = row_no from @sample s cross join numbers n where n.number between 1 and s.weight order by newid() update @sample set cnt = cnt + 1 where row_no = @row_no select @i = @i + 1endselect *, weight_ratio = convert(decimal(10,2), weight * 100.0 / sum(weight) over()), cnt_ratio = convert(decimal(10,2), cnt * 100.0 / sum(cnt) over())from @sample Note : numbers is just a tally table. KH[spoiler]Time is always against us[/spoiler] |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-04-25 : 07:27:26
|
Here is my take on it. I am using the same data as KH, but its a different approach. What I am doing is:1. Calculate the running total of the weights. The range between the running total of a row and the running total of the next row "belongs" to the row.2. Calculate a random number less than max(running_total) and pick the row to which the random number "belongs".create table #tmp( row_no int identity(1,1), weight int, cnt int default 0);insert into #tmp (weight) select 1;insert into #tmp (weight) select 2;insert into #tmp (weight) select 5;insert into #tmp (weight) select 6;insert into #tmp (weight) select 8;with CTE as( select row_no, weight as running_total from #tmp where row_no = 1 union all select t.row_no, t.weight+c.running_total from #tmp t inner join CTE c on c.row_no+1 = t.row_no)update t set cnt = running_totalfrom #tmp t inner join CTE c on c.row_no = t.row_no; select * from #tmp;create unique index idx_cnt on #tmp(cnt);-- pick record to select.declare @n intselect @n = max(cnt) from #tmp;select top 1 *from #tmp where cnt >= abs(cast(cast(newid() as varbinary) as bigint))%(@n+1)order by cnt;drop table #tmp Edit: Added where row_no = 1 in the anchor portion of the CTE. |
 |
|
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2011-04-25 : 08:03:20
|
this is much more efficient than my method  KH[spoiler]Time is always against us[/spoiler] |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-04-25 : 12:19:09
|
But let us see if shapper thinks if either approach is any good before declaring victory |
 |
|
|
|