I have a list of items. Each of these items has its own probability.
Can anyone suggest an algorithm to pick an item based on its probability?
Sample code:
double p = Math.random();
double cumulativeProbability = 0.0;
for (Item item : items) {
cumulativeProbability += item.probability();
if (p <= cumulativeProbability) {
return item;
}
}
p
is any random number so how can we say that most probability item get selected first .. eg:[{A,10},{B,20}]
so how can you we say that suppose in first iteration p=2
so 2<=10
is true and first item {A,10}
is getting selected first even though second item have more probabilty
Sep 4, 2014 at 11:58
r
is a uniformly distributed random number, which means that the probability r
be a certain value is equal to all other values r
may be. Thus the items in the list are not "favored" and it doesn't matter where they are in the list.
Feb 5, 2016 at 14:51
So with each item store a number that marks its relative probability, for example if you have 3 items one should be twice as likely to be selected as either of the other two then your list will have:
[{A,1},{B,1},{C,2}]
Then sum the numbers of the list (i.e. 4 in our case). Now generate a random number and choose that index. int index = rand.nextInt(4); return the number such that the index is in the correct range.
Java code:
class Item {
int relativeProb;
String name;
//Getters Setters and Constructor
}
...
class RandomSelector {
List<Item> items = new List();
Random rand = new Random();
int totalSum = 0;
RandomSelector() {
for(Item item : items) {
totalSum = totalSum + item.relativeProb;
}
}
public Item getRandom() {
int index = rand.nextInt(totalSum);
int sum = 0;
int i=0;
while(sum < index ) {
sum = sum + items.get(i++).relativeProb;
}
return items.get(Math.max(0,i-1));
}
}
p
is any random number so how can we say that most probability item get selected first .. eg:[{A,10},{B,20}]
so how can you we say that suppose in first iteration p=2
so 2<=10
is true and first item {A,10}
is getting selected first even though second item have more probabilty
Sep 4, 2014 at 12:00
int index = rand.nextInt(totalSum)
with int index = rand.nextInt(totalSum) + 1
and then of course took out the redundant Math.max
. Without this +1 I was getting twice as many of the "base" (and first) element in the list than expected, i.e. the first element in my list has a relative likelihood 1 by default, and everything else is with respect to that.
pretend that we have the following list
Item A 25%
Item B 15%
Item C 35%
Item D 5%
Item E 20%
Lets pretend that all the probabilities are integers, and assign each item a "range" that calculated as follows.
Start - Sum of probability of all items before
End - Start + own probability
The new numbers are as follows
Item A 0 to 25
Item B 26 to 40
Item C 41 to 75
Item D 76 to 80
Item E 81 to 100
Now pick a random number from 0 to 100. Lets say that you pick 32. 32 falls in Item B's range.
mj
You can try the Roulette Wheel Selection.
First, add all the probabilities, then scale all the probabilities in the scale of 1, by dividing each one by the sum. Suppose the scaled probabilities are A(0.4), B(0.3), C(0.25) and D(0.05)
. Then you can generate a random floating-point number in the range [0, 1)
. Now you can decide like this:
random number in [0.00, 0.40) -> pick A
in [0.40, 0.70) -> pick B
in [0.70, 0.95) -> pick C
in [0.95, 1.00) -> pick D
You can also do it with random integers - say you generate a random integer between 0 to 99 (inclusive), then you can make decision like the above.
if 0.00 <= p < 0.40: pick A, else if 0.40 <= p < 0.70: pick B, ...
.
Mar 6, 2022 at 1:49
Algorithm described in Ushman's, Brent's and @kaushaya's answers are implemented in Apache commons-math library.
Take a look at EnumeratedDistribution class (groovy code follows):
def probabilities = [
new Pair<String, Double>("one", 25),
new Pair<String, Double>("two", 30),
new Pair<String, Double>("three", 45)]
def distribution = new EnumeratedDistribution<String>(probabilities)
println distribution.sample() // here you get one of your values
Note that sum of probabilities doesn't need to be equal to 1 or 100 - it will be normalized automatically.
My method is pretty simple. Generate a random number. Now since the probabilities of your items are known,simply iterate through the sorted list of probability and pick the item whose probability is lesser than the randomly generated number.
For more details,read my answer here.
A slow but simple way to do it is to have every member to pick a random number based on its probability and pick the one with highest value.
Analogy:
Imagine 1 of 3 people needs to be chosen but they have different probabilities. You give them die with different amount of faces. First person's dice has 4 face, 2nd person's 6, and the third person's 8. They roll their die and the one with the biggest number wins.
Lets say we have the following list:
[{A,50},{B,100},{C,200}]
Pseudocode:
A.value = random(0 to 50);
B.value = random(0 to 100);
C.value = random (0 to 200);
We pick the one with the highest value.
This method above does not exactly map the probabilities. For example 100 will not have twice the chance of 50. But we can do it in a by tweaking the method a bit.
Method 2
Instead of picking a number from 0 to the weight we can limit them from the upper limit of previous variable to addition of the current variable.
[{A,50},{B,100},{C,200}]
Pseudocode:
A.lowLimit= 0; A.topLimit=50;
B.lowLimit= A.topLimit+1; B.topLimit= B.lowLimit+100
C.lowLimit= B.topLimit+1; C.topLimit= C.lowLimit+200
resulting limits
A.limits = 0,50
B.limits = 51,151
C.limits = 152,352
Then we pick a random number from 0 to 352 and compare it to each variable's limits to see whether the random number is in its limits.
I believe this tweak has better performance since there is only 1 random generation.
There is a similar method in other answers but this method does not require the total to be 100 or 1.00.
A space-costly way is to clone each item the number of times its probability. Selection will be done in O(1).
For example
//input
[{A,1},{B,1},{C,3}]
// transform into
[{A,1},{B,1},{C,1},{C,1},{C,1}]
Then simply pick any item randomly from this transformed list.
Brent's answer is good, but it doesn't account for the possibility of erroneously choosing an item with a probability of 0 in cases where p = 0. That's easy enough to handle by checking the probability (or perhaps not adding the item in the first place):
double p = Math.random();
double cumulativeProbability = 0.0;
for (Item item : items) {
cumulativeProbability += item.probability();
if (p <= cumulativeProbability && item.probability() != 0) {
return item;
}
}
Adapted the code from https://stackoverflow.com/a/37228927/11257746 into a general extention method. This will allow you to get a weighted random value from a Dictionary with the structure <TKey, int>, where int is a weight value.
A Key that has a value of 50 is 10 times more likely to be chosen than a key with the value of 5.
C# code using LINQ:
/// <summary>
/// Get a random key out of a dictionary which has integer values treated as weights.
/// A key in the dictionary with a weight of 50 is 10 times more likely to be chosen than an element with the weight of 5.
///
/// Example usage to get 1 item:
/// Dictionary<MyType, int> myTypes;
/// MyType chosenType = myTypes.GetWeightedRandomKey<MyType, int>().First();
///
/// Adapted into a general extention method from https://stackoverflow.com/a/37228927/11257746
/// </summary>
public static IEnumerable<TKey> GetWeightedRandomKey<TKey, TValue>(this Dictionary<TKey, int> dictionaryWithWeights)
{
int totalWeights = 0;
foreach (KeyValuePair<TKey, int> pair in dictionaryWithWeights)
{
totalWeights += pair.Value;
}
System.Random random = new System.Random();
while (true)
{
int randomWeight = random.Next(0, totalWeights);
foreach (KeyValuePair<TKey, int> pair in dictionaryWithWeights)
{
int weight = pair.Value;
if (randomWeight - weight > 0)
randomWeight -= weight;
else
{
yield return pair.Key;
break;
}
}
}
}
Example usage:
public enum MyType { Thing1, Thing2, Thing3 }
public Dictionary<MyType, int> MyWeightedDictionary = new Dictionary<MyType, int>();
public void MyVoid()
{
MyWeightedDictionary.Add(MyType.Thing1, 50);
MyWeightedDictionary.Add(MyType.Thing2, 25);
MyWeightedDictionary.Add(MyType.Thing3, 5);
// Get a single random key
MyType myChosenType = MyWeightedDictionary.GetWeightedRandomKey<MyType, int>().First();
// Get 20 random keys
List<MyType> myChosenTypes = MyWeightedDictionary.GetWeightedRandomKey<MyType, int>().Take(20).ToList();
}
If you don't mind adding a third party dependency in your code you can use the MockNeat.probabilities() method.
For example:
String s = mockNeat.probabilites(String.class)
.add(0.1, "A") // 10% chance to pick A
.add(0.2, "B") // 20% chance to pick B
.add(0.5, "C") // 50% chance to pick C
.add(0.2, "D") // 20% chance to pick D
.val();
Disclaimer: I am the author of the library, so I might be biased when I am recommending it.
All mentioned solutions have linear effort. The following has only logarithmic effort and deals also with unnormalized probabilities. I'd reccommend to use a TreeMap rather than a List:
import java.util.*;
import java.util.stream.IntStream;
public class ProbabilityMap<T> extends TreeMap<Double,T>{
private static final long serialVersionUID = 1L;
public static Random random = new Random();
public double sumOfProbabilities;
public Map.Entry<Double,T> next() {
return ceilingEntry(random.nextDouble()*sumOfProbabilities);
}
@Override public T put(Double key, T value) {
return super.put(sumOfProbabilities+=key, value);
}
public static void main(String[] args) {
ProbabilityMap<Integer> map = new ProbabilityMap<>();
map.put(0.1,1); map.put(0.3,3); map.put(0.2,2);
IntStream.range(0, 10).forEach(i->System.out.println(map.next()));
}
}
You could use the Julia code:
function selrnd(a::Vector{Int})
c = a[:]
sumc = c[1]
for i=2:length(c)
sumc += c[i]
c[i] += c[i-1]
end
r = rand()*sumc
for i=1:length(c)
if r <= c[i]
return i
end
end
end
This function returns the index of an item efficiently.