All Questions
Tagged with probability statistics
612
questions
596
votes
15
answers
76k
views
Cosmic Rays: what is the probability they will affect a program?
Once again I was in a design review, and encountered the claim that the probability of a particular scenario was "less than the risk of cosmic rays" affecting the program, and it occurred to me that I ...
124
votes
10
answers
316k
views
How to calculate probability in a normal distribution given mean & standard deviation?
How to calculate probability in normal distribution given mean, std in Python? I can always explicitly code my own function according to the definition like the OP in this question did: Calculating ...
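A minimal standard-library sketch of the calculation asked about, assuming the quantity wanted is P(X <= x) (or the probability of an interval) for a normal variable. The helper names `normal_cdf` and `normal_interval_probability` are hypothetical and do not come from the question.

```python
import math


def normal_cdf(x, mean=0.0, std=1.0):
    """Return P(X <= x) for X ~ Normal(mean, std), via the error function."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (x - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def normal_interval_probability(a, b, mean=0.0, std=1.0):
    """Return P(a < X <= b) as a difference of two CDF values."""
    return normal_cdf(b, mean, std) - normal_cdf(a, mean, std)


if __name__ == "__main__":
    # Probability of landing within one standard deviation of the mean.
    print(normal_interval_probability(-1.0, 1.0))
```

Using `math.erf` avoids any third-party dependency; a statistics library would be the more usual choice when one is already installed.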
79
votes
14
answers
41k
views
Select k random elements from a list whose elements have weights
Selecting without any weights (equal probabilities) is beautifully described here.
I was wondering if there is a way to convert this approach to a weighted one.
I am also interested in other ...
33
votes
7
answers
47k
views
How do I programmatically calculate Poker Odds? [closed]
I'm trying to write a simple game/utility to calculate poker odds. I know there's plenty of resources that talk about the formulas to do so, but I guess I'm having trouble translating that to code. ...
30
votes
9
answers
13k
views
How to do weighted random sample of categories in python
Given a list of tuples where each tuple consists of a probability and an item I'd like to sample an item according to its probability. For example, give the list [ (.3, 'a'), (.4, 'b'), (.3, 'c')] I'd ...
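One possible approach, sketched with the standard library's `random.choices`. The function name `sample_item` is an assumption made for illustration, and the weights are treated as relative (they need not sum to exactly 1).

```python
import random


def sample_item(pairs, rng=random):
    """Pick one item from (probability, item) tuples, proportionally to probability."""
    weights = [p for p, _ in pairs]
    items = [item for _, item in pairs]
    # random.choices returns a list of k picks; take the single element.
    return rng.choices(items, weights=weights, k=1)[0]


if __name__ == "__main__":
    pairs = [(.3, 'a'), (.4, 'b'), (.3, 'c')]
    print(sample_item(pairs))
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible.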
27
votes
8
answers
8k
views
Select random row from a PostgreSQL table with weighted row probabilities
Example input:
SELECT * FROM test;
 id | percent
----+---------
  1 |      50
  2 |      35
  3 |      15
(3 rows)
How would you write a query such that, on average, 50% of the time I get the row with id=...
26
votes
4
answers
43k
views
Confidence interval for binomial data in R?
I know that I need mean and s.d to find the interval, however, what if the question is:
For a survey of 1,000 randomly chosen workers, 520 of them are female. Create a 95% confidence interval for the ...
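The question concerns R, but the arithmetic can be sketched in a few lines of Python. This assumes the interval wanted is for the proportion of female workers (the excerpt is cut off before saying so) and uses the simple normal-approximation (Wald) interval, which is only one of several possible methods. The name `wald_interval` is hypothetical.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% coverage.
    """
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    low, high = wald_interval(520, 1000)
    print(f"{low:.3f} to {high:.3f}")
```

For small samples or proportions near 0 or 1, an exact or Wilson interval is usually preferred over this approximation.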
26
votes
9
answers
4k
views
What's the best way to unit test code that generates random output?
Specifically, I've got a method that picks n items from a list in such a way that a% of them meet one criterion, and b% meet a second, and so on. A simplified example would be to pick 5 items where 50% ...
24
votes
7
answers
12k
views
Computing similarity between two lists
EDIT:
As everyone is getting confused, I want to simplify my question. I have two ordered lists. Now, I just want to compute how similar one list is to the other.
E.g.,
1,7,4,5,8,9
1,7,5,4,9,6
What ...
23
votes
2
answers
21k
views
Fitting distributions, goodness of fit, p-value. Is it possible to do this with Scipy (Python)?
INTRODUCTION: I'm a bioinformatician. In my analysis which I perform on all human genes (about 20 000) I search for a particular short sequence motif to check how many times this motif occurs in each ...
22
votes
10
answers
4k
views
Representing continuous probability distributions
I have a problem involving a collection of continuous probability distribution functions, most of which are determined empirically (e.g. departure times, transit times). What I need is some way of ...
22
votes
2
answers
7k
views
PyMC3 Bayesian Linear Regression prediction with sklearn.datasets
I've been trying to implement Bayesian Linear Regression models using PyMC3 with REAL DATA (i.e. not from linear function + gaussian noise) from the datasets in sklearn.datasets. I chose the ...
22
votes
1
answer
15k
views
How to simulate bimodal distribution?
I have the following code to generate a bimodal distribution, but when I graph the histogram I don't see the 2 modes. I am wondering if there's something wrong with my code.
mu1 <- log(1)
mu2 <- ...
18
votes
6
answers
8k
views
Nth Combination
Is there a direct way of getting the Nth combination of an ordered set of all combinations of nCr?
Example: I have four elements: [6, 4, 2, 1]. All the possible combinations by taking three at a time ...
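A direct computation is possible with the combinatorial number system. The sketch below assumes the combinations are ordered the way `itertools.combinations` emits them (lexicographic by position in the input) and uses a 0-based index; the question may intend a different ordering. It follows the same idea as the `nth_combination` recipe in the Python `itertools` documentation.

```python
import math


def nth_combination(iterable, r, index):
    """Return the combination at position `index` (0-based) without enumerating the rest."""
    pool = tuple(iterable)
    n = len(pool)
    c = math.comb(n, r)  # total number of combinations
    if index < 0:
        index += c
    if index < 0 or index >= c:
        raise IndexError("combination index out of range")
    result = []
    while r:
        # c becomes the number of combinations that start with the current element.
        c, n, r = c * r // n, n - 1, r - 1
        while index >= c:
            # Skip every combination that starts with the current element.
            index -= c
            c, n = c * (n - r) // n, n - 1
        result.append(pool[-1 - n])
    return tuple(result)


if __name__ == "__main__":
    print(nth_combination([6, 4, 2, 1], 3, 0))
```

The cost is proportional to the size of the pool rather than to the number of combinations.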
16
votes
4
answers
24k
views
Probability of 64bit Hash Code Collisions
The book Numerical Recipes offers a method to calculate 64bit hash codes in order to reduce the number of collisions.
The algorithm is shown at http://www.javamex.com/tutorials/collections/...
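As a rough guide to the collision probability in the title, the birthday approximation can be sketched as follows. It assumes an idealized hash whose outputs are uniformly distributed over all 2^64 values, which a real hash function only approximates; the function name `collision_probability` is hypothetical.

```python
import math


def collision_probability(num_keys, hash_bits=64):
    """Approximate P(at least one collision) among num_keys uniformly hashed keys.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    if num_keys < 2:
        return 0.0
    space = 2 ** hash_bits
    exponent = num_keys * (num_keys - 1) / (2 * space)
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 2**32):
        print(n, collision_probability(n))
```

Under this model the probability stays tiny for millions of keys and approaches 40% only around 2^32 keys.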
14
votes
3
answers
11k
views
how to show that NDCG score is significant
Suppose the NDCG score for my retrieval system is .8. How do I interpret this score? How do I tell the reader that this score is significant?
14
votes
5
answers
3k
views
Choose random array element satisfying certain property
Suppose I have a list, called elements, each of which does or does not satisfy some boolean property p. I want to choose one of the elements that satisfies p at random with uniform distribution. I ...
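One way to do this in a single pass, without first building the filtered list, is reservoir sampling with a reservoir of size one. This is a sketch of that technique, not necessarily what the asker had in mind; `choose_satisfying` is a hypothetical name.

```python
import random


def choose_satisfying(elements, p, rng=random):
    """Return a uniformly random element for which p(element) is true.

    Returns None when no element satisfies p (ambiguous if None is itself
    a valid element, so callers may prefer a sentinel in that case).
    """
    chosen = None
    seen = 0  # number of satisfying elements encountered so far
    for element in elements:
        if p(element):
            seen += 1
            # Replace the current pick with probability 1/seen.
            if rng.randrange(seen) == 0:
                chosen = element
    return chosen


if __name__ == "__main__":
    print(choose_satisfying(range(10), lambda x: x % 2 == 0))
```

Each satisfying element ends up chosen with probability 1/k, where k is the number of satisfying elements, and p is evaluated exactly once per element.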
12
votes
6
answers
4k
views
How to generate correlated binary variables
I need to generate a series of N random binary variables with a given correlation function. Let x = {xi} be a series of binary variables (taking the value 0 or 1, i running from 1 to N). The marginal ...
11
votes
5
answers
25k
views
Which java-library computes the cumulative standard normal distribution function?
For a project I have a specification with formulas that I have to implement. These formulas include a cumulative standard normal distribution function that takes a float and outputs a probability. The ...
11
votes
1
answer
52k
views
Distribution plot of an array
I have a numpy array containing float values in [-10..10]. I would like to plot a distribution-graph of the values, like this (here it is done for a binomial random variable) :
For example I would ...
11
votes
2
answers
5k
views
Uniform distribution from a fractal Perlin noise function in C#
My Perlin noise function (which adds up 6 octaves of 3D simplex at 0.75 persistence) generates a 2D array of doubles.
These numbers each come out normalized to [-1, 1], with mean at 0. I clamp ...
9
votes
2
answers
537
views
Student's t-distribution CDF R base documentation
In the context of the Student's t-distribution cumulative distribution function, R Version 4.3.1's ?dt documentation highlights the following result:
However, upon attempting to verify the accuracy ...
8
votes
1
answer
1k
views
DistributionFitTest[] for custom distributions in Mathematica
I have PDFs and CDFs for two custom distributions, a means of generating RandomVariates for each, and code for fitting parameters to data. Some of this code I've posted previously at:
Calculating ...
8
votes
3
answers
10k
views
Determining if the difference between two error values is significant
I'm evaluating a number of different algorithms whose job is to predict the probability of an event occurring.
I am testing the algorithms on large-ish datasets. I measure their effectiveness using "...
8
votes
3
answers
6k
views
Finding stationary distribution of a markov process given a transition probability matrix
There have been two threads related to this issue on Stack Overflow:
How can I obtain stationary distribution of a Markov Chain given a transition probability matrix describes what a transition ...
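For reference, the stationary distribution in the title can be approximated by power iteration. This sketch assumes a row-stochastic matrix (each row sums to 1) for a chain that is irreducible and aperiodic, so that the iteration converges; the function name and tolerances are illustrative choices.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate pi with pi = pi * P by repeatedly applying P to a start vector."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


if __name__ == "__main__":
    P = [[0.9, 0.1],
         [0.5, 0.5]]
    print(stationary_distribution(P))
```

For large matrices, solving the linear system or using an eigenvector routine from a numerical library is usually faster than this plain loop.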
8
votes
4
answers
6k
views
Calculating pdf of Dirichlet distribution in python
I'd like to calculate the pdf for the Dirichlet distribution in python, but haven't been able to find code to do so in any kind of standard library. scipy.stats includes a long list of distributions ...
8
votes
4
answers
6k
views
how to numerically sample from a joint, discrete, probability distribution function
I have a 2D "heat map" or PDF that I need to recreate by random sampling. I.e., I have a 2D probability density map showing starting locations. I need to randomly choose starting locations with the ...
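A simple way to draw from such a map is to flatten it, sample cell indices in proportion to their values, and convert the indices back to (row, column) pairs. The sketch assumes the map is a rectangular list of rows with non-negative entries and a positive total; `sample_cells` is a hypothetical name.

```python
import random


def sample_cells(density, k, rng=random):
    """Draw k (row, col) cells with probability proportional to density[row][col]."""
    n_cols = len(density[0])
    flat = [w for row in density for w in row]  # row-major flattening
    picks = rng.choices(range(len(flat)), weights=flat, k=k)
    # divmod maps a flat index back to (row, col).
    return [divmod(i, n_cols) for i in picks]


if __name__ == "__main__":
    heat_map = [[0.1, 0.2],
                [0.3, 0.4]]
    print(sample_cells(heat_map, 5))
```

The weights do not need to be normalized, since `random.choices` only uses their relative sizes.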
8
votes
5
answers
3k
views
Histogram matching - image processing - c/c++
I have two histograms.
int Hist1[10] = {1,4,3,5,2,5,4,6,3,2};
int Hist2[10] = {1,4,3,15,12,15,4,6,3,2};
Hist1's distribution is of type multi-modal;
Hist2's distribution is of type uni-modal with ...
7
votes
6
answers
14k
views
Is Python's random.randint statistically random?
So I'm testing and calculating the probabilities of certain dice rolls, for a game.
The base case is rolling one 10-sided die.
I did a million samples of this, and ended up with the following ...
7
votes
6
answers
5k
views
Ruby: Using rand() in code but writing tests to verify probabilities
I have some code which delivers things based on weighted random selection. Things with more weight are more likely to be randomly chosen. Now, being a good Rubyist, I of course want to cover all this code with ...
7
votes
5
answers
13k
views
Create constrained random numbers?
CLEANED UP TEXT:
How can I create m=5 random numbers that add up to, say, n=100? But the first random number is, say, 10 < x1 < 30, the second random number is 5 < x2 < 20, the third random ...
7
votes
3
answers
2k
views
Group detection in data sets
Assume a group of data points, such as one plotted here (this graph isn't specific to my problem, but just used as a suitable example):
Inspecting the scatter graph visually, it's fairly obvious the ...
7
votes
1
answer
7k
views
How can I sample a multivariate log-normal distribution in Python?
Using Python, how can I sample data from a multivariate log-normal distribution? For instance, for a multivariate normal, there are two options. Let's assume we have a 3 x 3 covariance matrix and a 3-...
7
votes
4
answers
4k
views
Select x random elements from a weighted list in C# (without replacement)
Update: my problem has been solved, and I updated the source code in my question to match Jason's answer. Note that rikitikitik's answer solves the issue of picking cards from a sample with ...
7
votes
1
answer
2k
views
Why do the inverse t-distributions for small values differ in Matlab and R?
I would like to evaluate the inverse Student's t-distribution function for small values, e.g., 1e-18, in Matlab. The degrees of freedom is 2.
Unfortunately, Matlab returns NaN:
tinv(1e-18,2)
NaN
...
7
votes
1
answer
2k
views
What are the ways of deciding probabilities in hidden markov models?
I am starting to learn hidden Markov models, and on the wiki page as well as on GitHub there are a lot of examples, but most of the probabilities are already there (70% chance of rain, 30% chance of ...
6
votes
3
answers
345
views
How to approach this algorithm question?
A website has a database of n questions.
You click a button and are shown one random question per click. The probability of a particular question showing up at the click event is 1/n.
On average, how ...
6
votes
1
answer
13k
views
Kolmogorov-Smirnov test
I'm using the R function ks.test() to test the Uniform distribution of the R random number generator. I'm using the following code:
replicate(100000, ks.test(runif(n), y="punif")).
When n is less than ...
6
votes
4
answers
18k
views
Python: Selecting numbers with associated probabilities [duplicate]
Possible Duplicates:
Random weighted choice
Generate random numbers with a given (numerical) distribution
I have a list of lists which contains a series of numbers and their associated ...
6
votes
5
answers
4k
views
Efficient Method for Calculating the Probability of a Set of Outcomes?
Let's say I'm playing 10 different games. For each game, I know the probability of winning, the probability of tying, and the probability of losing (each game has different probabilities).
From ...
6
votes
2
answers
2k
views
C++ - critical values probability distribution
Where can I find reliable code for critical values of probability distributions? For instance, F critical values for the Fisher test...?
Thanks for any relevant reference.
6
votes
4
answers
514
views
Probability distribution for sms answer delays
I'm writing an app that uses SMS for communication.
I have chosen to subscribe to an SMS gateway, which provides me with an API for doing so.
The API has functions for sending as well as pulling new ...
6
votes
2
answers
1k
views
c++ discrete distribution sampling with frequently changing probabilities
Problem: I need to sample from a discrete distribution constructed of certain weights e.g. {w1,w2,w3,..}, and thus probability distribution {p1,p2,p3,...}, where pi=wi/(w1+w2+...).
some of wi's ...
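The question targets C++, but the underlying idea can be sketched compactly in Python. One structure that is often used when weights change frequently is a Fenwick (binary indexed) tree of partial sums, which gives O(log n) updates and O(log n) draws. The class below is an illustrative sketch with hypothetical names, not a drop-in replacement for any library type.

```python
import random


class FenwickSampler:
    """Discrete sampler with O(log n) weight updates and O(log n) draws."""

    def __init__(self, weights):
        self.n = len(weights)
        self.weights = [0.0] * self.n
        self.tree = [0.0] * (self.n + 1)  # 1-based Fenwick tree of partial sums
        for i, w in enumerate(weights):
            self.update(i, w)

    def update(self, i, new_weight):
        """Set the weight of index i (0-based) to new_weight."""
        delta = new_weight - self.weights[i]
        self.weights[i] = new_weight
        j = i + 1
        while j <= self.n:
            self.tree[j] += delta
            j += j & -j

    def total(self):
        """Sum of all weights, read from the tree."""
        s, j = 0.0, self.n
        while j > 0:
            s += self.tree[j]
            j -= j & -j
        return s

    def sample(self, rng=random):
        """Return index i with probability weights[i] / total()."""
        total = self.total()
        if total <= 0:
            raise ValueError("total weight must be positive")
        target = rng.random() * total
        pos = 0
        step = 1 << (self.n.bit_length() - 1)
        # Descend the implicit tree to the first prefix sum exceeding target.
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] <= target:
                pos = nxt
                target -= self.tree[nxt]
            step >>= 1
        # Clamp guards against floating-point rounding at the upper edge.
        return min(pos, self.n - 1)


if __name__ == "__main__":
    sampler = FenwickSampler([1.0, 2.0, 3.0])
    sampler.update(0, 10.0)  # weight change without rebuilding the structure
    print(sampler.sample())
```

Repeated floating-point updates can accumulate rounding error in the partial sums, so rebuilding the tree occasionally may be worthwhile in long-running use.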
5
votes
1
answer
5k
views
Scipy Multivariate Normal: How to draw deterministic samples?
I am using scipy.stats.multivariate_normal to draw samples from a multivariate normal distribution. Like this:
from scipy.stats import multivariate_normal
# Assume we have means and covs
mn = ...
5
votes
2
answers
1k
views
Fastest approximate counting algorithm
What's the fastest way to get an approximate count of the number of rows in an input file or stdout data stream? FYI, this is a probabilistic algorithm, and I can't find many examples online.
The data could ...
5
votes
1
answer
2k
views
How to compute CDF probability of normal distribution in C++?
Is there any function that allows me to compute the CDF probability of a normal distribution, given a mean and sigma? I.e., for example, P( X < x ) given the normal distribution with $\bar{x}$ and $\...
5
votes
1
answer
7k
views
Calculate F-distribution p values in python?
Suppose that I have an F value and the associated degrees of freedom, df1 and df2. How can I use python to programmatically calculate the p value associated with these numbers?
Note: I would not ...
5
votes
4
answers
992
views
Random numbers probability
I am trying to randomly choose from e.g. 4 numbers. I need to compare the probability of these 2 algorithms.
1#
int a = random.Next(0, 4);
if (a == 0)
...
5
votes
2
answers
796
views
How can I find out how many rows of a matrix satisfy a rather complicated criterion (in R)?
As an example, here is a way to get a matrix of all possible outcomes of rolling 4 (fair) dice.
z <- as.matrix(expand.grid(c(1:6),c(1:6),c(1:6),c(1:6)))
As you may already have understood, I'm ...
5
votes
7
answers
248
views
algorithms to evaluate user responses
I'm working on a web application which will be used for classifying photos of automobiles. The users will be presented with photos of various vehicles, and will be asked to answer a series of ...