Questions tagged [probability]

Consider if your question would be better at stats.stackexchange.com. Probability touches upon uncertainty, random phenomena, random numbers, random variables, probability distributions, sampling, combinatorics.

25 votes
4 answers
13k views

Is there a Python equivalent to R's sample() function?

I want to know if Python has an equivalent to the sample() function in R. The sample() function takes a sample of the specified size from the elements of x using either with or without replacement. ...
Bilal's user avatar
  • 3,082
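
For readers skimming this entry, here is a minimal sketch of how R's sample() might be approximated with Python's standard library. The wrapper name `sample` and its `replace` flag are made up here to mirror R's signature; only `random.sample` and `random.choices` are real library calls.

```python
import random

def sample(x, size, replace=False, rng=random):
    """Rough analogue of R's sample(x, size, replace); hypothetical helper."""
    x = list(x)
    if replace:
        # With replacement: every draw is independent, so size may exceed len(x).
        return rng.choices(x, k=size)
    # Without replacement: raises ValueError if size > len(x).
    return rng.sample(x, size)

print(sample(range(1, 11), 3))               # three distinct values
print(sample(["H", "T"], 5, replace=True))   # five coin flips
```

Unlike R, this sketch does not take per-element probabilities; `random.choices` accepts a `weights` argument if that is needed for the with-replacement case.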
24 votes
3 answers
28k views

How to properly hash the custom struct?

In the C++ language there is the default hash-function template std::hash<T> for the simplest types, like std::string, int, etc. I suppose that these functions have good entropy and the ...
abyss.7's user avatar
  • 14.2k
24 votes
7 answers
12k views

Computing similarity between two lists

EDIT: as everyone is getting confused, I want to simplify my question. I have two ordered lists. Now, I just want to compute how similar one list is to the other. Eg, 1,7,4,5,8,9 1,7,5,4,9,6 What ...
user1221572's user avatar
23 votes
5 answers
47k views

Algorithm to generate Poisson and binomial random numbers?

I've been looking around, but I'm not sure how to do it. I've found this page which, in the last paragraph, says: A simple generator for random numbers taken from a Poisson distribution is obtained ...
23 votes
2 answers
21k views

Fitting distributions, goodness of fit, p-value. Is it possible to do this with Scipy (Python)?

INTRODUCTION: I'm a bioinformatician. In my analysis which I perform on all human genes (about 20 000) I search for a particular short sequence motif to check how many times this motif occurs in each ...
s_sherly's user avatar
  • 2,347
23 votes
4 answers
24k views

Probability Random Number Generator

Let's say I'm writing a simple luck game - each player presses Enter and the game assigns him a random number between 1-6. Just like a cube. At the end of the game, the player with the highest number ...
Alon Gubkin's user avatar
  • 56.9k
23 votes
1 answer
3k views

sampling multinomial from small log probability vectors in numpy/scipy

Is there a function in numpy/scipy that lets you sample multinomial from a vector of small log probabilities, without losing precision? example: # sample element randomly from these log probabilities ...
lgd's user avatar
  • 1,472
22 votes
10 answers
4k views

Representing continuous probability distributions

I have a problem involving a collection of continuous probability distribution functions, most of which are determined empirically (e.g. departure times, transit times). What I need is some way of ...
22 votes
2 answers
7k views

PyMC3 Bayesian Linear Regression prediction with sklearn.datasets

I've been trying to implement Bayesian Linear Regression models using PyMC3 with REAL DATA (i.e. not from linear function + gaussian noise) from the datasets in sklearn.datasets. I chose the ...
O.rka's user avatar
  • 30.6k
22 votes
1 answer
15k views

How to simulate bimodal distribution?

I have the following code to generate a bimodal distribution, but when I graph the histogram I don't see the 2 modes. I am wondering if there's something wrong with my code. mu1 <- log(1) mu2 ...
Amateur's user avatar
  • 1,257
21 votes
3 answers
30k views

How to compute the probability of a value given a list of samples from a distribution in Python?

Not sure if this belongs in statistics, but I am trying to use Python to achieve this. I essentially just have a list of integers: data = [300,244,543,1011,300,125,300 ... ] And I would like to know ...
qazplok11's user avatar
  • 447
21 votes
12 answers
21k views

Probability distribution in Python

I have a bunch of keys that each have an unlikeliness variable. I want to randomly choose one of these keys, yet I want it to be more unlikely for unlikely (key, values) to be chosen than a less ...
21 votes
5 answers
13k views

permutation & combinations interview

This is a good one because it's so counter-intuitive: Imagine an urn filled with balls, two-thirds of which are of one color and one-third of which are of another. One individual has drawn 5 balls ...
ʞɔıu's user avatar
  • 47.8k
21 votes
9 answers
28k views

Optimal Algorithm for Winning Hangman

In the game Hangman, is it the case that a greedy letter-frequency algorithm is equivalent to a best-chance-of-winning algorithm? Is there ever a case where it's worth sacrificing preservation of ...
Ronald's user avatar
  • 325
20 votes
7 answers
12k views

Unbiased random number generator using a biased one

You have a biased random number generator that produces a 1 with a probability p and 0 with a probability (1-p). You do not know the value of p. Using this make an unbiased random number generator ...
Rohit Banga's user avatar
  • 18.6k
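
One well-known approach to this puzzle is the von Neumann trick: draw two bits, keep the first if they differ, and retry otherwise. The sketch below assumes the biased source produces independent draws with a fixed p strictly between 0 and 1; `biased_bit` and `unbiased_bit` are illustrative names, not taken from the question.

```python
import random

def biased_bit(p, rng=random):
    """Stand-in for the biased source: 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

def unbiased_bit(draw):
    """Von Neumann trick. draw() must return independent 0/1 values
    with the same (unknown) bias on every call."""
    while True:
        a, b = draw(), draw()
        # P(a=1, b=0) = p(1-p) = P(a=0, b=1), so the two kept outcomes
        # are equally likely; equal pairs are discarded.
        if a != b:
            return a

rng = random.Random(0)
bits = [unbiased_bit(lambda: biased_bit(0.8, rng)) for _ in range(10000)]
print(sum(bits) / len(bits))  # should be close to 0.5
```

The expected number of biased draws per output bit is 1 / (p(1-p)), so the method slows down as p approaches 0 or 1.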
20 votes
7 answers
16k views

What is the probability of guessing (matching) a Guid?

Just curious but what is the probability of matching a Guid? Say a Guid from SQL server: 5AC7E650-CFC3-4534-803C-E7E5BBE29B3D is it a factorial?: (36*32)! = (1152)! discuss =D
RhinoDevX64's user avatar
19 votes
6 answers
12k views

Generating N numbers that sum to 1

Given an array of size n I want to generate random probabilities for each index such that Sigma(a[0]..a[n-1])=1 One possible result might be: 0 1 2 3 4 0.15 0.2 0.18 0.22 0.25 ...
Yuval Adam's user avatar
  • 163k
19 votes
2 answers
25k views

How does the predict_proba() function in LightGBM work internally?

This is in reference to understanding, internally, how the probabilities for a class are predicted using LightGBM. Other packages, like sklearn, provide thorough detail for their classifiers. For ...
artemis's user avatar
  • 7,057
19 votes
4 answers
14k views

Calculate the number of ways to roll a certain number

I'm a high school Computer Science student, and today I was given a problem to: Program Description: There is a belief among dice players that in throwing three dice a ten is easier to get than ...
scrblnrd3's user avatar
  • 7,328
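
The excerpt is cut off, but the counting itself can be sketched by brute force: enumerate every ordered outcome of the dice and tally the totals. The function name `ways_to_roll` is an assumption for illustration.

```python
from collections import Counter
from itertools import product

def ways_to_roll(num_dice=3, sides=6):
    """Count ordered outcomes for each total by enumerating every roll."""
    faces = range(1, sides + 1)
    return Counter(sum(roll) for roll in product(faces, repeat=num_dice))

ways = ways_to_roll()
total_outcomes = sum(ways.values())  # 6**3 = 216 for three dice
for total in sorted(ways):
    print(total, ways[total], ways[total] / total_outcomes)
```

With three six-sided dice, a total of 10 comes up in 27 of the 216 ordered outcomes. Enumeration is fine at this size; for many dice a dynamic-programming count would scale better.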
18 votes
6 answers
19k views

What is the probability of collision with a 6 digit random alphanumeric code?

I'm using the following perl code to generate random alphanumeric strings (uppercase letters and numbers, only) to use as unique identifiers for records in my MySQL database. The database is likely to ...
Nick's user avatar
  • 1,311
18 votes
10 answers
30k views

How can I efficiently calculate the binomial cumulative distribution function?

Let's say that I know the probability of a "success" is P. I run the test N times, and I see S successes. The test is akin to tossing an unevenly weighted coin (perhaps heads is a success, tails is ...
sanity's user avatar
  • 35.5k
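
As a baseline, the binomial CDF can be written as a direct sum of the probability mass function. This is a plain sketch rather than the efficient method the question asks about: it assumes Python 3.8+ for `math.comb`, and the name `binomial_cdf` is made up here.

```python
from math import comb

def binomial_cdf(s, n, p):
    """P(X <= s) for X ~ Binomial(n, p), by summing the pmf directly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s + 1))

print(binomial_cdf(1, 2, 0.5))   # 0.75: zero or one head in two fair tosses
```

For very large N the individual terms can underflow or become slow to compute; a library routine based on the regularized incomplete beta function is the usual alternative in that case.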
18 votes
6 answers
8k views

Nth Combination

Is there a direct way of getting the Nth combination of an ordered set of all combinations of nCr? Example: I have four elements: [6, 4, 2, 1]. All the possible combinations by taking three at a time ...
Sami's user avatar
  • 3,263
18 votes
2 answers
2k views

Effective Java Item 47: Know and use your libraries - Flawed random integer method example

In the example Josh gives of the flawed random method that generates a positive random number with a given upper bound n, I don't understand two of the flaws he states. The method from the book ...
Derek Mok's user avatar
  • 313
18 votes
3 answers
4k views

Probability of Outcomes Algorithm

I have a probability problem, which I need to simulate in a reasonable amount of time. In simplified form, I have 30 unfair coins each with a different known probability. I then want to ask things ...
Kenny's user avatar
  • 183
18 votes
2 answers
10k views

Sigmoid output - can it be interpreted as probability?

Sigmoid function outputs a number between 0 and 1. Is this a probability or is it merely a 'yes or no' depending on whether it's above or below 0.5? Minimal example: Cats vs dogs binary ...
Voy's user avatar
  • 5,744
18 votes
3 answers
776 views

How was there no collision among 50,000 random 7-digit hex strings? (The Birthday Problem)

I've encountered some code that generates a number of UUIDs via UUID.randomUUID(), takes the last 7 digits of each (recent versions of UUID are uniformly distributed in terms of entropy), and uses ...
Andrew Cheong's user avatar
18 votes
3 answers
3k views

Combining individual probabilities in Naive Bayesian spam filtering

I'm currently trying to generate a spam filter by analyzing a corpus I've amassed. I'm using the wikipedia entry http://en.wikipedia.org/wiki/Bayesian_spam_filtering to develop my classification ...
Jeremy Giberson's user avatar
17 votes
8 answers
12k views

Creating your own Tinyurl style uid

I'm writing a small article on humanly readable alternatives to Guids/UIDs, for example those used on TinyURL for the url hashes (which are often printed in magazines, so need to be short). The ...
Chris S's user avatar
  • 65.1k
17 votes
5 answers
11k views

Generate random numbers distributed by Zipf

The Zipf probability distribution is often used to model file size distribution or item access distributions on items in P2P systems. e.g. "Web Caching and Zip like Distribution Evidence and ...
dmeister's user avatar
  • 35.1k
17 votes
6 answers
22k views

How to shorten UUID V4 without making it non-unique/guessable

I have to generate a unique URL part which will be "unguessable" and "resistant" to brute force attack. It also has to be as short as possible :) and all generated values have to be of the same length. I was ...
user606521's user avatar
  • 14.8k
16 votes
4 answers
24k views

Probability of 64bit Hash Code Collisions

The book Numerical Recipes offers a method to calculate 64bit hash codes in order to reduce the number of collisions. The algorithm is shown at http://www.javamex.com/tutorials/collections/...
isapir's user avatar
  • 22.3k
16 votes
3 answers
7k views

Probability and Neural Networks

Is it a good practice to use sigmoid or tanh output layers in Neural networks directly to estimate probabilities? i.e the probability of given input to occur is the output of sigmoid function in the ...
Betamoo's user avatar
  • 15.4k
16 votes
3 answers
25k views

Percentage Based Probability

I have this code snippet: Random rand = new Random(); int chance = rand.Next(1, 101); if (chance <= 25) // probability of 25% { Console.WriteLine("You win"); } else { Console.WriteLine("...
BlueRay101's user avatar
  • 1,467
16 votes
4 answers
2k views

Seeking suggestions for data representation of a probability distribution

I'm looking for an elegant and efficient way to represent and store an arbitrary probability distribution constructed by explicit sampling. The distribution is expected to have the following ...
George Skoptsov's user avatar
15 votes
15 answers
13k views

Is this a good or bad 'simulation' for Monty Hall? How come? [closed]

Through trying to explain the Monty Hall problem to a friend during class yesterday, we ended up coding it in Python to prove that if you always swap, you will win 2/3 times. We came up with this: ...
Josh Hunt's user avatar
  • 14.4k
15 votes
2 answers
9k views

Chance of a duplicate hash when using first 8 characters of SHA1

If I have an index of URLs, and ID them by the first 8 characters of a SHA1 hash, what is the probability of two different URLs having identical IDs?
zino's user avatar
  • 1,362
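
This is an instance of the birthday problem. Eight hex characters carry 32 bits, so the sketch below treats each ID as a uniform draw from 2^32 values (an assumption that holds only insofar as SHA1 output behaves uniformly) and uses the standard approximation 1 - exp(-n(n-1)/(2N)). The function name is hypothetical.

```python
import math

def collision_probability(n_items, n_bits=32):
    """Approximate chance that at least two of n_items uniformly random
    n_bits-wide IDs coincide (birthday approximation)."""
    space = 2 ** n_bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))

for n in (1_000, 10_000, 100_000):
    print(n, collision_probability(n))
```

Under these assumptions the chance of at least one duplicate reaches about 50% at roughly 77,000 URLs.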
15 votes
3 answers
8k views

Fast weighted random selection from very large set of values

I'm currently working on a problem that requires the random selection of an element from a set. Each of the elements has a weight(selection probability) associated with it. My problem is that for ...
15 votes
5 answers
4k views

Estimating/forecasting download completion time

We've all poked fun at the 'X minutes remaining' dialog which seems to be too simplistic, but how can we improve it? Effectively, the input is the set of download speeds up to the current time, and ...
Phil H's user avatar
  • 20k
15 votes
4 answers
12k views

Divide each cell of large matrix by sum of its row

I have a site by species matrix. The dimensions are 375 x 360. Each value represents the frequency of a species in samples of that site. I am trying to convert this matrix from frequencies to ...
Zane.Lazare's user avatar
14 votes
3 answers
11k views

how to show that NDCG score is significant

Suppose the NDCG score for my retrieval system is .8. How do I interpret this score? How do I tell the reader that this score is significant?
Programmer's user avatar
  • 6,635
14 votes
2 answers
19k views

Implementation of sequential monte carlo method (particle filters)

I'm interested in the simple algorithm for particle filters given here: http://www.aiqus.com/upfiles/PFAlgo.png It seems very simple but I have no idea on how to do it practically. Any idea on how to ...
shn's user avatar
  • 5,206
14 votes
5 answers
3k views

Choose random array element satisfying certain property

Suppose I have a list, called elements, each of which does or does not satisfy some boolean property p. I want to choose one of the elements that satisfies p at random with uniform distribution. I ...
Paul Reiners's user avatar
  • 7,706
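
One possible single-pass approach is reservoir sampling with a reservoir of size one: the k-th matching element replaces the current choice with probability 1/k. This is a sketch, not necessarily what the asker had in mind; `random_matching` is an illustrative name.

```python
import random

def random_matching(elements, p, rng=random):
    """Return a uniformly chosen element satisfying p, or None if none does.
    Makes one pass and never builds the filtered list."""
    chosen = None
    seen = 0  # number of matching elements encountered so far
    for e in elements:
        if p(e):
            seen += 1
            # Keep the new element with probability 1/seen; by induction
            # every match ends up chosen with probability 1/(total matches).
            if rng.randrange(seen) == 0:
                chosen = e
    return chosen

print(random_matching(range(20), lambda v: v % 3 == 0))
```

If memory is not a concern, filtering first and calling `random.choice` on the result is simpler and equally uniform.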
14 votes
3 answers
15k views

How to generate random numbers with predefined probability distribution?

I would like to implement a function in python (using numpy) that takes a mathematical function (for ex. p(x) = e^(-x) like below) as input and generates random numbers, that are distributed according ...
ZelelB's user avatar
  • 1,884
14 votes
6 answers
3k views

Implementation of a simple algorithm (to calculate probability)

I've been asked (as part of homework) to design a Java program that does the following: Basically there are 3 cards: Black coloured on both sides Red coloured on both sides Black on one side, red on ...
James's user avatar
  • 143
14 votes
2 answers
3k views

Plot probability heatmap/hexbin with different sized bins

This is related to another question: Plot weighted frequency matrix. I have this graphic (produced by the code below in R): #Set the number of bets and number of trials and % lines numbet <- 36 ...
13 votes
6 answers
17k views

How to choose keys from a python dictionary based on weighted probability? [duplicate]

I have a Python dictionary where keys represent some item and values represent some (normalized) weighting for said item. For example: d = {'a': 0.0625, 'c': 0.625, 'b': 0.3125} # Note that sum([v ...
Joseph's user avatar
  • 13k
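
With Python 3.6 or later, the standard library's `random.choices` takes a `weights` argument, which covers this case directly. The sketch below reuses the dictionary from the excerpt; the wrapper `weighted_key` is a made-up helper.

```python
import random

def weighted_key(weights, rng=random):
    """Pick one key with probability proportional to its value."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]

d = {'a': 0.0625, 'c': 0.625, 'b': 0.3125}
print(weighted_key(d))  # 'c' is the most likely result
```

The weights do not have to be normalized; `random.choices` only uses their relative sizes.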
13 votes
4 answers
18k views

Chance of winning PHP percent calculation

I have a "battle" system, the attacker has a battle strength of e.g. 100, the defender has a strength of e.g. 75. But I'm stuck now, I can't figure out how to find the winner. I know the attacker has ...
Sims's user avatar
  • 177
13 votes
3 answers
22k views

c# probability and random numbers

I want to trigger an event with a probability of 25% based on a random number generated between 1 and 100 using: int rand = random.Next(1,100); Will the following achieve this? if (rand<=25) { ...
CdrTomalak's user avatar
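
A detail worth checking here: in .NET, Random.Next(minValue, maxValue) treats maxValue as exclusive, so Next(1,100) yields 1 through 99. The Python sketch below mirrors that half-open range to compute the exact hit probability; `hit_probability` is a hypothetical helper, and a uniform generator is assumed.

```python
from fractions import Fraction

def hit_probability(low, high_exclusive, threshold):
    """Exact P(value <= threshold) for a uniform integer in [low, high_exclusive)."""
    values = range(low, high_exclusive)  # same half-open convention as Random.Next
    hits = sum(1 for v in values if v <= threshold)
    return Fraction(hits, len(values))

print(hit_probability(1, 100, 25))  # 25/99, slightly above 25%
print(hit_probability(1, 101, 25))  # 1/4, exactly 25%
```

So a range of 1 to 101 (exclusive) is what gives an exact 25% with the `<= 25` test.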
13 votes
6 answers
6k views

Probability based on quicksort partition

I have come across this question: Let 0<α<.5 be some constant (independent of the input array length n). Recall the Partition subroutine employed by the QuickSort algorithm, as explained in ...
POOJA GUPTA's user avatar
  • 2,315
13 votes
2 answers
5k views

Get true or false with a given probability

I'm trying to write a function in C++ that will return true or false based on a probability given. So, for example if the probability given was 0.634 then, 63.4% of the time the function would return ...
user2317084's user avatar
