All Questions
Tagged with probability machine-learning
166
questions
43
votes
2
answers
82k
views
scikit-learn return value of LogisticRegression.predict_proba
What exactly does the LogisticRegression.predict_proba function return?
In my example I get a result like this:
array([
[4.65761066e-03, 9.95342389e-01],
[9.75851270e-01, 2.41487300e-02],
[...
36
votes
3
answers
52k
views
How to get a classifier's confidence score for a prediction in sklearn?
I would like to get a confidence score for each of the predictions it makes, showing how sure the classifier is that its prediction is correct.
I want something like this:
How sure is ...
30
votes
5
answers
31k
views
Create Bayesian Network and learn parameters with Python3.x [closed]
I'm searching for the most appropriate tool for python3.x on Windows to create a Bayesian Network, learn its parameters from data and perform the inference.
The network structure I want to define ...
19
votes
2
answers
25k
views
How does the predict_proba() function in LightGBM work internally?
This is in reference to understanding, internally, how the probabilities for a class are predicted using LightGBM.
Other packages, like sklearn, provide thorough detail for their classifiers. For ...
18
votes
2
answers
10k
views
Sigmoid output - can it be interpreted as probability?
The sigmoid function outputs a number between 0 and 1. Is this a probability or is it merely a 'yes or no' depending on whether it's above or below 0.5?
Minimal example:
Cats vs dogs binary ...
16
votes
3
answers
7k
views
Probability and Neural Networks
Is it good practice to use sigmoid or tanh output layers in neural networks directly to estimate probabilities?
i.e. the probability of a given input occurring is the output of the sigmoid function in the ...
13
votes
3
answers
18k
views
Multiple Output Neural Network
I have built my first neural network in Python, and I've been playing around with a few datasets; it's going well so far!
I have a quick question regarding modelling events with multiple outcomes:
...
11
votes
1
answer
12k
views
Probability prediction method of KNeighborsClassifier returns only 0 and 1
Can anyone tell me what's the problem with my code?
Why can I predict probabilities for the iris dataset using LinearRegression, but KNeighborsClassifier gives me 0 or 1 when it should give me a result ...
8
votes
3
answers
301
views
Way to infer the size of the userbase of a site from sampling taken usernames
Suppose you wanted to estimate the size of a userbase of a site which does not publicize this information.
People are more likely to have acquired different usernames with different probabilities. ...
6
votes
1
answer
2k
views
Understanding sklearn CalibratedClassifierCV
Hi all, I am having trouble understanding how to use the output of sklearn.calibration.CalibratedClassifierCV.
I have calibrated my binary classifier using this method, and results are greatly improved....
6
votes
3
answers
2k
views
Conversion of IsolationForest decision score to probability algorithm
I am looking to create a generic function to convert the output decision_scores of sklearn's IsolationForest into true probabilities in [0.0, 1.0].
I am aware of, and have read, the original paper and I ...
5
votes
2
answers
8k
views
Probability basics for machine learning [closed]
I have recently started studying Machine Learning and found that I need to refresh probability basics such as conditional probability, Bayes' theorem, etc.
I am looking for online resources where I can ...
5
votes
2
answers
1k
views
Fastest approximate counting algorithm
What's the fastest way to get an approximate count of the number of rows in an input file or stdout data stream? FYI, this is a probabilistic algorithm; I can't find many examples online.
The data could ...
5
votes
1
answer
11k
views
predict_proba() method of Keras model does not exist
I am trying to generate class scores by calling predict_proba() of a Keras model, but it seems that this function does not exist! Is it deprecated? I see some examples of it on Google. I am using Keras ...
5
votes
1
answer
1k
views
Having trouble understanding sklearn's SVM's predict_proba function
I am having trouble understanding a function from sklearn and would like some clarification. At first I thought that sklearn's SVM's predict_proba function gave out the level of confidence of the ...
4
votes
2
answers
14k
views
sklearn - Predict each class's probability
So far I have consulted another post and the sklearn documentation.
In general, I want to produce the following example:
X = np.matrix([[1,2],[2,3],[3,4],[4,5]])
y = np.array(['A', 'B', 'B', 'C', 'D'])
...
4
votes
1
answer
17k
views
The best way to calculate classification accuracy?
I know one formula to calculate classification accuracy is X = t / n * 100 (where t is the number of correct classifications and n is the total number of samples).
But, let's say we have a total of 100 ...
4
votes
2
answers
5k
views
How to compute the probability of a multi-class prediction using libsvm?
I'm using libsvm and the documentation leads me to believe that there's a way to output the believed probability of an output classification's accuracy. Is this so? And if so, can anyone provide a ...
4
votes
5
answers
2k
views
How do I efficiently estimate a probability based on a small amount of evidence?
I've been trying to find an answer to this for months (to be used in a machine learning application), it doesn't seem like it should be a terribly hard problem, but I'm a software engineer, and math ...
4
votes
1
answer
1k
views
Why do we choose the Beta distribution as a prior on a hypothesis?
I watched the machine learning class videos of course 10-701 (2011) by Tom Mitchell at CMU. He was teaching Maximum Likelihood Estimation when he used a Beta distribution as the prior on theta. I wonder ...
4
votes
4
answers
2k
views
Neural Network Input Order
This may seem like a silly question.
I am running a neural network through some tennis data. The objective of the network is to determine the probability of each player winning the match. There are ...
4
votes
2
answers
185
views
Analysis of a sorting algorithm with a probably wrong comparator?
This is an interesting question from an interview; I failed it.
An array has n different elements [A1, A2, ..., An] (in random order).
We have a comparator C, but it has a probability p of returning correct ...
3
votes
1
answer
1k
views
Naive classifier in MATLAB
When testing the naive classifier in MATLAB I get different results even though I trained and tested on the same sample data. I was wondering if my code is correct and if someone could help explain ...
3
votes
2
answers
4k
views
How to check if sample has same probability distribution as population in Python?
I have a DataFrame with millions of rows. To create a model, I have taken a random sample from this dataset using dataset.sample(int(len(dataset)/5)), which returns a random sample of items from an ...
3
votes
1
answer
1k
views
GMM - loglikelihood isn't monotonic
Yesterday I implemented a GMM (Gaussian Mixture Model) using the expectation-maximization algorithm.
As you remember, it models some unknown distribution as a mixture of Gaussians which we need to learn ...
3
votes
3
answers
2k
views
What are the largest and smallest numbers between 0 and 1 that C++ can represent internally without rounding?
I have a C++ function which computes probabilities based on a simple model. It seems that C++ tends to round very small probabilities to 0 and very large probabilities to 1. This results in issues in ...
3
votes
1
answer
3k
views
Trying to understand expected value in Linear Regression
I'm having trouble understanding a lecture slide in my school's machine learning course.
Why does the expected value of Y = f(X)? What does it mean?
My understanding is that X, Y are vectors and f(X) ...
3
votes
1
answer
3k
views
Calculating confidence while doing classification
I am using a Naive Bayes algorithm to predict movie ratings as positive or negative. I have been able to rate movies with 81% accuracy. I am, however, trying to assign a 'confidence level' for each of ...
3
votes
1
answer
1k
views
After reducing the dimensionality of a dataset, I am getting negative feature values
I used a Dimensionality Reduction method (discussion here: Random projection algorithm pseudo code) on a large dataset.
After reducing the dimension from 1000 to 50, I get my new dataset where each ...
3
votes
1
answer
3k
views
Log likelihood to implement Naive Bayes for Text Classification
I am implementing the Naive Bayes algorithm for text classification. I have ~1000 documents for training and 400 documents for testing. I think I've implemented the training part correctly, but I am confused ...
3
votes
1
answer
6k
views
Negative BIC values for GaussianMixture in scikit-learn (sklearn)
In scikit-learn, the GaussianMixture object has the method bic(X), which implements the Bayesian Information Criterion for choosing the number of components that best fits the data.
This is an example of ...
3
votes
1
answer
3k
views
How does sklearn's MLP predict_proba function work internally?
I am trying to understand how sklearn's MLP Classifier retrieves its results for its predict_proba function.
The website simply lists:
Probability estimates
While many others, such as logistic ...
3
votes
1
answer
417
views
Should Naive Bayes multiply all the words in the vocabulary?
I am using Naive Bayes in text classification.
Assume that my vocabulary is ["apple","boy","cup"] and the class label is "spam" or "ham". Each document will be converted to a 3-dimensional 0-1 vector. ...
3
votes
2
answers
550
views
Reinforcement learning and POMDP
I am trying to use a multi-layer NN to implement the probability function in a Partially Observable Markov Process.
I thought inputs to the NN would be: current state, selected action, result state;
The ...
3
votes
0
answers
428
views
Calculate the likelihood of a function given a Gaussian Process model
I am fitting a Gaussian process regression using scikit-learn. (Mine is actually a simple 1-dimensional case.)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn....
3
votes
0
answers
340
views
Is there any metric to evaluate output probabilities' precision in classification models?
I am currently developing a model in Python and Keras for a binary classification task (success/failure). My aim is to generate success probabilities for each observation so that I can use them later ...
3
votes
0
answers
149
views
Maximize AUC of a classifier having a set of probabilities that the object belongs to class
Consider a binary classification task with two target classes -- {men, women}. For each person you've got a set of their actions and for each action you've got a set of real-valued features. The goal ...
2
votes
1
answer
4k
views
Get risk predictions in WEKA using own Java code
I already checked the "Making predictions" documentation of WEKA and it contains explicit instructions for command line and GUI predictions.
I want to know how to get a prediction value like the one ...
2
votes
3
answers
430
views
How to test the quality of a probabilities estimator?
I created a heuristic (an ANN, but that's not important) to estimate the probabilities of an event (the results of sports games, but that's not important either). Given some inputs, this heuristic ...
2
votes
1
answer
4k
views
Unsupervised Naive Bayes - how does it work?
So as I understand it, to implement an unsupervised Naive Bayes, we assign random probability to each class for each instance, then run it through the normal Naive Bayes algorithm. I understand that, ...
2
votes
1
answer
34
views
Do regression algorithms give you a probability associated with the predicted value?
I am looking for an algorithm to predict an amount of money (a real value), therefore I am thinking of using a regression algorithm. However, I also need to know the probability associated with that ...
2
votes
2
answers
1k
views
Determine the Initial Probabilities of an HMM
So I have managed to estimate most of the parameters in a particular Hidden Markov Model (HMM) given the training dataset. These parameters are: the emission probabilities of the hidden states and the ...
2
votes
2
answers
930
views
Effect of the number of states in a Hidden Markov Model based classifier
What is the relation between the number of clusters/codebook and the number of states in a hidden Markov model?
How does the number of states affect the performance of a hidden Markov model based classifier?
2
votes
1
answer
113
views
Information Modeling
The sensor module in my project consists of a rotating camera that collects noisy information about moving objects in the surrounding environment.
The information consists of distance, angle and ...
2
votes
1
answer
1k
views
How to identify the modes in a (multimodal) continuous variable
What is the best method for finding all the modes in a continuous variable? I'm trying to develop a Java or Python algorithm for doing this.
I was thinking about using kernel density estimation, for ...
2
votes
1
answer
852
views
Predicting probabilities
I have time series data consisting of a vector
v=(x_1,…, x_n)
of binary categorical variables and the probabilities for four outcomes
p_1, p_2, p_3, p_4.
Given a new vector of categorical ...
2
votes
1
answer
2k
views
Calculation of probabilities in Naive Bayes in C#
I'm working on a Naive Bayes solution for C# where there are two possible outcomes. I have found a small sample code but was wondering if anyone would be able to explain the last line.
The analyzer ...
2
votes
2
answers
2k
views
Learning a binary classifier which outputs probability
When, in general, the objective is to build a binary classifier which outputs the probability that an instance is positive, which machine learning method would be the most appropriate and in which situation? ...
2
votes
1
answer
1k
views
Get Confidence probability Scores for each Predicted Result in Catboost Classifier
I have built a machine learning model using Catboost classifier to predict the categoryname of my result as per below screenshot1. However, if I get an unknown as input or any input with which the ...
2
votes
0
answers
204
views
HMM - Does the Forward-Backward algorithm give the same result as Viterbi if all transitions are possible?
I am attending a Bioinformatics class and we are learning about HMMs to make inference about DNA sequences.
Well, we recently learned about the forward-backward algorithm that gives us the ...