41

I am having trouble interpreting the results of a logistic regression. My outcome variable is Decision and is binary (0 = did not take the product, 1 = took the product).
My predictor variable is Thoughts; it is continuous, can be positive or negative, and is rounded to two decimal places.
I want to know how the probability of taking the product changes as Thoughts changes.

The logistic regression model is:

results <- glm(Decision ~ Thoughts, family = binomial, data = data)

According to this model, Thoughts has a significant impact on probability of Decision (b = .72, p = .02). To determine the odds ratio of Decision as a function of Thoughts:

exp(coef(results))

Odds ratio = 2.07.

Questions:

  1. How do I interpret the odds ratio?

    1. Does an odds ratio of 2.07 imply that a .01 increase (or decrease) in Thoughts affects the odds of taking (or not taking) the product by 0.07 OR
    2. Does it imply that as Thoughts increases (decreases) by .01, the odds of taking (not taking) the product increase (decrease) by approximately 2 units?
  2. How do I convert odds ratio of Thoughts to an estimated probability of Decision?
    Or can I only estimate the probability of Decision at a certain Thoughts score (i.e. calculate the estimated probability of taking the product when Thoughts == 1)?

4 Answers

88

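The coefficients returned by a logistic regression in R are on the logit (log-odds) scale: the intercept is the log odds of the outcome when the predictor is 0, and a slope is the change in log odds for a one-unit increase in the predictor. To convert a slope to an odds ratio, you can exponentiate it, as you've done above. To convert a logit to a probability, you can use the function exp(logit)/(1+exp(logit)), applied to the full linear predictor (intercept plus slope times x) rather than to a slope on its own. However, there are some things to note about this procedure.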
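
As a quick sanity check of these conversions, here is a minimal sketch. The logit values are made up purely for illustration; plogis() and qlogis() are base R's inverse-logit and logit functions.

```r
# Hypothetical log-odds (logit) values, chosen only for illustration
logit <- c(-2, 0, 0.72, 2)

odds <- exp(logit)          # logit -> odds
prob <- odds / (1 + odds)   # odds  -> probability, same as exp(logit)/(1+exp(logit))

# plogis() is the built-in inverse logit and should give the same probabilities
all.equal(prob, plogis(logit))

# qlogis() goes the other way: probability -> logit
all.equal(qlogis(prob), logit)

round(data.frame(logit, odds, prob), 3)
```

A logit of 0 corresponds to odds of 1 and a probability of 0.5, which is a handy reference point when reading coefficients.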

First, I'll use some reproducible data to illustrate:

library('MASS')
data("menarche")
m<-glm(cbind(Menarche, Total-Menarche) ~ Age, family=binomial, data=menarche)
summary(m)

This returns:

Call:
glm(formula = cbind(Menarche, Total - Menarche) ~ Age, family = binomial, 
    data = menarche)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.0363  -0.9953  -0.4900   0.7780   1.3675  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -21.22639    0.77068  -27.54   <2e-16 ***
Age           1.63197    0.05895   27.68   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3693.884  on 24  degrees of freedom
Residual deviance:   26.703  on 23  degrees of freedom
AIC: 114.76

Number of Fisher Scoring iterations: 4

The coefficients displayed are for logits, just as in your example. If we plot these data and this model, we see the sigmoidal function that is characteristic of a logistic model fit to binomial data:

#predict gives the predicted value in terms of logits
plot.dat <- data.frame(prob = menarche$Menarche/menarche$Total,
                       age = menarche$Age,
                       fit = predict(m, menarche))
#convert those logit values to probabilities
plot.dat$fit_prob <- exp(plot.dat$fit)/(1+exp(plot.dat$fit))

library(ggplot2)
ggplot(plot.dat, aes(x=age, y=prob)) + 
  geom_point() +
  geom_line(aes(x=age, y=fit_prob))

[Figure: observed proportion of menarche by age (points) with the fitted logistic curve (line)]

Note that the change in probabilities is not constant - the curve rises slowly at first, then more quickly in the middle, then levels out at the end. The difference in probabilities between 10 and 12 is far less than the difference in probabilities between 12 and 14. This means that it's impossible to summarise the relationship of age and probabilities with one number on the probability scale. On the log-odds scale the relationship is a straight line, which is why the coefficient is reported there.
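
To put numbers on that, here is a small sketch that refits the same menarche model and compares the fitted probabilities at ages 10, 12 and 14. The three ages are just the ones mentioned in the text, and the model is refit so the snippet runs on its own.

```r
library(MASS)    # provides the menarche data used above
data("menarche")
m <- glm(cbind(Menarche, Total - Menarche) ~ Age,
         family = binomial, data = menarche)

# Fitted probabilities at three ages (type = "response" returns probabilities)
ages <- data.frame(Age = c(10, 12, 14))
p <- predict(m, newdata = ages, type = "response")
round(p, 3)

# Change in probability for each two-year step: 10 -> 12, then 12 -> 14
diff(p)

# The same two steps on the log-odds scale are equal (2 * slope each)
diff(predict(m, newdata = ages))
```

With the coefficients printed above, the fitted probabilities come out at roughly 0.01, 0.16 and 0.83, so the second two-year step changes the probability several times more than the first, even though both steps add the same amount on the log-odds scale.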

To answer your specific questions:

How do you interpret odds ratios?

The exponentiated intercept is the odds of a "success" (in your data, the odds of taking the product) when x = 0 (i.e. Thoughts equal to zero). Strictly speaking it is an odds, not an odds ratio. The exponentiated coefficient is the odds ratio: the factor by which those odds are multiplied when x increases by one whole unit (e.g. Thoughts going from 0 to 1). Using the menarche data:

exp(coef(m))

 (Intercept)          Age 
6.046358e-10 5.113931e+00 

We could interpret this as the odds of menarche occurring at age = 0 being about 6e-10, or basically impossible. Exponentiating the age coefficient tells us the expected multiplicative change in the odds of menarche for each one-unit (one-year) increase in age. In this case, it's just over a quintupling. An odds ratio of 1 indicates no change, whereas an odds ratio of 2 indicates a doubling, etc.

Your odds ratio of 2.07 implies that a 1 unit increase in 'Thoughts' increases the odds of taking the product by a factor of 2.07.
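
Since you asked about a .01 change specifically: the odds ratio for any other increment is the exponential of the coefficient times that increment. A rough sketch using the rounded slope from your question (b = .72; the unrounded coefficient is evidently slightly larger, which is why exp(.72) is not exactly 2.07):

```r
b <- 0.72            # slope for Thoughts as reported (rounded) in the question

exp(b)               # odds ratio for a 1-unit increase in Thoughts
exp(b * 0.01)        # odds ratio for a 0.01 increase in Thoughts
exp(b * -1)          # odds ratio for a 1-unit decrease (reciprocal of exp(b))

# One hundred steps of 0.01 multiply up to the 1-unit odds ratio
all.equal(exp(b * 0.01)^100, exp(b))
```

So a .01 increase in Thoughts multiplies the odds by about 1.007, i.e. a bit under a 1% increase in the odds, not by 2.07 and not by 0.07.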

How do you convert odds ratios of thoughts to an estimated probability of decision?

You need to do this for selected values of thoughts, because, as you can see in the plot above, the change is not constant across the range of x values. If you want the probability of some value for thoughts, get the answer as follows:

exp(intercept + coef*THOUGHT_Value)/(1 + exp(intercept + coef*THOUGHT_Value))
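
Here is that formula applied to the menarche model, next to what predict() returns, as a sketch. The ages are arbitrary illustration values, and the model is refit so the snippet is self-contained.

```r
library(MASS)
data("menarche")
m <- glm(cbind(Menarche, Total - Menarche) ~ Age,
         family = binomial, data = menarche)

x <- c(11, 13, 15)   # predictor values chosen only for illustration

# Manual calculation from the coefficients
b0 <- coef(m)[["(Intercept)"]]
b1 <- coef(m)[["Age"]]
manual <- exp(b0 + b1 * x) / (1 + exp(b0 + b1 * x))

# Same thing via predict(); type = "response" returns probabilities
auto <- predict(m, newdata = data.frame(Age = x), type = "response")

all.equal(manual, unname(auto))
round(manual, 3)
```

For your model the equivalent call would presumably be something like predict(results, newdata = data.frame(Thoughts = 1), type = "response"), assuming the fitted object is called results as in your exp(coef(results)) line.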
  • 1
    Thank you so much! Your additional example really helped put your explanation into context.
    – Sudy Majd
    Jan 3, 2017 at 21:51
  • 1
    @SudyMajd Welcome to SO! If you accept triddle´s answer, please click the green mark beside the answer. Doing so, you honour the person who answered and mark the question as solved.
    – pat-s
    Jan 8, 2017 at 21:44
  • 3
    @Emily If you have scaled predictors, then the interpretation is the same, except the 'one unit change' means 1 standard deviation. If you want values for scaled and unscaled predictors, it's probably easiest just to fit two separate models: one with them scaled, and one with them unscaled.
    – triddle
    Dec 12, 2018 at 20:12
  • 1
    The exponentials of the intercept and age coefficients are not odds ratios. Only exponentials of coefficients related to terms of factor variables can be considered odds ratios. This answer is misleading.
    Jul 24, 2020 at 15:38
  • 1
    Is the interpretation the same if you've taken the exponential of the coefficient of an interaction term? For example doing exp(coef(model)) that includes salinity*temperature variables, = 0.987 -> is basically no change in the odds of the predictor?
    – Nate
    Mar 14, 2022 at 20:38
29

Odds and probability are two different measures, both addressing the same aim of measuring the likelihood of an event occurring. They should not be compared to each other, only among themselves!
The odds at two predictor values (holding the others constant) are compared using the "odds ratio" (odds1 / odds2); the same comparison for probabilities is called the "risk ratio" (probability1 / probability2).

In general, odds are preferred over probabilities when it comes to ratios, since probability is limited to between 0 and 1 while odds range from 0 to +inf (and log odds from -inf to +inf).

To easily calculate odds ratios including their confidence intervals, see the oddsratio package:

library(oddsratio)
fit_glm <- glm(admit ~ gre + gpa + rank, data = data_glm, family = "binomial") 

# Calculate OR for specific increment step of continuous variable
or_glm(data = data_glm, model = fit_glm, 
       incr = list(gre = 380, gpa = 5))

  predictor oddsratio CI.low (2.5 %) CI.high (97.5 %)          increment
1       gre     2.364          1.054            5.396                380
2       gpa    55.712          2.229         1511.282                  5
3     rank2     0.509          0.272            0.945 Indicator variable
4     rank3     0.262          0.132            0.512 Indicator variable
5     rank4     0.212          0.091            0.471 Indicator variable

Here you can simply specify the increment of your continuous variables and see the resulting odds ratios. In this example, the odds of the response admit are about 55.7 times higher when the predictor gpa is increased by 5.
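
If you want to see what an increment does without any package, the odds ratio for an increment is just exp(coefficient * increment). The sketch below does this by hand in base R on the MASS menarche data (not the data_glm example above), with a made-up increment of 2. It uses Wald intervals from confint.default(), which is an assumption on my side for illustration and may differ slightly from the intervals a package reports.

```r
library(MASS)
data("menarche")
m <- glm(cbind(Menarche, Total - Menarche) ~ Age,
         family = binomial, data = menarche)

incr <- 2   # hypothetical increment: two units (years) of Age

# Odds ratio for an increase of `incr` units
or_incr <- exp(coef(m)[["Age"]] * incr)

# Wald 95% interval on the log-odds scale, scaled by incr, then exponentiated
ci_incr <- exp(confint.default(m)["Age", ] * incr)

or_incr
ci_incr

# Equivalent to raising the 1-unit odds ratio to the power of the increment
all.equal(or_incr, exp(coef(m)[["Age"]])^incr)
```

This also answers how to pick the increment: it is whatever change in the predictor is meaningful for your question, since the model itself only fixes the per-unit coefficient.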

If you want to predict probabilities with your model, simply use type = "response" when predicting from your model. This will automatically convert log odds to probabilities. You can then calculate risk ratios from the predicted probabilities. See ?predict.glm for more details.
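
A minimal sketch of that last step, again on the MASS menarche data rather than data_glm, with two arbitrary ages. It also shows that the odds ratio computed from the same two predictions matches exp(coef), while the risk ratio does not.

```r
library(MASS)
data("menarche")
m <- glm(cbind(Menarche, Total - Menarche) ~ Age,
         family = binomial, data = menarche)

# Predicted probabilities at two ages one unit apart (illustrative values)
p <- unname(predict(m, newdata = data.frame(Age = c(12, 13)),
                    type = "response"))

risk_ratio <- p[2] / p[1]          # ratio of probabilities

odds <- p / (1 - p)                # convert each probability to odds
odds_ratio <- odds[2] / odds[1]    # ratio of odds

c(risk_ratio = risk_ratio, odds_ratio = odds_ratio)

# The odds ratio for a 1-unit step equals the exponentiated coefficient
all.equal(odds_ratio, exp(coef(m)[["Age"]]))
```

Unlike the odds ratio, the risk ratio depends on which two predictor values you pick, so it has to be reported together with those values.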

  • 1
    I've found this package very useful, In the current documentation I think you would have to use or_glm rather than calc.oddsratio.glm
    – Silverfish
    Jan 23, 2018 at 23:58
  • Thanks, I updated the code. Glad you find the package useful!
    – pat-s
    Jan 24, 2018 at 0:01
  • Hadn't realised you were the author! Many thanks for providing it!
    – Silverfish
    Jan 24, 2018 at 19:46
  • You say that 'odds are defined from -inf to +inf', but aren't they constrained between 0 and inf? What would a negative odds ratio mean?
    Mar 26, 2020 at 3:18
  • Thanks! What does increment mean here? I tried running this code but I do not know how to select value of increment. Is there an automatized approach to do this?
    – Aby
    Jul 31, 2020 at 11:25
0

I found the epiDisplay package, which works fine! It might be useful for others, but note that confidence intervals and exact results can vary according to the package used, so it is good to read the package details and choose the one that works well for your data.

Here is some sample code:

library(epiDisplay)
data(Wells, package="carData")
glm1 <- glm(switch~arsenic+distance+education+association, 
            family=binomial, data=Wells)
logistic.display(glm1)

-2

The above formula to convert logits to probabilities, exp(logit)/(1+exp(logit)), may not have any meaning. This formula is normally used to convert odds to probabilities. However, in logistic regression an odds ratio is more like a ratio between two odds values (which happen to already be ratios). How would probability be defined using the above formula? Instead, it may be more correct to subtract 1 from the odds ratio to get a percent value, and then interpret that percentage as the increase or decrease in the odds of the outcome for a given change in the predictor.
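
As a small worked example of that percent interpretation (the first odds ratio is the one from the question, the second is the rank2 value from the oddsratio output in another answer; neither is recomputed here):

```r
# Percent change in the odds implied by an odds ratio
pct_change <- function(or) (or - 1) * 100

pct_change(2.07)    # odds ratio above 1: odds increase by about 107%
pct_change(0.509)   # odds ratio below 1: odds decrease by about 49%
pct_change(1)       # odds ratio of 1: no change
```

Note that this is a percent change in the odds, not in the probability.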

  • exp(x)/(1+exp(x)) is the inverse logit function. This formula is used to convert log odds to probabilities; used appropriately, it gives probability estimates for different values of covariates in a logistic regression.
    Jul 24, 2020 at 15:43
