26

I know that I need the mean and s.d. to find the interval. However, what if the question is:

For a survey of 1,000 randomly chosen workers, 520 of them are female. Create a 95% confidence interval for the proportion of workers who are female based on the survey.

How do I find the mean and s.d. for that?

1

4 Answers

35

You can also use prop.test from package stats, or binom.test:

x <- 520   # number of female workers in the survey
n <- 1000  # sample size
prop.test(x, n, conf.level=0.95, correct = FALSE)

        1-sample proportions test without continuity correction

data:  x out of n, null probability 0.5
X-squared = 1.6, df = 1, p-value = 0.2059
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.4890177 0.5508292
sample estimates:
   p 
0.52 

You may find the article "Two-sided confidence intervals for the single proportion: comparison of seven methods" interesting. Table 1 on page 861 gives confidence intervals for a single proportion calculated with seven methods (for selected combinations of n and r). prop.test reproduces rows 3 and 4 of that table, while binom.test returns row 5.
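For the binom.test route mentioned above, a minimal sketch could look like the following. The variable names (x, n, exact, exact_ci) are only illustrative; binom.test itself ships with the stats package and reports the exact (Clopper-Pearson) interval.

```r
# Exact (Clopper-Pearson) interval for 520 successes out of 1000 trials
x <- 520
n <- 1000

exact <- binom.test(x, n, conf.level = 0.95)

# conf.int carries a "conf.level" attribute; as.numeric() drops it
exact_ci <- as.numeric(exact$conf.int)
print(exact_ci)
```

The limits should come out slightly wider than the prop.test ones, at roughly 0.4885 and 0.5514.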

5
  • Nice answer, and it doesn't require any external packages. Feb 12, 2014 at 22:13
  • @thelatemail This is probably a dumb question, but how do you take that 95% CI and turn it into a SE and then an SD?
    – Alexander
    Jan 14, 2016 at 17:12
  • prop.test gives very strange results if you compare it with SAS. I would prefer to use binconf from the Hmisc package (see @Zbynek's answer), which uses a known method for the CI calculation.
    – crow16384
    Jun 21, 2021 at 7:44
  • The link is broken
    – Julien
    May 24, 2023 at 7:57
  • @Julien I think I found it. I've updated the link.
    – Yorgos
    May 25, 2023 at 14:08
23

In this case, you have a binomial distribution, so you will be calculating a binomial proportion confidence interval.

In R, you can use binconf() from package Hmisc:

> library(Hmisc)
> binconf(x=520, n=1000)
 PointEst     Lower     Upper
     0.52 0.4890177 0.5508292

Or you can calculate it yourself:

> p <- 520/1000
> p + c(-qnorm(0.975),qnorm(0.975))*sqrt((1/1000)*p*(1-p))
[1] 0.4890351 0.5509649
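The hand calculation is the normal-approximation (Wald) interval, while binconf() returns the Wilson score interval by default, which is why the two sets of limits differ in the fifth decimal place. As a rough sketch, both can be wrapped in one helper that also accepts other confidence levels. The function name prop_ci and its arguments are made up for illustration and are not part of any package.

```r
# Hypothetical helper: Wald and Wilson intervals for a single proportion
prop_ci <- function(x, n, level = 0.95) {
  p_hat <- x / n
  # two-sided critical value, e.g. qnorm(0.975) for 95%, qnorm(0.995) for 99%
  z <- qnorm(1 - (1 - level) / 2)

  # Wald: estimate +/- z * standard error
  se <- sqrt(p_hat * (1 - p_hat) / n)
  wald <- p_hat + c(-1, 1) * z * se

  # Wilson score interval
  denom <- 1 + z^2 / n
  centre <- (p_hat + z^2 / (2 * n)) / denom
  half <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / denom
  wilson <- centre + c(-1, 1) * half

  out <- rbind(wald = wald, wilson = wilson)
  colnames(out) <- c("lower", "upper")
  out
}

res <- prop_ci(520, 1000)                  # 95% intervals
res99 <- prop_ci(520, 1000, level = 0.99)  # wider 99% intervals
print(res)
print(res99)
```

Here p_hat plays the role of the mean and se the role of the standard deviation that the question asks about.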
2
  • What would qnorm be if you use a 99% confidence interval? Sep 12, 2019 at 10:26
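  • For a two-sided 99% interval, use qnorm(0.995), which is 2.575829. (qnorm(0.99) is 2.326348, which corresponds to a one-sided bound.)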
    – Zbynek
    Sep 16, 2019 at 8:49
23

Alternatively, use the propCI function from the prevalence package to get the five most commonly used binomial confidence intervals:

> library(prevalence)
> propCI(x = 520, n = 1000)
    x    n    p        method level     lower     upper
1 520 1000 0.52 agresti.coull  0.95 0.4890176 0.5508293
2 520 1000 0.52         exact  0.95 0.4885149 0.5513671
3 520 1000 0.52      jeffreys  0.95 0.4890147 0.5508698
4 520 1000 0.52          wald  0.95 0.4890351 0.5509649
5 520 1000 0.52        wilson  0.95 0.4890177 0.5508292
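If installing a package is not an option, some of these rows can be approximated in base R. The sketch below uses the textbook definitions of the Agresti-Coull and Jeffreys intervals. It is an assumption that propCI follows exactly the same formulas, and the variable names are illustrative.

```r
x <- 520
n <- 1000
level <- 0.95
alpha <- 1 - level
z <- qnorm(1 - alpha / 2)

# Agresti-Coull: add z^2 pseudo-observations, half of them successes,
# then apply the Wald formula to the adjusted counts
n_tilde <- n + z^2
p_tilde <- (x + z^2 / 2) / n_tilde
agresti_coull <- p_tilde + c(-1, 1) * z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)

# Jeffreys: quantiles of the Beta(x + 1/2, n - x + 1/2) posterior
jeffreys <- qbeta(c(alpha / 2, 1 - alpha / 2), x + 0.5, n - x + 0.5)

print(agresti_coull)
print(jeffreys)
```

Both results should land close to the agresti.coull and jeffreys rows of the table above.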
6

Another package, tolerance, calculates confidence and tolerance intervals for many common distributions.

1
  • wow, that package tolerance is comprehensive & thorough. outstanding recommendation!
    – cmo
    Mar 20, 2019 at 13:57
