PROPCI

A Program that Calculates Confidence Intervals for a Proportion

Version 1.3, 8/June/1990
(c) 1988, 1989, 1990

by

Kevin M. Sullivan
Division of Nutrition
Center for Chronic Disease Prevention and Health Promotion
Centers for Disease Control
1600 Clifton Road NE, MS A08
Atlanta, GA 30333

This program was developed to calculate confidence intervals for a
proportion by use of the following methods: the normal approximation to
the binomial, the normal approximation with a correction factor, the
method by Wilson, the quadratic method, the exact binomial method, and
the mid-p (Miettinen) method. Either 90, 95, or 99 percent confidence
intervals can be calculated. The user inputs the number of individuals
who have the event of interest (x) and the sample size (n). The point
estimate is x/n.

The standard error (SE) of the normal approximation to the
binomial(1,2) is:

$$ SE = \sqrt{pq/n} $$

where
    p = x/n
    q = 1 - p
    n = denominator (the sample size)

The confidence interval for the normal approximation is:

    p +/- Z * SE

where
    Z  = Z value, e.g., 1.96 for a 95% two-sided CI
    SE = the standard error calculated above

The lower and upper confidence limits for a normal approximation with
a correction factor(2) are:

    p - Z * SE - 1/(2n)

and

    p + Z * SE + 1/(2n)
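
As an illustration only (this is not PROPCI's source code), the two
normal-approximation intervals above might be computed as in the
following Python sketch. The names normal_ci and Z_VALUES are
hypothetical, and the Z values for 90% and 99% are assumed rounded
values; the text itself gives only 1.96 for 95%.

```python
import math

# Assumed two-sided Z values; only 1.96 (95%) is stated in the text.
Z_VALUES = {90: 1.645, 95: 1.96, 99: 2.576}

def normal_ci(x, n, level=95, corrected=False):
    """Return (SE, lower, upper) as proportions.

    The limits are deliberately not truncated to the 0-1 range.
    """
    p = x / n
    q = 1 - p
    z = Z_VALUES[level]
    se = math.sqrt(p * q / n)
    # Correction factor 1/(2n), applied only when requested.
    cf = 1 / (2 * n) if corrected else 0.0
    return se, p - z * se - cf, p + z * se + cf

for corrected in (False, True):
    se, lo, hi = normal_ci(10, 11, level=90, corrected=corrected)
    print(f"SE {100 * se:.3f}   CI {100 * lo:.3f}, {100 * hi:.3f}")
```

With x=10, n=11 and the 90% level, the limits should come out close to
the normal-approximation rows of the EXAMPLE screen in this document,
including upper limits above 100 percent.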

The normal approximation to the binomial can produce estimates
outside of the 0-100 percent limits. PROPCI will provide the results of
the normal approximation calculations even if the estimates are outside
the limits, although most authors truncate the estimates to the limits.
One suggested criterion for determining when the normal approximation is
inappropriate is npq<5.(2)

A more accurate approximation for the confidence interval for a
proportion is calculated using the quadratic method.(1) This quadratic
formula includes a correction factor. The formula for the lower bound of
the quadratic method is:

$$ \frac{(2np + Z^2 - 1) - Z\sqrt{Z^2 - (2 + 1/n) + 4p(nq + 1)}}{2(n + Z^2)} $$

and the upper bound is:

$$ \frac{(2np + Z^2 + 1) + Z\sqrt{Z^2 + (2 - 1/n) + 4p(nq - 1)}}{2(n + Z^2)} $$
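
A minimal Python sketch of the quadratic bounds follows, assuming the
numerator terms are (2np + Z^2 -/+ 1) and that the radical covers
everything to the right of the multiplying Z. The function name
quadratic_ci is hypothetical, and the sketch does not guard against a
negative value under the square root, which can occur with very small n.

```python
import math

def quadratic_ci(x, n, z=1.96):
    """Quadratic-method limits (with correction factor) as proportions."""
    p = x / n
    q = 1 - p
    z2 = z * z
    denom = 2 * (n + z2)
    lower = ((2 * n * p + z2 - 1)
             - z * math.sqrt(z2 - (2 + 1 / n) + 4 * p * (n * q + 1))) / denom
    upper = ((2 * n * p + z2 + 1)
             + z * math.sqrt(z2 + (2 - 1 / n) + 4 * p * (n * q - 1))) / denom
    return lower, upper

# z = 1.645 is an assumed rounded Z value for a 90% two-sided interval.
lo, hi = quadratic_ci(10, 11, z=1.645)
print(f"{100 * lo:.3f}, {100 * hi:.3f}")
```

For x=10 and n=11 at the 90% level, this should land close to the
Quadratic Method row of the EXAMPLE screen in this document.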

Another approximate method is by Wilson.(3) It appears that the method
by Wilson is a quadratic equation without the correction factor. The
formula for the lower bound is:

$$ \frac{n}{n + Z^2}\left[\frac{x}{n} + \frac{Z^2}{2n} - Z\sqrt{\frac{x(n-x)}{n^3} + \frac{Z^2}{4n^2}}\right] $$

and the upper bound is:

$$ \frac{n}{n + Z^2}\left[\frac{x}{n} + \frac{Z^2}{2n} + Z\sqrt{\frac{x(n-x)}{n^3} + \frac{Z^2}{4n^2}}\right] $$
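
The following Python sketch shows one way the Wilson limits might be
computed. It assumes the usual Wilson score form: the factor
n/(n + Z^2) multiplies x/n + Z^2/(2n) minus (lower) or plus (upper)
Z times the square root of x(n-x)/n^3 + Z^2/(4n^2). The function name
wilson_ci is hypothetical.

```python
import math

def wilson_ci(x, n, z=1.96):
    """Wilson limits (no correction factor) as proportions."""
    z2 = z * z
    scale = n / (n + z2)
    centre = x / n + z2 / (2 * n)
    half = z * math.sqrt(x * (n - x) / n**3 + z2 / (4 * n * n))
    return scale * (centre - half), scale * (centre + half)

# z = 1.645 is an assumed rounded Z value for a 90% two-sided interval.
lo, hi = wilson_ci(10, 11, z=1.645)
print(f"{100 * lo:.3f}, {100 * hi:.3f}")
```

Unlike the normal approximation, these limits stay within 0-100
percent, and for x=10, n=11 at the 90% level they should be close to
the Wilson Method row of the EXAMPLE screen in this document.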

The exact binomial confidence interval is calculated by using
formulas as described by Rosner(2) and Rothman.(3) For a two-sided 95%
confidence interval (i.e., .025 in each tail), the lower limit p_1 and
the upper limit p_2 are the values that satisfy:

$$ .025 = \sum_{k=x}^{n} \frac{n!}{k!(n-k)!} \, p_1^{k} (1-p_1)^{n-k} $$

$$ .025 = \sum_{k=0}^{x} \frac{n!}{k!(n-k)!} \, p_2^{k} (1-p_2)^{n-k} $$
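
Because neither equation can be solved for p directly, some numerical
search is needed. The Python sketch below uses a plain bisection
search, which is a stand-in chosen for clarity and is not claimed to be
the procedure PROPCI uses. The names binom_pmf, solve_for_p and
exact_ci are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability of exactly k events in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def solve_for_p(tail, target, increasing, tol=1e-10):
    """Bisection on (0, 1) for the p at which tail(p) equals target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (tail(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, alpha=0.05):
    """Exact binomial limits with alpha/2 in each tail."""
    # P(X >= x) rises with p; the lower limit is where it equals alpha/2.
    upper_tail = lambda p: sum(binom_pmf(k, n, p) for k in range(x, n + 1))
    # P(X <= x) falls with p; the upper limit is where it equals alpha/2.
    lower_tail = lambda p: sum(binom_pmf(k, n, p) for k in range(0, x + 1))
    return (solve_for_p(upper_tail, alpha / 2, increasing=True),
            solve_for_p(lower_tail, alpha / 2, increasing=False))

lo, hi = exact_ci(10, 11, alpha=0.10)
print(f"{100 * lo:.3f}, {100 * hi:.3f}")
```

When x=0 the search for the lower limit collapses to 0, and when x=n
the search for the upper limit collapses to 1, so no special cases are
needed in this sketch.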

Exact mid-p (Miettinen) confidence intervals are calculated by using
formulas as described by Rothman.(3) For a two-sided 95% confidence
interval (i.e., .025 in each tail), the lower limit p_1 and the upper
limit p_2 are the values that satisfy:

$$ .025 = \frac{1}{2} \cdot \frac{n!}{x!(n-x)!} \, p_1^{x} (1-p_1)^{n-x} + \sum_{k=x+1}^{n} \frac{n!}{k!(n-k)!} \, p_1^{k} (1-p_1)^{n-k} $$

$$ .025 = \frac{1}{2} \cdot \frac{n!}{x!(n-x)!} \, p_2^{x} (1-p_2)^{n-x} + \sum_{k=0}^{x-1} \frac{n!}{k!(n-k)!} \, p_2^{k} (1-p_2)^{n-k} $$
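
The mid-p equations differ from the exact binomial ones only in giving
half weight to the probability of the observed count x. A Python sketch
under that reading is given below. As before, the bisection search is a
stand-in technique rather than PROPCI's own procedure, and the names
binom_pmf, solve_for_p and midp_ci are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability of exactly k events in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def solve_for_p(tail, target, increasing, tol=1e-10):
    """Bisection on (0, 1) for the p at which tail(p) equals target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (tail(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def midp_ci(x, n, alpha=0.05):
    """Mid-p limits with alpha/2 in each tail."""
    # Half of P(X = x) plus P(X > x); rises with p.
    upper_tail = lambda p: (0.5 * binom_pmf(x, n, p)
                            + sum(binom_pmf(k, n, p) for k in range(x + 1, n + 1)))
    # Half of P(X = x) plus P(X < x); falls with p.
    lower_tail = lambda p: (0.5 * binom_pmf(x, n, p)
                            + sum(binom_pmf(k, n, p) for k in range(0, x)))
    return (solve_for_p(upper_tail, alpha / 2, increasing=True),
            solve_for_p(lower_tail, alpha / 2, increasing=False))

lo, hi = midp_ci(10, 11, alpha=0.10)
print(f"{100 * lo:.3f}, {100 * hi:.3f}")
```

Because only half of the observed-count probability is placed in each
tail, the mid-p interval is never wider than the exact binomial
interval for the same data.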

For each proportion, the normal approximation (with and without
correction factor), Wilson, and quadratic confidence intervals are
automatically provided. If npq<5, a message is provided near the bottom
of the screen warning users that the normal approximation may not be
appropriate. Next, the user is prompted as to whether they would like to
have exact confidence intervals calculated (the default is "no"). Both
the exact binomial and mid-p formulas require iterative solutions to
determine the values of the lower and upper confidence limits and
therefore are not automatically performed. A fast method to determine
the exact confidence intervals using the F-distribution is used when the
denominator is less than 300.(3) Finally, the user is asked whether they
would like to perform another calculation or return to DOS.

Which confidence interval method should you use? My opinion is that
among the approximate methods (i.e., normal approximation, Wilson, and
quadratic), the quadratic provides the best estimate of the exact
binomial confidence interval. If the data are sparse, then use one of
the exact methods (exact binomial or mid-p).

EXAMPLE

In this example from Rothman,(3) x=10 and n=11 with 90% confidence
intervals.

+--------------------------------------------------------------------+
| 01/09/90                 ** PROPCI 1.2 **                          |
+--------------------------------------------------------------------+

Numerator: 10 / Denominator: 11

Enter two-sided confidence level (90, 95, or 99%): 90

The point estimate is: 90.909%

Confidence Interval Method              Std Error       90% CI

Normal Approx. to the Binomial            8.668     76.650, 105.168
Normal Approx. with Correction Factor     8.668     72.105, 109.713
Wilson Method                                       67.719,  97.945
Quadratic Method                                    62.330,  99.372
Exact Binomial                                      63.564,  99.535
Miettinen Limits (Mid-p)                            67.759,  99.090

**The normal approximation may not be valid for this example**

Would you like to do another? (Y/N) Y

DIFFERENCES BETWEEN VERSION 1.3 AND PREVIOUS VERSIONS

Version 1.2 implements the F-distribution method to arrive at the
exact confidence limits. This dramatically reduces the computation time
involved compared to other iterative procedures and produces exactly
the same results.

Because of some problems with the F-distribution method, in version
1.3 the F-distribution method is used only when the denominator is less
than 300; the longer iterative method is used with larger numbers.

DISTRIBUTION CONDITIONS

NON-WARRANTY. PROPCI is provided "as is" and without any warranty
expressed or implied. The user assumes all risks of the use of PROPCI.
PROPCI may not run on your particular hardware/software configuration.
We bear no responsibility for any mishap or economic loss resulting
from the use of this software.

COPYRIGHT CONDITIONS. You may make and distribute copies of PROPCI
provided that there is no material gain involved.

USE AT YOUR OWN RISK. All risk of loss of any kind due to the use of
PROPCI is with you, the user. You are responsible for all mishaps, even
if the program proves to be defective. This program makes certain
assumptions about the data. These assumptions affect the validity of
conclusions made based on the output from this program.

Please acknowledge PROPCI in any manuscript that uses its
calculations.

REFERENCES

1. Fleiss JL. Statistical Methods for Rates and Proportions, 2nd Ed.
   John Wiley & Sons, New York, 1981.

2. Rosner B. Fundamentals of Biostatistics. Duxbury Press, Boston,
   1982.

3. Rothman KJ, Boice JD Jr. Epidemiologic Analysis with a
   Programmable Calculator. NIH Pub No. 79-1649. Bethesda, MD:
   National Institutes of Health, 1979;31-32.