Sample Size Calculator

Determines the minimum number of subjects for adequate study power.

Dichotomous Endpoint, Two Independent Sample Study

About This Calculator

This calculator uses a number of different equations to determine the minimum number of subjects that need to be enrolled in a study in order to have sufficient statistical power to detect a treatment effect [1].

Before a study is conducted, investigators need to determine how many subjects should be included. By enrolling too few subjects, a study may not have enough statistical power to detect a difference (type II error). Enrolling too many patients can be unnecessarily costly or time-consuming.

Generally speaking, statistical power is determined by the following variables:

  • Baseline Incidence: If an outcome occurs infrequently, many more patients are needed in order to detect a difference.
  • Population Variance: The higher the variance (standard deviation), the more patients are needed to demonstrate a difference.
  • Treatment Effect Size: If the difference between two treatments is small, more patients will be required to detect a difference.
  • Alpha: The probability of a type-I error -- finding a difference when a difference does not exist. Most medical literature uses an alpha cut-off of 5% (0.05) -- indicating a 5% chance of declaring a significant difference when no true difference exists.
  • Beta: The probability of a type-II error -- not detecting a difference when one actually exists. Beta is directly related to study power (Power = 1 - β). Most medical literature uses a beta cut-off of 20% (0.2) -- indicating a 20% chance that a true difference is missed.
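The way these variables interact can be illustrated with a short sketch. It assumes the widely used normal-approximation formula for comparing two independent proportions with a two-sided test; the calculator's own equations are not shown on this page, so this is an illustration and not necessarily the exact method used. The function name `n_per_group` and the parameter `ratio` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, beta=0.20, ratio=1.0):
    """Approximate number of subjects needed in group 1.

    p1, p2 : anticipated incidence in group 1 and group 2
    alpha  : type-I error rate (two-sided test assumed)
    beta   : type-II error rate (power = 1 - beta)
    ratio  : n2 / n1, assumed enrollment ratio (1.0 = equal groups)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    # Pooled incidence under the null hypothesis of no difference
    p_bar = (p1 + ratio * p2) / (1 + ratio)
    null_sd = sqrt(p_bar * (1 - p_bar) * (1 + 1 / ratio))
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2) / ratio)
    return ceil((z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p1 - p2) ** 2)


# A smaller treatment effect or a smaller beta both increase the requirement.
print(n_per_group(0.60, 0.40))
print(n_per_group(0.60, 0.50))
print(n_per_group(0.60, 0.40, beta=0.10))
```

Under these assumptions, narrowing the difference from 20 to 10 percentage points, or lowering beta from 0.2 to 0.1, raises the required sample size, in line with the list above.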

Post-Hoc Power Analysis

To calculate the post-hoc statistical power of an existing trial, please visit the post-hoc power analysis calculator.

References and Additional Reading

  • Rosner B. Fundamentals of Biostatistics. 7th ed. Boston, MA: Brooks/Cole; 2011.


  • Research article
  • Open access
  • Published: 19 November 2001

Sample size requirements for case-control study designs

  • Michael D Edwardes 1  

BMC Medical Research Methodology, volume 1, Article number: 11 (2001)

Published formulas for case-control designs provide sample sizes required to determine that a given disease-exposure odds ratio is significantly different from one, adjusting for a potential confounder and possible interaction.

The formulas are extended from one control per case to F controls per case and adjusted for a potential multi-category confounder in unmatched or matched designs. Interactive FORTRAN programs are described which compute the formulas. The effect of potential disease-exposure-confounder interaction may be explored.

Conclusions

Software is now available for computing adjusted sample sizes for case-control designs.


Breslow and Day [ 1 ] and Smith and Day [ 2 ] provide asymptotic formulas for the computation of case-control sample sizes required for odds ratios, unadjusted or adjusted for a confounder [ 1 ] and for stratified matched designs [ 2 ]. The notation we use is their notation. Their formulas are extended here to include more than one control per case. The formulas for stratified matched designs were deduced by applying the approach of Breslow and Day [ 1 ] (pages 305–6) to Table 7 of [ 2 ]. Modification of the formulas for specified interactions [ 1 , 3 ] is also shown. These formulas are based on the logarithm of the odds ratio, for which the normal approximation is more accurate than for the exposure difference, so they are more accurate than the exposure difference formula that is given in the majority of general methods references [ 4 , 5 ].

Two conversational FORTRAN programs, DAYSMITH and DESIGN, compute the formulas. They were submitted to STATLIB for non-commercial distribution a few years ago, and are obtained with an e-mail message such as "send design.exe from general" to http://[email protected] . The programs produce a table of numbers of cases and controls required for a variety of specifications of Type I and Type II error, adjusted for the confounder, unadjusted, and adjusted for stratified matching, with the strata being the levels of the confounder. The two programs have different input requirements. Program DAYSMITH asks for exactly the items required for the Smith and Day formulas. Program DESIGN accepts alternative input that is converted in the program to the items required for the same formulas. The formulas used are shown in Appendix 1.

The input to program DAYSMITH

The sample sizes computed are for the detection of a given disease-exposure odds ratio, that is, the sample sizes at which a certain statistical test will reject the null hypothesis that the odds ratio is one. The input items are as follows:

R E = the odds ratio to be detected (typically a minimum value),

S = 1 or 2 for one-sided or two-sided type I error,

F = the number of controls per case,

P = the control population exposure probability, and

I = an indicator to request interaction adjustment.

Roughly speaking, interaction in statistics corresponds to effect modification in epidemiology. By not selecting an interaction adjustment, we effectively assume that the disease-exposure odds ratio does not differ across confounder levels. Interaction is discussed further below.

The number of confounder levels, denoted K is asked for next. If K = 1, unadjusted sample sizes only are computed, and no other input is required. Program DESIGN is identical to this point. For most applications, no confounder adjustment is required and so the program returns unadjusted sample sizes and is finished after a 1 is entered for K. The unadjusted formula [ 1 ] is more accurate than the usual unadjusted formulas [ 4 , 5 ], and may therefore produce different sample sizes than those.
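To make the roles of R E, S, F and P concrete, here is a rough sketch of an unadjusted calculation based on the logarithm of the odds ratio. It is not the Appendix 1 formula, which is not reproduced in this text; it assumes the simple Wald-type approximation in which the variance of the log odds ratio is 1/(n p1 q1) + 1/(F n P (1 - P)) for n cases and F n controls. The function names are hypothetical, and the results should not be expected to match the output of DAYSMITH or DESIGN.

```python
from math import ceil, log
from statistics import NormalDist


def exposure_among_cases(P, R_E):
    """Exposure probability among cases implied by the control
    exposure probability P and the odds ratio R_E."""
    return P * R_E / (1 + P * (R_E - 1))


def unadjusted_cases(R_E, S, F, P, alpha=0.05, power=0.80):
    """Approximate number of cases (Wald-type sketch, an assumption).

    R_E : odds ratio to be detected
    S   : 1 or 2 for a one-sided or two-sided type I error
    F   : number of controls per case
    P   : control population exposure probability
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / S)
    z_beta = NormalDist().inv_cdf(power)
    p1 = exposure_among_cases(P, R_E)
    # Variance of the log odds ratio contributed per case
    var_unit = 1 / (p1 * (1 - p1)) + 1 / (F * P * (1 - P))
    return ceil((z_alpha + z_beta) ** 2 * var_unit / log(R_E) ** 2)


cases = unadjusted_cases(R_E=2, S=2, F=3, P=0.03)
print(cases, "cases and", 3 * cases, "controls")
```

Even in this simplified form, the sketch shows the qualitative behaviour of the inputs: more controls per case, a one-sided test, or a larger odds ratio each reduce the number of cases required.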

If K > 1, one of the levels of the confounder is taken to be a reference level, and is referred to as level one. The order of the levels is otherwise immaterial. The input required next is three numbers for each of the K − 1 remaining levels, $p_{1i}$, $p_{2i}$, and $R_{Ci}$, $i = 2, \ldots, K$, which are

$p_{1i} = \Pr(C_i \mid E)$ = among the exposed population, the proportion at level i of the confounder,

$p_{2i} = \Pr(C_i \mid \bar{E})$ = among the unexposed population, the proportion at level i of the confounder, and

$R_{Ci}$ = the disease-confounder odds ratio (with confounder level i versus level 1).

For the reference level, we set $R_{C1} = 1$ for the formulas that follow. We compute the reference-level proportions as the complements of the others,

$p_{11} = 1 - \sum_{i=2}^{K} p_{1i}$ and $p_{21} = 1 - \sum_{i=2}^{K} p_{2i}$.

Input for program DESIGN

Whereas DAYSMITH asks for the same input as requested in the original references [ 1 – 3 ], we found that alternative input made more sense for our initial applications [ 6 , 7 ], so a second program was written. The input for DESIGN is the same as for DAYSMITH up to the point after which the number of levels of the confounder, K , is asked for.

Again, one of the levels of the confounder is taken to be a reference level, and is referred to as level one. The input that is required next is one number for the reference level, $r_1$, and then three (four when interaction is included) numbers for each of the K − 1 remaining levels, $r_i$, $p_i$, and $R_{Ci}$, $i = 2, \ldots, K$, which are

$r_i = \Pr(E \mid C_i)$ = the probability of exposure at level i of the confounder,

$p_i = \Pr(C_i)$ = the probability of being in level i of the confounder, and

$R_{Ci}$ = the odds ratio of disease and confounder level i (versus level 1).

For the reference level, we again set $R_{C1} = 1$.

From Bayes' Theorem, we compute

$p_{1i} = r_i p_i / P$ and

$p_{2i} = (1 - r_i) p_i / (1 - P)$.
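The conversion from DESIGN-style input to DAYSMITH-style input can be sketched in a few lines. The function name `design_to_daysmith` and the demonstration numbers are hypothetical, and applying the same Bayes' Theorem expressions to the reference level is an assumption made here for illustration.

```python
def design_to_daysmith(P, r, p):
    """Convert DESIGN-style input to DAYSMITH-style proportions.

    P : overall exposure probability in the control population
    r : list of Pr(E | C_i), one entry per confounder level
    p : list of Pr(C_i), one entry per confounder level
    Index 0 is taken to be the reference level (an assumption).
    """
    # Pr(C_i | E) = Pr(E | C_i) Pr(C_i) / Pr(E)
    p1 = [ri * pi / P for ri, pi in zip(r, p)]
    # Pr(C_i | not E) = (1 - Pr(E | C_i)) Pr(C_i) / (1 - Pr(E))
    p2 = [(1 - ri) * pi / (1 - P) for ri, pi in zip(r, p)]
    return p1, p2


# Hypothetical two-level confounder; here P equals the sum of r_i * p_i,
# so each set of converted proportions sums to one.
p1, p2 = design_to_daysmith(P=0.2, r=[0.1, 0.3], p=[0.5, 0.5])
print(p1, sum(p1))
print(p2, sum(p2))
```

When the supplied P does not equal the sum of r_i p_i over all levels, the converted proportions no longer sum to one, which is why a redundant input item is useful as a consistency check.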

We have one more input item than is actually required, and that is used for a check, where we can use the fact that

What we actually do is check the sum

The sum Δ is supposed to be equal to one. If it is not one, then we re-define and report

unless they are negative. An alternative used in earlier versions was to compute

and replace

for j = 1,2 and i = 1,..., K. This is equivalent to replacing

i = 1,..., K , which is how the program used to report the change.

An example, adjusting for a confounder

The following example is one of several computations performed for a published research protocol for a study of the association of oral contraceptive (OC) use with cardiovascular risks, controlling for age group [ 6 ]. A related protocol [ 7 ] has smoking as a confounder.

The numbers entered for P , r i , p i , and R Ci , i = 2,..., K, are all taken from the Saskatchewan government medical database, which includes the entire population from which a case-control sample is to be taken. In many applications, such numbers are not available from a reliable source. In that case, one may try sets of alternative minimum and maximum numbers for a range of results. The maximum sample sizes obtained from such sensitivity analyses would be the conservative recommendation.

Both programs first request the items R E through I. For R E , the outcome of interest is hospitalisation due to certain cardiovascular risks. The exposure is a specific OC with 10% of the market share [ 7 ]. Since overall OC prevalence is 30%, P = .03 for that specific OC. Using > to denote the cursor for computer entry, we type:

>2 2 3 .03 0

for R E , S, F, P and I, respectively, then press enter. We then receive the message:

Type the number of confounder levels, and <enter>. Type 1 if no confounder.

We enter 5 levels and press enter.

Now type in the population exposure probability for the reference level of the confounding variable.

This will be put at level 1, so it is Pr(E|C1)

The confounder levels are five age groups, and level 1 corresponds to the youngest age group 15–21, for which we enter the prevalence for a specific OC with 10% of the market share. We type .055 and press enter.

The reply is:

Now type in, for each of the other 4 level(s) of the confounding variable, Pr(E|Ci), Pr(Ci), and Rc(i), separated by at least one blank or <enter>, where Pr(E|Ci) = in the population at level i of the confounder, the proportion exposed, Pr(Ci) = the probability of being at level i, and Rc(i) = odds ratio of disease and confounder level i (versus level 1).

The following numbers are entered for age groups 22–26, 27–31, 32–39 and 40+:

> .038 .24 2

> .021 .2 8

> .008 .18 8

> .004 .15 28.5

Note that Rc(5) = R C 5 = 28.5, a very high value. That is to be expected if all older women are included. (For the final protocol [ 6 ], a cut-off was made at age 45.) When enter is pressed, we receive some confirmation of the input, and a message that the result is written to file design.out. That is, as currently written, the sample sizes and other output are not automatically shown on the screen, but are saved in "design.out" to be viewed directly there. Appendix 2 (Second attached file, app2.txt, a text file) shows the output from the preceding session, which includes a correction of the input values.
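The consistency of the values entered in this session can be checked by hand or with a short script. The sketch below assumes that Pr(C1) for the reference level is the complement of the four probabilities entered, and that the overall exposure probability implied by the stratum-level input is the sum of Pr(E|Ci) Pr(Ci). The variable names are illustrative, and the program's own correction rule is not reproduced here.

```python
# Values typed in the session above
P = 0.03                                   # overall exposure probability
r = [0.055, 0.038, 0.021, 0.008, 0.004]    # Pr(E | C_i), level 1 first
p_rest = [0.24, 0.20, 0.18, 0.15]          # Pr(C_i) for levels 2 to 5

# Assumption: the reference level takes the remaining probability mass
p = [1 - sum(p_rest)] + p_rest

# Exposure probability implied by the stratum-level input
implied_P = sum(ri * pi for ri, pi in zip(r, p))

print("Pr(C1) =", round(p[0], 2))
print("implied P =", round(implied_P, 5), "versus entered P =", P)
```

Under these assumptions the implied exposure probability is about 0.028, slightly below the entered value of .03, so the input is not exactly self-consistent. That kind of mismatch is what a correction of the input values would address.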

Looking at Appendix 2, we see unadjusted sample sizes, those adjusted for age in an unmatched study, and a third set of sample sizes for a matched case-control study. For our example [ 6 ], both unmatched and matched designs are considered. With the low value of P and the high value R C 5 , we see that a large difference in sample sizes required for either design may result. In most applications, however, the differences are not so dramatic.

Adjusting for a matching confounder

Epidemiological literature usually gives formulas for matching which are based on the strong assumption that all sources of extraneous variation among a case and its controls are accounted for [ 1 , 8 , 9 ]. A third program, DESIGNM, was written to compute such a formula (from [ 1 ], p. 294), but DESIGNM does not adjust for a confounding variable, and that strong assumption of implicit matching is rarely justified in case-control studies, so this program was not made freely available. Software packages that compute sample sizes for conditional logistic regression, such as EGRET SIZ [ 10 ], are alternatives to DESIGNM, which is based on Miettinen's test of the Mantel-Haenszel odds ratio for matched case-control designs. The adjustment in DAYSMITH and DESIGN is for stratified matching [ 2 , 11 , 12 ], where matching is by confounders. This presumes that the eventual analysis will be unconditional [ 2 ] and will account for the stratification. Consequently, it is not required that F controls be linked with each case, only that the total number of controls be F times the total number of cases.

Interaction

The literature [ 1 , 3 , 13 , 15 ] discusses stratified analysis interaction adjustment only for confounders with K = 2. It is easy, however, to modify the formulas for multi-level interaction. Every occurrence of R E in the formulas (Appendix 1) is replaced by R E R Ij , where R Ij is the interaction factor corresponding to the j th level, j = 2,..., K. (For ∑', put R Ij inside the first sum.) We set R I 1 = 1.

For two confounder levels, R I 2 , which is R I in Smith and Day's notation [ 3 ], is the multiplicative factor by which the disease-exposure odds ratio in level 2 of the confounder differs from the odds ratio in level 1 when there is confounder-exposure-disease interaction. For R Ij , the contrast is between level j and the reference level (level one).
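The substitution of R E by R E R Ij can be pictured as building one disease-exposure odds ratio per confounder level. The following is a minimal sketch of that substitution only, not of the full sample size formulas; the function name and the example interaction factors are hypothetical.

```python
def level_odds_ratios(R_E, R_I_rest):
    """Disease-exposure odds ratio at each confounder level.

    R_E      : odds ratio at the reference level (level one)
    R_I_rest : interaction factors R_Ij for levels j = 2, ..., K
    """
    # The reference level has interaction factor R_I1 = 1
    R_I = [1.0] + list(R_I_rest)
    return [R_E * R_Ij for R_Ij in R_I]


# Hypothetical three-level confounder: the odds ratio is 2 at the
# reference level, 1.5 times larger at level 2, and halved at level 3.
print(level_odds_ratios(2, [1.5, 0.5]))
```

Setting every interaction factor to 1 recovers the no-interaction case, in which the same odds ratio applies at every level.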

This adjustment was made available for sensitivity analysis; specifically, to explore how much the sample size result could change if the confounder were in fact an effect modifier. Nevertheless, the adjusted formulas have been used to determine sample size in the presence of gene-environment interaction [ 13 ].

The competitors to these programs are regression-based sample size programs, such as those in EGRET SIZ [ 10 ], which compute sample sizes required for unconditional logistic regression. The package nQuery [ 14 ] has an unconditional logistic regression option, but is not set up for case-control designs. These may be useful for continuous exposures, and make sense when the final analysis is intended to be such a regression, rather than a stratified analysis, such as a Mantel-Haenszel test, which our programs correspond to. We are unaware of any generally available competitor for stratified analysis.

In a series of papers on sample-size estimation to detect gene-environment interaction, which is a controversial role for sample-size formulas, comparisons have been made between regression based approaches and the stratified analysis approach [ 13 , 15 ]. One solution is even to consider a case-only design [ 16 ]. EGRET SIZ provides no guidance for interaction adjustment, but it probably could be used for that purpose.

When there is more than one confounder, we define one super-confounder, where each category corresponds to a sub-category. For example, if age, with 5 categories, and smoking, with 2 categories, are both confounders, then we define one super-confounder with 10 = 5 × 2 categories. The estimates of r i , p i , and R Ci , i = 2, ...,10, then all have to take age and smoking into account jointly. As the number of confounders and the size of K increase, regression-based sample size programs become more advantageous, since information is not required for every sub-category.
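The construction of a super-confounder amounts to taking the cross product of the category lists. A small sketch, with placeholder category labels that are not taken from the protocols:

```python
from itertools import product

# Placeholder labels for illustration only
age = ["age1", "age2", "age3", "age4", "age5"]
smoking = ["non-smoker", "smoker"]

# Each joint (age, smoking) combination becomes one level of the
# super-confounder; the first combination serves as the reference level.
super_confounder = list(product(age, smoking))
K = len(super_confounder)

print(K)
for level, combo in enumerate(super_confounder, start=1):
    print(level, combo)
```

Because K grows multiplicatively with each added confounder, so does the number of r i , p i and R Ci values that must be supplied, which illustrates why regression-based programs become attractive as confounders accumulate.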

The current programs yield results for 80% and 90% power, but versions are available for alternative powers, from 60% to 95%. A new version may print to the screen, if users want that option, and ask whether sample sizes for a specific power and Type I error are required.

The programs described are for two levels of disease (case vs. control) and of exposure. For several levels of exposure or disease, measures are available which correspond to odds ratios, risk ratios and risk differences [ 17 ], and it is not difficult to compute sample size formulas for these. If there is some demand, software to do those calculations may be created.

The Breslow-Day-Smith formulas which we extend utilize the classical method, based on testing. A more modern approach is that based on a confidence interval for the odds ratio [ 18 ], which may eventually become a program option. A Bayesian approach seems most suited for the sample size problem, although some issues need to be resolved [ 19 ]. Although not yet written, a Bayesian solution will soon be formulated for case-control designs.

References

1. Breslow NE, Day NE: Statistical Methods in Cancer Research, Vol. 2: The Design and Analysis of Cohort Studies. IARC Scientific Publications No. 82, International Agency for Research on Cancer, Lyon, France. 1987, Sections 7.8-7.9: 305-306.

2. Smith PG, Day NE: Matching and confounding in the design and analysis of epidemiological case-control studies. Perspectives in Medical Statistics, J.F. Bithell, R. Coppi, eds. London: Academic Press. 1987, 39-64.

3. Smith PG, Day NE: The design of case-control studies: the influence of confounding and interaction effects. International Journal of Epidemiology. 1984, 13(3): 356-365.

4. Fleiss JL: Statistical Methods for Rates and Proportions, 2nd Edition. Wiley: New York. 1981.

5. Schlesselman JJ: Case-Control Studies: design, conduct, analysis. Oxford University Press: New York. 1982.

6. Suissa S, Hemmelgarn B, Spitzer WO, Brophy J, Collet JP, Côté R, Downey W, Edouard L, LeClerc J, Paltiel O: The Saskatchewan oral contraceptive cohort study of oral contraceptive use and cardiovascular risks. Pharmacoepidemiology and Drug Safety. 1993, 2: 33-49.

7. Spitzer WO, Thorogood M, Heinemann L: Tri-national case-control study of oral contraceptives and health. Pharmacoepidemiology and Drug Safety. 1993, 2: 21-31.

8. Parker RA, Bregman DJ: Sample size for individually matched case-control studies. Biometrics. 1986, 42: 919-926.

9. Ejigou A: Power and sample size for matched case-control studies. Biometrics. 1996, 52: 925-933.

10. EGRET. Cytel Software Corporation: Cambridge, MA. 1997, (SIZ is a separate module). [ http://www.cytel.com ]

11. Woolson RE, Bean JA, Rojas PB: Sample size for case-control studies using Cochran's statistic. Biometrics. 1986, 42: 927-932.

12. Nam J: Sample size determination for case-control studies and the comparison of stratified and unstratified analyses. Biometrics. 1992, 48: 389-395.

13. Hwang SJ, Beaty TH, Liang KY, Coresh J, Khoury MJ: Minimum sample size estimation to detect gene-environment interaction in case-control designs. American Journal of Epidemiology. 1994, 140: 1029-1037.

14. Elashoff JD: nQuery Advisor release 2.0. Statistical Solutions Ltd.: Cork, Ireland. 1997, [ http://www.statsol.ie ]

15. Garcia-Closas M, Lubin JH: Power and sample size calculations in case-control studies of gene-environment interactions: comments on different approaches. American Journal of Epidemiology. 1999, 149: 689-692.

16. Yang Q, Khoury MJ, Flanders WD: Sample size requirements in case-only designs to detect gene-environment interaction. American Journal of Epidemiology. 1997, 146: 713-720.

17. Edwardes MD, Baltzan M: The generalization of the odds ratio, relative risk and risk difference to r × k tables. Statistics in Medicine. 2000, 19: 1901-1914. 10.1002/1097-0258(20000730)19:14<1901::AID-SIM514>3.0.CO;2-V.

18. O'Neill RT: Sample sizes for estimation of the odds ratio in unmatched case-control studies. American Journal of Epidemiology. 1984, 120: 145-153.

19. Joseph L, Du Berger R, Bélisle P: Bayesian and mixed Bayesian/likelihood criteria for sample size determination. Statistics in Medicine. 1997, 16: 769-781. 10.1002/(SICI)1097-0258(19970415)16:7<769::AID-SIM495>3.0.CO;2-V.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/1/11/prepub


Acknowledgement

The author is supported by an Équipe grant from the FRSQ (Fonds de la recherche en santé du Québec). I appreciate the input of Eric Johnson, Sholom Wacholder and Jesse Berlin.

Author information

Authors and affiliations.

Division of Clinical Epidemiology, Royal Victoria Hospital, Montreal, Quebec, Canada

Michael D Edwardes


Corresponding author

Correspondence to Michael D Edwardes.

Additional information

Competing interests.

none declared

Electronic supplementary material

12874_2001_11_moesm1_esm.pdf.

Appendix files: Appendix 1 - Shows the formulas utilized by DESIGN and DAYSMITH. Appendix 2 - Shows output from the DESIGN session described in the main text. (PDF 65 KB)

12874_2001_11_MOESM2_ESM.txt

Appendix files: Appendix 1 - Shows the formulas utilized by DESIGN and DAYSMITH. Appendix 2 - Shows output from the DESIGN session described in the main text. (TXT 3 KB)


About this article

Cite this article.

Edwardes, M.D. Sample size requirements for case-control study designs. BMC Med Res Methodol 1, 11 (2001). https://doi.org/10.1186/1471-2288-1-11


Received: 19 July 2001

Accepted: 19 November 2001

Published: 19 November 2001

DOI: https://doi.org/10.1186/1471-2288-1-11


  • Reference Level
  • Unconditional Logistic Regression
  • Matched Design
  • Alternative Input

BMC Medical Research Methodology

ISSN: 1471-2288

  • Atmospheric Sciences
  • Environmental Geography
  • Geology and the Lithosphere
  • Maps and Map-making
  • Meteorology and Climatology
  • Oceanography and Hydrology
  • Palaeontology
  • Physical Geography and Topography
  • Regional Geography
  • Soil Science
  • Urban Geography
  • Browse content in Engineering and Technology
  • Agriculture and Farming
  • Biological Engineering
  • Civil Engineering, Surveying, and Building
  • Electronics and Communications Engineering
  • Energy Technology
  • Engineering (General)
  • Environmental Science, Engineering, and Technology
  • History of Engineering and Technology
  • Mechanical Engineering and Materials
  • Technology of Industrial Chemistry
  • Transport Technology and Trades
  • Browse content in Environmental Science
  • Applied Ecology (Environmental Science)
  • Conservation of the Environment (Environmental Science)
  • Environmental Sustainability
  • Environmentalist Thought and Ideology (Environmental Science)
  • Management of Land and Natural Resources (Environmental Science)
  • Natural Disasters (Environmental Science)
  • Nuclear Issues (Environmental Science)
  • Pollution and Threats to the Environment (Environmental Science)
  • Social Impact of Environmental Issues (Environmental Science)
  • History of Science and Technology
  • Browse content in Materials Science
  • Ceramics and Glasses
  • Composite Materials
  • Metals, Alloying, and Corrosion
  • Nanotechnology
  • Browse content in Mathematics
  • Applied Mathematics
  • Biomathematics and Statistics
  • History of Mathematics
  • Mathematical Education
  • Mathematical Finance
  • Mathematical Analysis
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Neuroscience
  • Cognitive Psychology
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Strategy
  • Business History
  • Business Ethics
  • Business and Government
  • Business and Technology
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic Systems
  • Economic Methodology
  • Economic History
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Politics and Law
  • Public Administration
  • Public Policy
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

Critical Thinking in Clinical Research: Applied Theory and Practice Using Case Studies (1)

11 Sample Size Calculation

  • Published: March 2018

In this chapter the basic principles of sample size calculation are discussed. The chapter also reviews the impact of sample size calculation on the study results, the parameters needed, and ways this calculation can be performed by researchers. Over- and underestimation of sample size can have significant consequences for the study participants, so ensuring its adequacy is of critical importance. Setting values for alpha (the level of significance) and beta (the type II error rate, which determines power as 1 - beta) should be informed by the specific research goals and study hypothesis. A priori effect size estimation is challenging and can be done in various ways, which are addressed in this chapter. The chapter concludes with examples and references to sources that can be used for sample size calculation.

Sample size calculations for case-control studies

This R package can be used to calculate the required sample size for unconditional multivariate analyses of unmatched case-control studies. The sample sizes are for a scalar exposure effect, such as a binary, ordinal, or continuous exposure. The sample sizes can also be computed for scalar interaction effects. The analyses account for the effects of potential confounder variables that are also included in the multivariate logistic model.

  • License Agreement
  • samplesizelogisticcasecontrol (Link to CRAN)
  • Gail MH, Haneuse S. Power and sample size for multivariate logistic modeling of unmatched case-control studies. Stat Methods Med Res 2017 Jan 1:962280217737157. doi: 10.1177/0962280217737157.

StatCalc: Statistical Calculators

Unmatched Case-Control

The Unmatched Case-Control study calculates the sample size recommended for a study given a set of parameters and the desired confidence level.

The following example demonstrates how to calculate a sample size for an unmatched case-control study. The application will show three different sample size estimates according to three different statistical calculations.

  • From the Epi Info™ main page, select StatCalc.
  • Select Unmatched Case-Control. The Unmatched Case-Control window opens.
  • Select the Two-sided confidence level of 95% from the drop-down list.
  • Enter the desired Power (80%) to detect a group difference at that confidence level.
  • Enter the ratio of controls to cases as 3. This is a single value and the proportion cannot be entered in the format, # of Unexposed : # of Exposed.
  • Enter the percentage outcome in the unexposed group 75%. This percentage represents the number of ill patients in the unexposed group.
  • Enter the percentage outcome in the exposed group 25%. This percentage represents the number of ill patients in the exposed group.
  • The Odds ratio automatically populates based on the values entered.
  • The output table shows three different estimates of sample size needed.

Figure 10.12 Unmatched Case-Controls Study
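
As a rough cross-check of the kind of estimate StatCalc reports, the sketch below applies a Kelsey-style formula for unmatched case-control studies. It is an illustration only, not Epi Info's implementation: the function name and parameters are hypothetical, the mapping of the 25% and 75% figures onto cases and controls is an assumption, and the result is not guaranteed to match any of the three StatCalc estimates.

```python
from math import ceil
from statistics import NormalDist


def unmatched_case_control_size(p_cases, p_controls, ratio=1.0,
                                confidence=0.95, power=0.80):
    """Kelsey-style sample size (hypothetical helper, not the Epi Info API).

    p_cases / p_controls: assumed proportion exposed among cases / controls.
    ratio: number of controls per case.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    # Exposure proportion pooled over cases and controls, weighted by ratio.
    p_bar = (p_cases + ratio * p_controls) / (ratio + 1)
    n_cases = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (ratio + 1)
               / (ratio * (p_cases - p_controls) ** 2))
    cases = ceil(n_cases)
    controls = ceil(ratio * cases)
    odds_ratio = (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))
    return cases, controls, odds_ratio


# Inputs from the example: 95% two-sided confidence, 80% power, 3 controls per case.
cases, controls, odds_ratio = unmatched_case_control_size(0.25, 0.75, ratio=3)
print(cases, controls, round(odds_ratio, 2))
```

Rounding the number of cases up first and then multiplying by the ratio keeps the requested controls-to-cases ratio exact, at the cost of a slightly larger total.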

Sample Size Calculator

Find out the sample size.

This calculator computes the minimum number of necessary samples to meet the desired statistical constraints.

Find Out the Margin of Error

This calculator gives out the margin of error or confidence interval of observation or survey.

In statistics, information is often inferred about a population by studying a finite number of individuals from that population, i.e. the population is sampled, and it is assumed that characteristics of the sample are representative of the overall population. For the following, it is assumed that there is a population of individuals where some proportion, p , of the population is distinguishable from the other 1-p in some way; e.g., p may be the proportion of individuals who have brown hair, while the remaining 1-p have black, blond, red, etc. Thus, to estimate p in the population, a sample of n individuals could be taken from the population, and the sample proportion, p̂ , calculated for sampled individuals who have brown hair. Unfortunately, unless the full population is sampled, the estimate p̂ most likely won't equal the true value p , since p̂ suffers from sampling noise, i.e. it depends on the particular individuals that were sampled. However, sampling statistics can be used to calculate what are called confidence intervals, which are an indication of how close the estimate p̂ is to the true value p .

Statistics of a Random Sample

The uncertainty in a given random sample (namely, that the proportion estimate, p̂ , is expected to be a good, but not perfect, approximation of the true proportion p ) can be summarized by saying that the estimate p̂ is approximately normally distributed with mean p and variance p(1-p)/n . For an explanation of why the sample estimate is normally distributed, study the Central Limit Theorem . As defined below, confidence level, confidence intervals, and sample sizes are all calculated with respect to this sampling distribution. In short, the confidence interval gives an interval around p in which an estimate p̂ is "likely" to be. The confidence level gives just how "likely" this is; e.g., a 95% confidence level indicates that an estimate p̂ is expected to lie in the confidence interval for 95% of the random samples that could be taken. The confidence interval depends on the sample size, n (the variance of the sampling distribution is inversely proportional to n , meaning that the estimate gets closer to the true proportion as n increases). Thus, an acceptable error in the estimate, called the margin of error, ε , can be set, and the sample size required for the chosen confidence interval to be smaller than ε can be solved for; this calculation is known as "sample size calculation."
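
To make the dependence on n concrete, the short sketch below evaluates the standard deviation of the sampling distribution, sqrt(p(1-p)/n), for a few sample sizes. The function name and the chosen values of p and n are illustrative assumptions.

```python
from math import sqrt


def standard_error(p, n):
    """Standard deviation of the sample proportion for true proportion p."""
    return sqrt(p * (1 - p) / n)


# Quadrupling n halves the standard error, since the variance scales as 1/n.
for n in (100, 400, 1600):
    print(n, standard_error(0.5, n))
```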

Confidence Level

The confidence level is a measure of certainty regarding how accurately a sample reflects the population being studied within a chosen confidence interval. The most commonly used confidence levels are 90%, 95%, and 99%, which each have their own corresponding z-scores (which can be found using an equation or widely available tables like the one provided below) based on the chosen confidence level. Note that using z-scores assumes that the sampling distribution is normally distributed, as described above in "Statistics of a Random Sample." Given that an experiment or survey is repeated many times, the confidence level essentially indicates the percentage of the time that the resulting interval found from repeated tests will contain the true result.
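
The z-scores for the commonly used confidence levels can also be computed rather than looked up. The sketch below uses the inverse normal CDF from Python's standard library; the helper name `z_score` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided z-score: the central `confidence_level` mass of a standard normal."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}")
```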

Confidence Interval

In statistics, a confidence interval is an estimated range of likely values for a population parameter, for example, 40 ± 2 or 40 ± 5%. Taking the commonly used 95% confidence level as an example, if the same population were sampled multiple times, and interval estimates made on each occasion, in approximately 95% of the cases, the true population parameter would be contained within the interval. Note that the 95% probability refers to the reliability of the estimation procedure and not to a specific interval. Once an interval is calculated, it either contains or does not contain the population parameter of interest. Some factors that affect the width of a confidence interval include: size of the sample, confidence level, and variability within the sample.

There are different equations that can be used to calculate confidence intervals depending on factors such as whether the standard deviation is known or smaller samples (n<30) are involved, among others. The calculator provided on this page calculates the confidence interval for a proportion and uses the following equations:

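For an unlimited population:

$$\text{CI} = \hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

For a finite population of size N:

$$\text{CI} = \hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n} \cdot \frac{N-n}{N-1}}$$

where z is the z-score for the chosen confidence level, p̂ is the sample proportion, n is the sample size, and N is the population size.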

Within statistics, a population is a set of events or elements that have some relevance regarding a given question or experiment. It can refer to an existing group of objects, systems, or even a hypothetical group of objects. Most commonly, however, population is used to refer to a group of people, whether they are the number of employees in a company, number of people within a certain age group of some geographic area, or number of students in a university's library at any given time.

It is important to note that the equation needs to be adjusted when considering a finite population, as shown above. The (N-n)/(N-1) term in the finite population equation is referred to as the finite population correction factor, and is necessary because it cannot be assumed that all individuals in a sample are independent. For example, if the study population involves 10 people in a room with ages ranging from 1 to 100, and one of those chosen has an age of 100, the next person chosen is more likely to have a lower age. The finite population correction factor accounts for factors such as these. Refer below for an example of calculating a confidence interval with an unlimited population.

EX: Given that 120 people work at Company Q, 85 of which drink coffee daily, find the 99% confidence interval of the true proportion of people who drink coffee at Company Q on a daily basis.

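$$\hat{p} = \frac{85}{120} \approx 0.708, \qquad \text{CI} = 0.708 \pm 2.576 \sqrt{\frac{0.708\,(1-0.708)}{120}} \approx 0.708 \pm 0.107$$

Here 2.576 is the z-score for a 99% confidence level.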
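
The Company Q example can be checked numerically. The sketch below is a minimal implementation of the normal-approximation interval for a proportion; the function name and the optional `population` argument (which applies the (N-n)/(N-1) finite population correction) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence_level, population=None):
    """Return (p_hat, margin) for a normal-approximation confidence interval."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    variance = p_hat * (1 - p_hat) / n
    if population is not None:
        # Finite population correction factor (N - n) / (N - 1).
        variance *= (population - n) / (population - 1)
    return p_hat, z * sqrt(variance)


# 85 of 120 people drink coffee daily; 99% confidence, unlimited population.
p_hat, margin = proportion_ci(85, 120, 0.99)
print(f"{p_hat:.3f} ± {margin:.3f}")
```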

Sample Size Calculation

Sample size is a statistical concept that involves determining the number of observations or replicates (the repetition of an experimental condition used to estimate the variability of a phenomenon) that should be included in a statistical sample. It is an important aspect of any empirical study requiring that inferences be made about a population based on a sample. Essentially, sample sizes are used to represent parts of a population chosen for any given survey or experiment. To carry out this calculation, set the margin of error, ε , or the maximum distance desired for the sample estimate to deviate from the true value. To do this, use the confidence interval equation above, but set the term to the right of the ± sign equal to the margin of error, and solve for the resulting equation for sample size, n . The equation for calculating sample size is shown below.

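For an unlimited population:

$$n = \frac{z^2\,\hat{p}(1-\hat{p})}{\varepsilon^2}$$

For a finite population of size N, solving the interval with the (N-n)/(N-1) correction factor for the sample size gives:

$$n' = \frac{n}{1 + \frac{n-1}{N}}$$

where z is the z-score for the chosen confidence level, ε is the margin of error, p̂ is the assumed population proportion, and N is the population size.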

EX: Determine the sample size necessary to estimate the proportion of people shopping at a supermarket in the U.S. that identify as vegan with 95% confidence, and a margin of error of 5%. Assume a population proportion of 0.5, and unlimited population size. Remember that z for a 95% confidence level is 1.96. Refer to the table provided in the confidence level section for z scores of a range of confidence levels.

$$n = \frac{1.96^2 \times 0.5 \times (1-0.5)}{0.05^2} = 384.16$$

Thus, for the case above, a sample size of at least 385 people would be necessary. In the above example, some studies estimate that approximately 6% of the U.S. population identify as vegan, so rather than assuming 0.5 for p̂ , 0.06 would be used. If it was known that 40 out of 500 people that entered a particular supermarket on a given day were vegan, p̂ would then be 0.08.
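
The supermarket example can be reproduced with a few lines of code. This is a sketch under stated assumptions: the function name is hypothetical, the z-score is computed exactly rather than rounded to 1.96, and the optional finite-population adjustment is derived from the (N-n)/(N-1) correction factor, so it may differ slightly from what a particular calculator implements.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence_level=0.95, p=0.5,
                               population=None):
    """Smallest n whose margin of error does not exceed `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Adjustment obtained by solving the finite-population interval for n.
        n = n / (1 + (n - 1) / population)
    return ceil(n)


print(sample_size_for_proportion(0.05))          # p = 0.5, as in the example: 385
print(sample_size_for_proportion(0.05, p=0.06))  # using the 6% vegan estimate
print(sample_size_for_proportion(0.05, population=1000))
```

Assuming p = 0.5 maximizes p(1-p), so it gives the most conservative (largest) sample size when no prior estimate of the proportion is available.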

Sample size for a case-control study

Sample size requirements for case-control study designs

Affiliation.

  • 1 Division of Clinical Epidemiology, Royal Victoria Hospital, Montreal, Quebec, Canada.
  • PMID: 11747473
  • PMCID: PMC61037
  • DOI: 10.1186/1471-2288-1-11

Background: Published formulas for case-control designs provide sample sizes required to determine that a given disease-exposure odds ratio is significantly different from one, adjusting for a potential confounder and possible interaction.

Results: The formulas are extended from one control per case to F controls per case and adjusted for a potential multi-category confounder in unmatched or matched designs. Interactive FORTRAN programs are described which compute the formulas. The effect of potential disease-exposure-confounder interaction may be explored.

Conclusions: Software is now available for computing adjusted sample sizes for case-control designs.

Publication types

  • Research Support, Non-U.S. Gov't
  • Case-Control Studies*
  • Models, Statistical
  • Research Design / statistics & numerical data*
  • Sample Size
  • Software / statistics & numerical data

  • Gastroenterol Hepatol Bed Bench
  • v.6(1); Winter 2013

Sample size calculation in medical studies

Mohamad Amin Pourhoseingholi

1 Department of Biostatistics, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Mohsen Vahedi

2 Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran

Mitra Rahimzadeh

3 Alborz University of Medical Sciences, Karaj, Iran

Optimum sample size is an essential component of any research. The main purpose of the sample size calculation is to determine the number of samples needed to detect significant changes in clinical parameters, treatment effects or associations after data gathering. It is not uncommon for studies to be underpowered and thereby fail to detect the existing treatment effects due to inadequate sample size. In this paper, we explain briefly the basic principles of sample size calculations in medical studies.

Introduction

Sample size calculation, or sample size justification, is one of the first steps in designing a clinical study. The sample size is the number of patients or other investigated units that will be included in a study and that are required to answer the research hypothesis. The main purpose of the sample size calculation is to determine a sufficient number of units to detect the unknown clinical parameters, the treatment effects, or the associations after data gathering.

If the sample size is too small, the investigator may not be able to answer the study question. On the other hand, the number of patients in many studies is limited by practicalities such as cost, patient inconvenience, decisions not to proceed with an investigation, or a prolonged study time. Investigators should calculate the optimum sample size before data gathering, both to avoid the mistakes caused by too small a sample and to avoid wasting money and time on too large a sample. In addition, sample size calculations for research projects are an essential part of a study protocol for submission to ethics committees or to some peer-reviewed journals ( 1 ). It is very important to determine the sample size according to the study design and the objectives of the study. Mistakes in calculating the sample size can lead to incorrect or insignificant results ( 2 ). In this paper, we explain briefly the basic principles of sample size calculations in medical studies.

Assumptions for sample size calculation

Calculating the sample size requires several assumptions, including the variability of the outcome, the type I and type II errors, and the smallest effect of interest.

Outcome's variability

The variability of the outcome variable is its variance in the population, usually expressed as the standard deviation. Investigators can use an estimate obtained from a pilot study or the variation reported in previous studies.

The type I and type II errors

The type I error is the rejection of a true null hypothesis, and the type II error is the failure to reject a false null hypothesis. In other words, the type I error corresponds to the level of confidence used in the sample size calculation, that is, the degree of uncertainty or the probability that a sample value lies outside stated limits ( 2 ), and the type II error corresponds to power, which is the ability of a statistical test to reject a false null hypothesis (power = 1 - β, where β is the probability of a type II error). Power analysis can be used to calculate the minimum sample size with which the investigator can detect an effect of a given size.
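In practice, the two error rates enter most sample size formulas as standard normal quantiles. The following is a minimal sketch of that conversion, assuming a two-sided test by default; the function names `z_alpha` and `z_power` are illustrative and do not come from the paper.

```python
from statistics import NormalDist


def z_alpha(alpha: float, two_sided: bool = True) -> float:
    """Standard normal quantile for a given type I error rate (alpha)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def z_power(power: float) -> float:
    """Standard normal quantile for a given power (1 - beta)."""
    return NormalDist().inv_cdf(power)


print(round(z_alpha(0.05), 2))  # 1.96 for 95% confidence, two-sided
print(round(z_alpha(0.01), 2))  # 2.58 for 99% confidence, two-sided
print(round(z_power(0.80), 2))  # 0.84 for 80% power
```

Lowering alpha or raising power increases these quantiles, which is why stricter error rates call for larger samples.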

Effect size

The effect size is the minimal difference between the studied groups that the investigator wishes to detect or, in estimation, the difference between the estimate and the unknown parameter that the investigator is willing to tolerate. In other words, the investigator states that it does not matter if the sample estimate differs from the true population value by up to a certain amount. This amount is called the minimum effect size.

Sample size calculation in cross-sectional studies

In cross-sectional studies the aim is to estimate the prevalence of an unknown parameter (or parameters) in the target population using a random sample. An adequate sample size is therefore needed to estimate the population prevalence with good precision.

A simple formula is available for calculating this sample size, but selecting values for the assumptions it requires raises practical issues, and in some situations choosing appropriate values is not straightforward ( 3 ). The following formula can be used to calculate an adequate sample size for a prevalence study ( 4 ):

$$n = \frac{Z^{2} P (1 - P)}{d^{2}}$$

where n is the sample size, Z is the statistic corresponding to the level of confidence, P is the expected prevalence (which can be obtained from similar studies or from a pilot study conducted by the researchers), and d is the precision (corresponding to the effect size).

The level of confidence usually aimed for is 95%, and most researchers present their results with a 95% confidence interval (CI). Researchers who want to be more confident can choose a 99% confidence interval.

The researcher needs an assumed value of P to use in the formula. This can be taken from previous studies published in the study domain or estimated from a pilot study with a small sample. The assumed P matters because the precision (d) should be selected according to the size of P. There is little guidance on choosing an appropriate d. Some authors recommend a precision of 5% if the prevalence of the disease is expected to lie between 10% and 90%. However, when the assumed prevalence is small (below 10%), a precision of 5% seems inappropriate. For example, if the assumed prevalence is 1%, a precision of 5% is obviously crude and may lead to an inappropriate sample size ( 3 ). A conservative choice when P is small is a precision of one-fourth or one-fifth of the prevalence. Table 1 presents sample size calculations for three different values of P and three different precisions. For P = 0.05, the appropriate precision is 0.01, which results in 1825 samples. For P = 0.2, the best precision would be 0.04, and when P increases to 0.6 the precision could increase to 0.1 (or more), which yields 92 samples. Investigators should choose a precision appropriate to the assumed P; the wrong precision yields the wrong sample size (too small or too large).
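The prevalence formula is easy to check numerically. The sketch below assumes the formula n = Z²P(1 - P)/d² with Z taken from the standard normal distribution; the function name `prevalence_sample_size` is illustrative. Results are rounded to the nearest integer, which reproduces the figures quoted in the text; rounding up is a more cautious alternative.

```python
from statistics import NormalDist


def prevalence_sample_size(p: float, d: float, confidence: float = 0.95) -> float:
    """Unrounded sample size for estimating a prevalence p with precision d."""
    # Two-sided Z statistic for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / d ** 2


# Prevalence/precision pairs discussed in the text.
for p, d in [(0.05, 0.01), (0.20, 0.04), (0.60, 0.10)]:
    print(f"P = {p:.2f}, d = {d:.2f}: n = {round(prevalence_sample_size(p, d))}")
# P = 0.05, d = 0.01 gives 1825; P = 0.60, d = 0.10 gives 92

# A precision of 5% applied to a 1% prevalence gives a very small sample,
# which illustrates why d should be scaled to P.
print(round(prevalence_sample_size(0.01, 0.05)))
```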

Table 1. Sample size to estimate prevalence with different precisions at 95% confidence

Sample size calculation in case-control studies

The case-control study is a type of epidemiological observational study. It is often used to identify risk factors that may be associated with a disease by comparing the risk factors in subjects who have the disease (the cases) with those in subjects who do not (the controls).

The sample size calculation for unmatched case-control studies (the number of cases and controls) needs the following assumptions: the assumed number or proportion of cases and controls exposed to the risk factor, taken from similar studies or from a pilot study (researchers can also use the assumed odds ratio, OR), the level of confidence (usually 95%) and the proposed power of the study (usually at least 80%). Software and guide books provide investigators with the formulas, or with tables of sample sizes calculated under different assumptions ( 5 ). Researchers should remember, however, that a larger sample size is required in the presence of a significant confounding factor ( 6 ). Because confounding variables must be controlled for in the analysis, a more complex statistical model is needed, and so a larger sample is required to achieve significance.
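As an illustration of how these assumptions combine, the sketch below uses one common pooled-variance approximation for comparing exposure proportions between cases and controls. This is an assumption made for the example and is not necessarily the formula tabulated in the references cited here; it also makes no adjustment for confounding. The function names and the input values (20% exposure among controls, OR of 2) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def exposure_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Exposure proportion among cases implied by control exposure and the OR."""
    return odds_ratio * p_controls / (1 + p_controls * (odds_ratio - 1))


def case_control_sample_size(p_controls: float, odds_ratio: float,
                             alpha: float = 0.05, power: float = 0.80,
                             controls_per_case: float = 1.0) -> tuple[int, int]:
    """Return (cases, controls) for an unmatched design, two-sided test."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # level of confidence
    z_b = NormalDist().inv_cdf(power)          # power
    r = controls_per_case
    p_cases = exposure_in_cases(p_controls, odds_ratio)
    # Exposure proportion pooled over cases and controls.
    p_bar = (p_cases + r * p_controls) / (1 + r)
    n_cases = ((r + 1) / r) * p_bar * (1 - p_bar) * (z_a + z_b) ** 2
    n_cases = ceil(n_cases / (p_cases - p_controls) ** 2)
    return n_cases, ceil(r * n_cases)


cases, controls = case_control_sample_size(p_controls=0.20, odds_ratio=2.0)
print(cases, controls)
```

Dedicated software may report somewhat different numbers because several variants of this calculation exist, for example with a continuity correction.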

Sample size in clinical trials

In a clinical trial, if the sample size is too small, a well conducted study may fail to answer its research hypothesis or may fail to detect important effects and associations ( 6 ). The minimum information needed to calculate sample size for a randomized controlled trial includes the power, the level of significance, the underlying event rate in the population and the size of the treatment effect sought. Besides this, the calculated sample size should be adjusted for other factors including expected compliance rates and, less commonly, an unequal allocation ratio ( 7 ).
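The four inputs listed above map directly onto the standard normal-approximation formula for comparing two proportions. The sketch below assumes equal allocation and a two-sided test. The inflation step is a simple stand-in that divides by the fraction of participants expected to remain in the analysis; it is not the specific compliance adjustment described in reference ( 7 ). Function names and example rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def two_proportion_sample_size(p_control: float, p_treatment: float,
                               alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded sample size per group for comparing two event rates."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # level of significance
    z_b = NormalDist().inv_cdf(power)          # power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return (z_a + z_b) ** 2 * variance / (p_control - p_treatment) ** 2


def inflate_for_loss(n_per_group: float, expected_loss: float) -> float:
    """Simple inflation: assumes a fraction of participants contributes no data."""
    return n_per_group / (1 - expected_loss)


# Hypothetical trial: 30% event rate under control, 20% under treatment.
n = two_proportion_sample_size(p_control=0.30, p_treatment=0.20)
print(ceil(n), ceil(inflate_for_loss(n, expected_loss=0.10)))
```

Halving the treatment effect roughly quadruples the required sample, because the difference in event rates is squared in the denominator.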

There are some recommendations on sample size for the different phases of clinical trials. Phase I trials, which assess drug safety in human volunteers, might require a total of around 20-80 patients. Phase II trials, which investigate treatment effects, seldom require more than 100-200 patients ( 8 ).

Optimum sample size is an essential component of any research ( 9 ). It is not uncommon for studies to be underpowered and to fail to detect treatment effects because of inadequate sample size ( 10 ). Calculating an adequate sample size is an important part of any clinical study, and a professional statistician is the best person to ask for help when planning a research project ( 6 ). However, researchers must provide the necessary information so that the sample size can be determined according to correct assumptions ( 1 ). Many statistical books provide methods for sample size calculation in medical studies ( 5 ), and several software programs ( 11 ) and online tools are available to help with sample size calculations. While these programs are user-friendly, researchers should consult an experienced statistician at the design stage of their projects to avoid methodological errors.

( Please cite as: Pourhoseingholi MA, Vahedi M, Rahimzadeh M. Sample size calculation in medical studies. Gastroenterol Hepatol Bed Bench 2013;6(1):14-17).
