
Using Sample Size Software for Teaching Study Design Concepts

Resource Review Post by Jim Dignam, PhD, University of Chicago.


For those teaching statistical methods with a health science focus, often to an audience with a moderate technical background, illustrative tools can be highly effective in communicating complex topics. For example, a course addressing clinical research will invariably cover key study design elements, including sample size, statistical power, error control, and effect size specification. The interplay among these elements is critical, and it usually involves ‘extra-statistical’ considerations that bear on the complexities of studies involving volunteer human participants.


Sample size calculations for clinical studies comparing means or simple proportions may essentially follow elementary principles. However, the calculations quickly become more complicated when, for example, one wants to build in the opportunity to stop the study early for harm, for lack of any benefit, or for extraordinary benefit. The methods supporting these calculations are correspondingly more complex, but many software implementations are available to carry out the work. In clinical trial design, one often uses computer programs iteratively to evaluate how different choices play out and arrive at a feasible study. When teaching the basics of clinical trial design and covering these critical concepts, software similarly allows hands-on demonstration of what is otherwise somewhat obscure in the formulas. One such freely available tool resides on the website of Cancer Research and Biostatistics (CRAB, a contract research organization that mostly provides support for publicly funded research, https://www.crab.org/) under the Statistical Tools tab.
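
For the elementary case, a minimal Python sketch of the familiar normal-approximation sample size formula for comparing two proportions is shown below. It is meant only as a point of reference for what such calculators automate, not as the CRAB implementation, and the example rates (20% versus 35%) are arbitrary illustrative choices.

```python
# A minimal sketch (not the CRAB implementation): per-arm sample size for a
# two-sided comparison of two proportions via the usual normal approximation.
import math
from scipy.stats import norm

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Per-arm n for testing H0: p1 == p2 (pooled-variance approximation)."""
    z_a = norm.ppf(1 - alpha / 2)          # two-sided type I error
    z_b = norm.ppf(power)                  # corresponds to type II error beta = 1 - power
    p_bar = (p1 + p2) / 2
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Illustrative example: detect an improvement in response rate from 20% to 35%
print(n_per_arm(0.20, 0.35))   # about 138 patients per arm
```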


The Statistical Tools page provides calculations for a variety of study types, ranging from simple binomial and normally distributed outcome measures to multi-stage designs and survival (time to event) studies. All methods are menu-driven and provide succinct summaries.


As an illustration of how the website might be used, consider a two-arm trial with a survival time endpoint. Suppose we are asked to determine the number of patients needed to reliably show that a new treatment can improve 5-year survival from 60% to 75%. Here, the most common metric is a failure rate ratio, and the sample size for the test is governed by the number of events required for a suitably accurate estimate of this quantity. However, this is only part of the story for the actual study sample size. The number of study participants will depend not only on these failure rates but also on the pace at which individuals can be enrolled and come under observation, the duration of follow-up that can be supported to complete the study (i.e., to observe the needed events), and what happens when these projections turn out not to be accurate.
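
To make the first piece concrete, here is a rough hand calculation of the required number of events, converting the 60% versus 75% five-year survival figures to a failure rate (hazard) ratio under an exponential assumption and applying Schoenfeld's approximation for a two-sided log-rank test with equal allocation. This is a sketch for orientation only; the method and conventions used by the CRAB calculator may differ in detail.

```python
# A rough sketch (Schoenfeld's approximation), not the CRAB tool's exact method.
import math
from scipy.stats import norm

def required_events(surv_control, surv_experimental, alpha=0.05, power=0.80):
    """Events needed for a two-sided log-rank test with 1:1 allocation and
    proportional hazards (exponential model used to convert survival rates)."""
    # Under an exponential model, the hazard ratio is the ratio of log survival probabilities
    hr = math.log(surv_experimental) / math.log(surv_control)
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 4 * z ** 2 / math.log(hr) ** 2, hr

events, hr = required_events(0.60, 0.75)
print(f"hazard ratio ~ {hr:.2f}, events needed ~ {math.ceil(events)}")
# hazard ratio ~ 0.56, events needed ~ 96
```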


So, there are a lot of ‘moving parts’ in the actual calculation of the number needed, and some tradeoffs that may not be easy to see except by examining different scenarios. For example, if accrual can occur more rapidly, then for a fixed follow-up time, fewer patients will be needed. On the other hand, if accrual is slower and the maximum follow-up time is fixed, a loss of power will result (not enough events will be observed), but power can be restored if longer follow-up can be afforded. The CRAB website allows varying parameters such as the accrual rate, accrual period, and follow-up period to evaluate which combinations still result in an adequately powered study. There is no simple formula that can illustrate all these parameter combinations, so a bit of experimentation is necessary to sketch out the possibilities.
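
To show how those pieces interact, the sketch below takes the event target from the previous calculation and converts it into a total sample size under a few accrual and follow-up scenarios, assuming exponential survival in each arm, uniform accrual over the accrual period, and a final analysis after the additional follow-up period. The specific scenarios and the exponential assumption are illustrative choices, not output from the CRAB tool.

```python
# Illustrative sketch: how accrual and follow-up choices translate a fixed event
# target into a total sample size (exponential survival, uniform accrual assumed).
import math

S_CONTROL, S_EXPERIMENTAL = 0.60, 0.75   # 5-year survival under control / experimental
EVENTS_NEEDED = 96                       # event target from the calculation above

def event_probability(surv_5yr, accrual_years, followup_years):
    """P(a subject's event is observed by the analysis time), for an exponential
    hazard and an entry time uniform over the accrual period."""
    lam = -math.log(surv_5yr) / 5.0
    a, f = accrual_years, followup_years
    # Average of 1 - exp(-lam * time_on_study) over entry times uniform on [0, a]
    return 1.0 - (math.exp(-lam * f) - math.exp(-lam * (a + f))) / (lam * a)

for accrual_years, followup_years in [(3, 2), (3, 4), (5, 2), (5, 4)]:
    # Average event probability across the two arms (1:1 allocation)
    p_event = 0.5 * (event_probability(S_CONTROL, accrual_years, followup_years)
                     + event_probability(S_EXPERIMENTAL, accrual_years, followup_years))
    n_total = math.ceil(EVENTS_NEEDED / p_event)
    print(f"accrue over {accrual_years} y, follow {followup_years} y more: "
          f"~{n_total} patients ({math.ceil(n_total / accrual_years)} per year)")
```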


A similar exercise can be done to examine the properties of multi-stage binomial designs. One can begin with the simple one-arm binomial as a benchmark sample size. Then, the commonly used two-stage design can be explored, whereby one enrolls a fixed number of patients, evaluates the outcome, and stops the study if a minimum bar of responses is not met. Otherwise, an additional second-stage cohort is enrolled, and the estimate of response is ultimately based on all patients (with appropriate multiplicity control as part of the algorithm). Again, there are choices about how aggressively to curtail the study in the first stage, and whether one might also consider early efficacy stopping, and these can be explored via runs of the program.
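
As one way to see what such a design buys, the sketch below computes the operating characteristics (type I error, power, early-stopping probability, and expected sample size) of a candidate two-stage design for distinguishing an uninteresting 20% response rate from a promising 40% rate. The particular cutoffs shown are illustrative choices on my part; the design the CRAB program would recommend for the same specification may differ.

```python
# A minimal sketch of two-stage binomial operating characteristics for a candidate
# design: stop after stage 1 if <= r1 responses in n1 patients; otherwise enroll
# to n total and declare the treatment promising if > r responses overall.
from scipy.stats import binom

def two_stage_properties(r1, n1, r, n, p):
    """Return (P(declare promising), P(early stop), E[sample size]) at true rate p."""
    pet = binom.cdf(r1, n1, p)                               # early stopping probability
    promising = sum(binom.pmf(x1, n1, p) * binom.sf(r - x1, n - n1, p)
                    for x1 in range(r1 + 1, n1 + 1))         # pass stage 1, then exceed r overall
    expected_n = n1 + (1 - pet) * (n - n1)
    return promising, pet, expected_n

# Candidate design (illustrative) for a 20% (uninteresting) vs 40% (promising) response rate
r1, n1, r, n = 3, 13, 12, 43
p0, p1 = 0.20, 0.40

alpha, pet0, en0 = two_stage_properties(r1, n1, r, n, p0)
power, _, _ = two_stage_properties(r1, n1, r, n, p1)
print(f"type I error {alpha:.3f}, power {power:.3f}, "
      f"P(early stop | p0) {pet0:.2f}, expected N under p0 {en0:.1f}")
```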


There are certainly more elegant packages (some commercial) that can perform these calculations, but the CRAB website provides a universally accessible and reliable means for students to explore clinical trial design. Of course, you may also find it a welcome addition to your work in study design and related statistical calculations!
