Simon’s two-stage designs are frequently used in phase II single-arm trials for efficacy studies. In safety studies, the concern is instead that too many patients experience an adverse event. We show that Simon’s two-stage designs for efficacy studies can be similarly used to design a two-stage safety study by modifying some of the design parameters. Given the type I and II error rates and the proportion of adverse events experienced in the first-stage cohort, we prescribe a procedure for deciding whether to terminate the trial or proceed to stage 2 by recruiting additional patients. We study the relationship between two-stage designs with a safety endpoint and those with an efficacy endpoint, and we use simulation studies to ascertain their properties. We provide a real-life application and a free R package, gen2stage, to facilitate direct use of two-stage designs in safety studies.
Phase II single-arm two-stage designs are typically used to determine whether a drug is sufficiently efficacious to move on to a randomized phase III trial. Sometimes, however, the goal is to examine whether the drug is safe with respect to a significant adverse event or toxicity. Such trials are often called toxicity or tolerability studies rather than safety studies, and they occur frequently. Results from a sample of subjects are compared with an adverse event rate from historical controls. If fewer patients than expected have a toxic response, the trial proceeds to stage 2 for further research; otherwise, the study terminates at stage 1.
Drug trials estimate the true success rate
These optimization problems are typically solved by a greedy search over the set of constrained positive integers (Kim and Wong, 2017). There are two optimality criteria for Simon’s two-stage designs: a minimax criterion that minimizes the maximum sample size and an optimal criterion that minimizes the expected sample size (Simon, 1989). Since Simon’s landmark paper, many variations of the strategy have been proposed for phase II designs. Green and Dahlberg (1992) investigated a two-stage design for multicenter trials when the attained sample size is not the planned one. Mander and Thompson (2010) and Mander
Several phase II single-arm two-stage designs are available for toxicity studies, but those designs monitor toxicity as a secondary or co-primary endpoint while determining whether a drug or treatment is efficacious; see Bryant and Day (1995), Conaway and Petroni (1996), Ray and Rai (2011), and references therein. Therefore, those designs are not directly applicable to safety studies where the primary endpoint is the proportion of adverse events or toxicity. The motivation for this work is that there are no phase II single-arm safety studies that utilize Simon’s two-stage designs. There are two possible reasons for this: (i) the literature alludes to this possibility, but the theoretical justification has not been worked out, and (ii) there is no software package to generate a two-stage design for a safety study. PASS 15 (NCSS, LLC), a widely used commercial software package for power and sample size calculation, has an option for Simon’s two-stage designs under the ‘Proportions: One Proportion: Group-Sequential: Two-Stage Phase II Clinical Trials’ category, but cannot handle safety cases where a null proportion
As an example of a safety study, Rugo
We now develop a theory to construct a two-stage safety trial following Simon’s original two-stage design. In particular, we provide analytical formulas parallel to those for an efficacy study and show how a two-stage safety study can be found directly from Simon’s two-stage design. We also compare single-stage and two-stage designs for safety studies through simulation studies and apply our results to construct a phase II two-stage design for a real application (Rugo
Throughout, let
A single-stage design is estimated using Fleming’s single-stage procedure with the exact binomial distribution (Fleming, 1982; A’Hern, 2001). The true rate (or proportion) is denoted by
Begin by recruiting
When
When
The single-stage design has two parameters
where
The optimal choice of
where
The two-stage design evaluates the trial endpoint at each stage and allows the trial to proceed to the second stage only if one rejects the null hypothesis at the end of the first stage. The required sample sizes for the first and second stages are denoted by
Step I: Begin by recruiting
Step II:
When
When
The two-stage designs for efficacy or safety have four parameters
where
where the subscripts
The sought two-stage design for evaluating toxicity is to find a design
The goodness of
where
We describe several relationships between the designs for efficacy,
We now discuss several useful relationships between decision rules to test toxicity and efficacy rates with justifications.
By Theorem 1, finding the set of feasible solutions Θ̂ that satisfies
Corollary 1 implies that finding the optimal single-stage design for
Here is an example for a single-stage design for testing toxicity.
Consider a standard therapy (historical control) whose incidence rate of adverse events is 0.5 (i.e.,
Using Theorem 1, calculations of the PET and the expected sample size for toxicity studies are similar to those for efficacy studies. It can be shown that for a two-stage design with
Since
However, by Theorem 2, finding the set of feasible solutions Θ̂ that satisfies
When
It follows from Theorem 2 and Corollary 2. The two-stage design for testing toxicity when
Here is an example of a two-stage design for testing toxicity.
Consider the same design parameter values in Example 1 for the single-stage design, (
We use simulation studies to show optimal single- and two-stage designs for toxicity studies using various design parameters where
Table 1 shows single-stage and two-stage designs when
Table 2 displays the simulated single-stage and two-stage designs when
We revisit the stomatitis study conducted by Rugo
Suppose
The primary objective of a phase II single-arm study is often to assess safety and/or tolerability of a certain drug or treatment by the incidence of adverse events or toxicity. A phase II single-arm safety study aims to show that the rate of an adverse event is lower in the experimental therapy than that in the historical control (i.e.,
Two-stage designs are widely used in phase II single-arm efficacy studies because of the flexibility to stop early due to futility and avoid the unnecessary exposure of patients to ineffective therapies. However, there appear to be no phase II single-arm safety studies that employ two-stage designs. This means that current safety studies have no opportunity to stop a trial early due to futility. One possible explanation may be a lack of theoretical justification and dedicated software. Our work shows that the traditional Simon’s two-stage designs to evaluate efficacy in a single-arm trial can also be used for a one- or two-stage safety trial with
Jung
It is noteworthy that the Bayesian optimal phase II (BOP2) design (Zhou
To facilitate practitioners to use our single-stage and two-stage designs for a phase II single-arm study, including Jung
The Biostatistics Core is supported, in part, by NIH Center Grant P30 CA022453 to the Karmanos Cancer Institute at Wayne State University. WKK was partially supported by NIH Grant R01GM107639. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.
Single-stage and two-stage (optimal and minimax) designs with
| Null rate | Alternative rate | Design | Stage 1 critical value | Stage 1 sample size | Overall critical value | Total sample size | Expected sample size | PET | Type I error | Type II error |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.1 | Single | | | 5 | 28 | | | 0.047 | 0.142 |
| | | Optimal | 2 | 6 | 5 | 27 | 14.82 | 0.58 | 0.049 | 0.196 |
| | | Minimax | 4 | 23 | 5 | 26 | 23.16 | 0.95 | 0.045 | 0.199 |
| 0.4 | 0.2 | Single | | | 10 | 36 | | | 0.045 | 0.168 |
| | | Optimal | 4 | 11 | 13 | 43 | 20.48 | 0.70 | 0.049 | 0.198 |
| | | Minimax | 5 | 13 | 10 | 35 | 20.77 | 0.65 | 0.050 | 0.192 |
| 0.5 | 0.3 | Single | | | 14 | 37 | | | 0.049 | 0.193 |
| | | Optimal | 7 | 15 | 17 | 43 | 23.50 | 0.70 | 0.050 | 0.196 |
| | | Minimax | 11 | 23 | 14 | 37 | 27.74 | 0.66 | 0.048 | 0.199 |
| 0.6 | 0.4 | Single | | | 20 | 42 | | | 0.038 | 0.197 |
| | | Optimal | 9 | 16 | 23 | 46 | 24.52 | 0.72 | 0.049 | 0.199 |
| | | Minimax | 17 | 34 | 19 | 39 | 34.44 | 0.91 | 0.049 | 0.198 |
| 0.7 | 0.5 | Single | | | 23 | 39 | | | 0.050 | 0.168 |
| | | Optimal | 10 | 15 | 28 | 46 | 23.63 | 0.72 | 0.050 | 0.197 |
| | | Minimax | 13 | 19 | 23 | 39 | 25.69 | 0.67 | 0.046 | 0.196 |
| 0.8 | 0.6 | Single | | | 24 | 35 | | | 0.034 | 0.195 |
| | | Optimal | 10 | 13 | 31 | 43 | 20.58 | 0.75 | 0.050 | 0.200 |
| | | Minimax | 14 | 18 | 23 | 33 | 22.25 | 0.72 | 0.046 | 0.199 |
| 0.9 | 0.7 | Single | | | 20 | 25 | | | 0.033 | 0.194 |
| | | Optimal | 9 | 10 | 24 | 29 | 15.01 | 0.74 | 0.047 | 0.195 |
| | | Minimax | 14 | 15 | 20 | 25 | 19.51 | 0.55 | 0.033 | 0.198 |
PET = probability of early termination.
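The tabulated operating characteristics of a two-stage design can be checked with a short exact-binomial calculation. The sketch below is illustrative only and is not the gen2stage implementation. It assumes a decision rule inferred from the table values rather than quoted from the text: the trial stops at stage 1 when the number of adverse events among the first-stage patients reaches the stage 1 critical value, and the therapy is declared acceptably safe only if the total number of adverse events stays below the overall critical value. The function and variable names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 when k < 0."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))


def two_stage_toxicity(r1, n1, r, n, p):
    """Operating characteristics at a true adverse event rate p.

    Assumed rule (inferred, not taken from the paper's text):
      stage 1: stop if X1 >= r1, otherwise recruit n - n1 more patients;
      final:   declare the therapy safe if X1 + X2 < r.
    Returns (PET, expected sample size, P(declare safe)).
    """
    n2 = n - n1
    pet = 1.0 - binom_cdf(r1 - 1, n1, p)
    expected_n = n1 + (1.0 - pet) * n2
    p_safe = sum(
        binom_pmf(x1, n1, p) * binom_cdf(r - 1 - x1, n2, p)
        for x1 in range(min(r1, n1 + 1))  # x1 = 0, ..., r1 - 1 continue to stage 2
    )
    return pet, expected_n, p_safe


# First "Optimal" row of the table above: rates 0.3 and 0.1, design (2, 6, 5, 27).
pet, en, alpha = two_stage_toxicity(2, 6, 5, 27, 0.3)  # evaluated at the rate 0.3
_, _, power = two_stage_toxicity(2, 6, 5, 27, 0.1)     # evaluated at the rate 0.1
print(round(pet, 2), round(en, 2), round(alpha, 3), round(1 - power, 3))
# prints: 0.58 14.82 0.049 0.196
```

Under this assumed rule the four numbers agree with the tabulated expected sample size, PET, and the two error rates for that row, which suggests the reading of the columns is consistent with the table.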
Single-stage and two-stage (optimal and minimax) designs with
| Null rate | Alternative rate | Design | Stage 1 critical value | Stage 1 sample size | Overall critical value | Total sample size | Expected sample size | PET | Type I error | Type II error |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.1 | Single | | | 4 | 21 | | | 0.086 | 0.152 |
| | | Optimal | 2 | 6 | 4 | 20 | 11.88 | 0.58 | 0.090 | 0.194 |
| | | Minimax | 3 | 15 | 4 | 19 | 15.51 | 0.87 | 0.092 | 0.199 |
| 0.4 | 0.2 | Single | | | 7 | 24 | | | 0.096 | 0.189 |
| | | Optimal | 4 | 11 | 10 | 31 | 16.93 | 0.70 | 0.100 | 0.192 |
| | | Minimax | 5 | 11 | 7 | 24 | 17.93 | 0.47 | 0.093 | 0.199 |
| 0.5 | 0.3 | Single | | | 11 | 28 | | | 0.092 | 0.191 |
| | | Optimal | 6 | 12 | 13 | 32 | 19.74 | 0.61 | 0.090 | 0.195 |
| | | Minimax | 8 | 15 | 11 | 28 | 21.50 | 0.50 | 0.090 | 0.199 |
| 0.6 | 0.4 | Single | | | 15 | 30 | | | 0.097 | 0.175 |
| | | Optimal | 7 | 12 | 20 | 38 | 20.70 | 0.67 | 0.098 | 0.195 |
| | | Minimax | 10 | 16 | 14 | 28 | 21.67 | 0.53 | 0.099 | 0.197 |
| 0.7 | 0.5 | Single | | | 18 | 30 | | | 0.084 | 0.181 |
| | | Optimal | 10 | 15 | 20 | 32 | 19.73 | 0.72 | 0.100 | 0.196 |
| | | Minimax | 9 | 12 | 17 | 28 | 20.12 | 0.49 | 0.095 | 0.198 |
| 0.8 | 0.6 | Single | | | 17 | 24 | | | 0.089 | 0.192 |
| | | Optimal | 10 | 12 | 18 | 25 | 17.74 | 0.56 | 0.099 | 0.185 |
| | | Minimax | 12 | 14 | 17 | 24 | 19.52 | 0.45 | 0.087 | 0.198 |
| 0.9 | 0.7 | Single | | | 15 | 18 | | | 0.098 | 0.165 |
| | | Optimal | 7 | 7 | 15 | 18 | 12.74 | 0.48 | 0.089 | 0.200 |
| | | Minimax | 7 | 7 | 15 | 18 | 12.74 | 0.48 | 0.089 | 0.200 |
PET = probability of early termination.
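The single-stage rows can be checked in the same way. The following sketch is an illustration under an assumed rule, not the authors' code: with n patients and critical value r, the therapy is declared acceptably safe when fewer than r adverse events are observed. The type I error is then the probability of that event at the higher (historical) rate, and the type II error is the probability of r or more events at the lower (target) rate. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 when k < 0."""
    if k < 0:
        return 0.0
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1)
    )


def single_stage_toxicity_errors(r, n, p0, p1):
    """Exact error rates of a single-stage toxicity design.

    Assumed rule (inferred from the table, not quoted from the text):
    declare the therapy safe if the number of adverse events X < r.
      type I error  = P(X <= r - 1 | p0), with p0 the historical rate
      type II error = P(X >= r     | p1), with p1 the lower target rate
    """
    alpha = binom_cdf(r - 1, n, p0)
    beta = 1.0 - binom_cdf(r - 1, n, p1)
    return alpha, beta


# First "Single" row of the table above: rates 0.3 and 0.1, r = 4, n = 21.
alpha, beta = single_stage_toxicity_errors(4, 21, 0.3, 0.1)
print(round(alpha, 3), round(beta, 3))
# prints: 0.086 0.152
```

Both values agree with the tabulated error rates for that row.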