Six primary English-language research studies were identified that reported spaceflight drug stability data. Of these studies, two have published results. Du et al. (2014) performed the only study with controls, while the five other studies used opportunistic spaceflight samples and manufacturer-matched drugs from different manufacturing lots as comparators (see Table 3). Among these five studies, only one is in published format.11 The four remaining investigations are non-peer-reviewed NASA reports,12–14 one of which reanalyzed three medications that were the same samples initially tested and reported by Du et al. three years earlier.15
A discussion of the Du et al. study design is relevant to contextualize the analysis that follows. Du et al. evaluated eight medication kits, each consisting of thirty-three drug products containing thirty-nine APIs. Of these thirty-nine APIs, thirty-six (36) were assayed for API content (API content for two drug products was not reported). Two APIs were in more than one formulation (ciprofloxacin [3] and promethazine [3]), and three products were combination products containing two separate APIs (a fourth combination drug product containing two APIs, noted above, did not have assay results). Half the medication kits (4) were stored aboard the ISS, and the remaining kits were stored terrestrially in an environmentally controlled chamber at the Johnson Space Center (JSC). Across all of the medication kits, each drug product was from a single manufacturing lot (i.e., one lot per drug product); hence, all spaceflight drug samples were matched to control samples from the same manufacturing lot across all kits and time points. Terrestrial control samples were stored under temperature and humidity conditions comparable to the flight samples. Among the solid dosage forms, twenty-two of the twenty-four drugs were removed from the original manufacturer's container and repackaged in rigid polypropylene containers. These medication containers are not considered protective since atmospheric factors can permeate plastic containers at defined rates,16 and there is no record that packaging was sealed or met USP standards for vapor transmission. For all drugs, analytical chemistry testing measured only the amount of API in each formulation at each time point; no evaluation of degradation products was performed, and degradation mass balance was not determined.
Du et al. reported that mean API content in 25 out of 36 medications in the spaceflight group fell below USP standards for labeled API strength (i.e., "failed") after 880 days of storage, compared to 17 out of 36 in the control group at the same time point. Across time points, the number of formulations that failed to meet specifications for API content increased with longer spaceflight exposure. These results are the basis for the conclusion by the study’s authors that "…a number of formulations tested had a lower potency or percent content of API after storage in space with a consistently higher number of formulations failing USP potency requirement after each storage period interval in space than on Earth".8 This conclusion, based on dichotomization of quantitative API content results (i.e., pass/fail outcomes), has been repeatedly cited as evidence that latent factors associated with spaceflight increase the risk of drug failure. However, quantitative analysis of API content can provide more insight into the effect of spaceflight on drug stability. At the 13-day time point, API content in spaceflight samples is within ±5% of control content for most drugs (34 of 36) (Fig. 1). At 880 days of spaceflight storage, 39% of flight-exposed drugs (14 of 36) remain within ±5% of control potency, and no drug has a loss of API exceeding 10% of control amounts. Taken together, these results support the conclusion that the effect of spaceflight exposure on drug stability is relatively small (Table S2).
Intuitively, if spaceflight increases the rate of API degradation (i.e., accelerates chemical reaction rates of APIs), then the difference between terrestrial and spaceflight samples should be negligible at the earliest 13-day time point compared to the difference after 880 days of storage.17 However, at the 13-day time point a surprising 50% of all drugs in the flight group (18 of 36) have significant (p < 0.05) decreases in API content compared to matched controls (Table S1). The mean difference between the flight and terrestrial samples at Day 13 is 1.18 ± 2.5%, which equates to a rate of API loss of 0.09%/day. By comparison, the mean deviation from controls at the 880-day time point was −4.76 ± 3.01%, corresponding to a total mean change in potency of only 3.6% (4.76% − 1.18% = 3.6%) over a storage duration of 880 days, more than 60 times longer than the initial 13-day time point. In this regard, the amount of degradation occurring from Day 14 to Day 880 is approximately 3 times the amount of degradation observed from Day 0 to Day 13, equating to a relative risk associated with spaceflight storage (the ratio of the per-day loss after Day 13 to the per-day loss through Day 13) of 0.045, which is much less than unity. This suggests that a substantial portion of the observed drug degradation is contributed by factors present prior to the earliest time point (Day 13), even though the duration of time after Day 13 accounts for 98.5% of the total period of spaceflight exposure. Hence, the loss of API reported for spaceflight drugs is at least partially attributable to factors other than an increased chemical reaction rate associated with prolonged spaceflight exposure.
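The arithmetic behind these figures can be retraced in a few lines. The sketch below is illustrative only: it uses the mean flight-versus-control deficits quoted above (1.18% at Day 13 and 4.76% at Day 880), assumes the later interval is 880 − 13 = 867 days, and treats the quoted value of 0.045 as the ratio of the later per-day loss to the early per-day loss. The variable names are introduced here and do not come from the study.

```python
# Mean API deficit of flight samples relative to controls (percent), as quoted in the text.
deficit_day13 = 1.18
deficit_day880 = 4.76

early_days = 13
late_days = 880 - early_days  # assumption: the later interval is taken as 867 days

# Apparent per-day loss over each interval.
early_rate = deficit_day13 / early_days        # about 0.09 %/day
late_loss = deficit_day880 - deficit_day13     # about 3.6 %
late_rate = late_loss / late_days

# Later loss relative to early loss, in total and per day.
loss_ratio = late_loss / deficit_day13         # about 3
rate_ratio = late_rate / early_rate            # about 0.045

print(f"early rate : {early_rate:.3f} %/day")
print(f"late loss  : {late_loss:.2f} %")
print(f"loss ratio : {loss_ratio:.2f}")
print(f"rate ratio : {rate_ratio:.3f}")
```

Under these assumptions, the per-day loss after Day 13 is a small fraction of the per-day loss implied by the Day 13 deficit, which is the point the paragraph makes.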
Focusing on changes in API content occurring between study days 13 and 880 shows that spaceflight and terrestrial drug potency are highly correlated (r = 0.894), with nearly a 1:1 correspondence (Fig. 2). Linear regression of these paired terrestrial and spaceflight potencies yields a slope coefficient of 1.012, which is virtually equivalent to unity and offset only by the y-intercept of -4.64%. On a per-drug basis, the Day 13 and 353 samples are slightly greater than unity, while the Day 596 and 880 samples are slightly less than unity. This indicates that the loss of API over the course of the experiment, aside from the location of the y-intercept, is very similar for control and spaceflight samples overall.
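As a worked reading of this fit, and assuming that spaceflight potency was regressed on control potency (the direction of the regression is not stated explicitly in the text), the fitted line and one example evaluation can be written as follows. The symbols are introduced here for illustration, with P denoting API content in percent.

```latex
\[
P_{\text{flight}} = 1.012\,P_{\text{control}} - 4.64
\]
\[
P_{\text{control}} = 100 \;\Rightarrow\; P_{\text{flight}} = 1.012 \times 100 - 4.64 = 96.56
\]
```

On this reading, a slope near unity means the flight and control values move together, and the intercept accounts for most of the gap between them.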
Estimates of chemical degradation rates are crucial to enabling a mechanistic understanding of how API degradation is influenced by environmental conditions and to providing predictive insight for estimating drug strength over time. The FDA and the European Medicines Agency (EMA) assume that drug degradation rates are typically represented by linear kinetics, most commonly first-order.17, 18 Regression models can test the null hypothesis of equality of slope or intercept relative to a control sample. For stability testing of pharmaceuticals, the lower 95% confidence interval of a regression model is used to describe degradation rate as a function of time, and is used to predict the retest period and shelf life.18–20 For each drug in this study, rates of degradation are visualized as a series of scatter plots upon which fitted first-order curves (natural log of the response variable) for control and flight samples are superimposed (Fig. 3). From these plots, it can be observed that for many of the APIs, the control and spaceflight degradation curves are close to parallel, with the two curves primarily offset by variability in the location of the y-intercept (i.e., the API strength at time zero). These plots illustrate that, for most of the tested drugs, spaceflight contributes minimally to the degradation, as summarized numerically in Table S3, which also provides extrapolated estimates of API half-life under control and flight conditions, as well as an estimation of API remaining at three years.
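A first-order fit of this kind amounts to an ordinary least-squares line through the natural log of API content versus time. The following is a minimal sketch under stated assumptions, not the authors' analysis code: the function name `fit_first_order` is hypothetical, the data are synthetic values generated from an exact first-order decay, and only the sampling days (13, 353, 596, 880) are taken from the text.

```python
import math

def fit_first_order(days, percent_api):
    """Least-squares fit of ln(content) = ln(C0) - k * t; returns (C0, k)."""
    log_content = [math.log(p) for p in percent_api]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(log_content) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, log_content))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope

# Synthetic, noise-free data (NOT study data) generated from a known decay.
days = [13, 353, 596, 880]
true_c0, true_k = 100.0, 1.0e-4  # hypothetical starting content (%) and rate (1/day)
content = [true_c0 * math.exp(-true_k * t) for t in days]

c0, k = fit_first_order(days, content)
half_life_days = math.log(2) / k

print(f"C0 = {c0:.2f} %, k = {k:.2e} per day, half-life = {half_life_days:.0f} days")
```

Fitting control and flight samples separately with such a function yields one intercept and one rate constant per group, which is what allows the slopes (rates) and intercepts (time-zero strength) to be compared separately.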
Inspection of these rate relationships shows that spaceflight is generally associated with a small increase in the rate of API loss. Rate ratios greater than 1.0 for spaceflight samples relative to matching terrestrial samples (Table S3) indicate that the spaceflight rate of API loss exceeds that of the matching control samples. For thirty (30) out of thirty-six (36) drugs, rate ratios were less than 2-fold, ranging from 0.69 to 1.97. Only 2 of 36 APIs exhibit spaceflight degradation rates exceeding 3 times the terrestrial control rate: ibuprofen tablets (7.02-fold) and lyophilized imipenem for injection (9.43-fold). Clavulanate was the only drug that was more stable under spaceflight conditions than under terrestrial conditions (rate ratio = 0.69).
Half-life is an intuitive metric for evaluating concentration-dependent loss of API over time. Calculated half-life estimates (Table S3) show that most of the spaceflight APIs (24 of 36) have half-lives exceeding a decade, with the remaining 12 drugs having half-lives shorter than ten years (controls are 31 and 5, respectively). Extrapolated, the rate of API content loss suggests that, under repackaging and storage conditions analogous to those currently used by NASA operationally, the mean potency remaining in either control or spaceflight drugs at the end of a three-year exploration space mission falls below 90% of the label strength. Although not ideal, most of the tested drugs would have adequate API content remaining to achieve therapeutic efficacy with increased dosages. It is noted that neither current nor previous drug repackaging practices are considered protective for vapor or light transmission as defined by USP.21
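The link between half-life and content remaining after a mission can be sketched with two small helpers. This is an illustration under first-order kinetics only: the function names are hypothetical, the results are fractions of the initial content (not of label strength), and any offset in content at time zero is ignored.

```python
import math

def fraction_remaining(half_life_months, months):
    """Fraction of initial API content left after `months`, assuming first-order loss."""
    return 0.5 ** (months / half_life_months)

def half_life_needed(months, fraction):
    """Half-life (months) at which exactly `fraction` of initial content remains after `months`."""
    return months * math.log(2) / math.log(1 / fraction)

mission_months = 36

# A drug with a ten-year (120-month) half-life over a three-year mission.
ten_year_case = fraction_remaining(120, mission_months)

# Half-life required to retain 90% of initial content for the whole mission.
required = half_life_needed(mission_months, 0.90)

print(f"remaining with 10-year half-life: {ten_year_case:.3f}")
print(f"half-life needed for 90% at 36 months: {required:.0f} months")
```

The second helper shows why even a half-life of a decade is not enough to keep content above 90% of its starting value over three years under this simple model.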
Each paired set of flight and control drugs (i.e., within-drug comparisons) is independent of all other paired sets of medications (i.e., between-drug comparisons) and can therefore be collectively analyzed to estimate an overall effect of spaceflight on drug stability. Visual inspection of the individual fitted regression plots (Fig. 3) collectively suggests that slope and intercept variability across APIs contributes to differences in API levels observed between control and spaceflight samples. The evaluation of terrestrial and spaceflight treatments is analogous to determining whether results from independently performed stability tests can be combined under FDA shelf-life stability testing guidance.17, 22 Linear mixed-effect regression models have been used as one approach for such drug stability evaluations.23, 24 However, a fundamental assumption of mixed-effect regression models is that slopes and intercepts for each entity (i.e., drug product) are random and normally distributed. Since the drugs tested by Du et al. were arbitrarily selected for testing based, in part, on heuristic operational considerations, the normality of random slopes should not be assumed. Generalized estimating equation (GEE) models are an alternative approach that does not assume anything about random effects but does account for cluster correlation for each drug over time. The use of an exchangeable correlation structure allows for a single correlation parameter for all pairwise responses within an API. Thus, the model provides a population-level estimate of longitudinal drug potency accounting for clustered correlation. Here, we assume that different APIs, and potentially different drug formulations containing the same API (e.g., tablet, injectable), have different susceptibilities for degradation over time. Hence the postulated GEE model includes a variable for storage time (in units of months of storage), a factor for the treatment group, and a clustering variable (drug API).
In addition, an interaction term is included in the model to account for the combined effects of storage time with the treatment group (flight vs. control). This interaction is mechanistically justified since storage time cumulatively increases exposure of an unprotected API to environmental factors (e.g., humidity, CO2, ionizing radiation), or combinations of factors, that individually or synergistically contribute to API degradation. This contribution was evaluated using interaction plots and model selection prior to including this effect as a GEE model parameter (Figure S2). The GEE model results show that time, and the interaction of time and storage conditions, are the most significant coefficients in the model, with the effect of spaceflight itself being a less significant contributor to degradation. The first-order degradation rate for APIs under terrestrial conditions is -0.00317/month (t½ = 219 months). These findings compare to a degradation rate of -0.00478/month (t½ = 145 months) for spaceflight samples, which equates to a ~1.51-fold (51%) increased rate, or an additional rate of -0.0016/month over the baseline rate. Figure 4 compares the overall first-order degradation of all drugs stored terrestrially to similarly maintained control samples with pertinent GEE model coefficients provided in Table 1, and the marginal supporting effects are provided in Table S4. Converted to an arithmetic scale, this equates to an additional ~0.2% loss of API content per month relative to the terrestrial baseline when averaged over the total duration of the experiment. The cluster correlation is estimated to be 0.651 ± 0.0703, indicating substantive temporal concordance within a cluster (i.e., APIs), which strongly supports the GEE approach for modeling cluster correlation.
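The model structure described above can be written compactly as a marginal mean model with an exchangeable working correlation. The notation below is introduced here for illustration and is an assumption about the form of the model, based on the description in the text and on the coefficients being reported on the natural-log scale: i indexes the API cluster, j the observation within that cluster, t is storage time in months, and F is an indicator equal to 1 for flight samples and 0 for controls.

```latex
\[
\mathrm{E}\!\left[\ln y_{ij}\right]
  = \beta_0 + \beta_1\, t_{ij} + \beta_2\, F_{ij} + \beta_3\,\left(t_{ij} \times F_{ij}\right)
\]
\[
\operatorname{Corr}\!\left(\ln y_{ij},\, \ln y_{ik}\right) = \rho \quad \text{for } j \neq k
\]
```

In this form, the terrestrial first-order rate is given by the storage coefficient alone, the spaceflight rate by the sum of the storage and interaction coefficients, and the treatment coefficient captures a time-independent offset between the groups.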
Table 1
GEE model regression coefficients1
Term | Coefficient | Standard Error | Wald Statistic | p-value |
Intercept | 4.61E+00 | 6.66E-03 | 481502.6 | < 2e-16 |
Storage | -3.17E-03 | 3.11E-04 | 103.9 | 2.00E-16 |
Treatment | -1.27E-02 | 4.98E-03 | 10.2 | 0.0014 |
Interaction | -1.61E-03 | 1.79E-04 | 80.8 | 2.00E-16 |
1. Model coefficients represent Ln(response)
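The coefficients in Table 1 can be turned back into rates, half-lives, and predicted content with a short calculation. The sketch below is an illustration only: it assumes the linear predictor has the form intercept + storage × months + treatment × flight + interaction × months × flight on the natural-log scale, and that the response is API content in percent. The function names are hypothetical.

```python
import math

# Coefficients as reported in Table 1 (natural-log scale).
INTERCEPT = 4.61
STORAGE = -3.17e-3      # per month, terrestrial baseline
TREATMENT = -1.27e-2    # flight vs. control offset
INTERACTION = -1.61e-3  # additional per-month slope for flight samples

def predicted_content(months, flight):
    """Population-level predicted API content (percent) on the arithmetic scale."""
    f = 1 if flight else 0
    log_y = INTERCEPT + STORAGE * months + TREATMENT * f + INTERACTION * months * f
    return math.exp(log_y)

def half_life_months(rate_per_month):
    """First-order half-life for a given rate constant (sign is ignored)."""
    return math.log(2) / abs(rate_per_month)

control_half_life = half_life_months(STORAGE)               # about 219 months
flight_half_life = half_life_months(STORAGE + INTERACTION)  # about 145 months
rate_ratio = (STORAGE + INTERACTION) / STORAGE              # about 1.51

for months in (0, 12, 24, 36):
    control = predicted_content(months, flight=False)
    flight = predicted_content(months, flight=True)
    print(f"{months:2d} months: control {control:.1f} %, flight {flight:.1f} %")
```

Under these assumptions, both predicted curves drop below 90% by 36 months, in line with the extrapolation described for a three-year mission.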
Drug "failure" occurs when the API content of a drug product does not meet the minimum percentage of labeled strength, which, in the United States, is established by USP drug specifications. The overall risk of drug failure is a concern of NASA for long-duration spaceflight, especially for deep space exploration missions where resupply may be difficult or impossible. USP specifications of drug API content are minimum API content thresholds that serve as binary pass/fail classifiers. USP limits are based primarily on reasonably achievable manufacturing quality and analytical performance; they are not quantitative metrics of pharmacodynamic potency, therapeutic efficacy, or toxicological risk. Therefore, USP quality standards should not be treated as surrogates for therapeutic efficacy. In this paper, the intersection of the lower 95% confidence interval of measured API potency with the lower limit of the USP quality range is used as the threshold to dichotomously classify each drug product as pass or fail.
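A simplified version of this classification rule can be sketched for a single time point. Everything specific here is an assumption for illustration: the lower bound uses a normal approximation rather than whatever interval the analysis actually used, the 90% lower limit is a hypothetical placeholder (USP limits vary by monograph), and the replicate values are invented.

```python
import math
from statistics import NormalDist, mean, stdev

def lower_confidence_bound(replicates, confidence=0.95):
    """Lower limit of a two-sided confidence interval for the mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean(replicates) - z * stdev(replicates) / math.sqrt(len(replicates))

def classify(replicates, usp_lower_limit=90.0):
    """Return 'pass' if the lower confidence bound meets the (hypothetical) lower limit."""
    return "pass" if lower_confidence_bound(replicates) >= usp_lower_limit else "fail"

# Invented replicate assays (percent of label strength), not study data.
clearly_above = [95.1, 94.3, 96.0, 95.4, 94.8, 95.6]
borderline = [90.5, 89.2, 91.0, 90.1, 89.7, 90.3]

result_above = classify(clearly_above)
result_borderline = classify(borderline)

print(result_above, result_borderline)
```

The borderline example has a mean slightly above the limit yet is classified as a failure because its lower confidence bound falls below it, which illustrates how a binary rule can label a product as failed on the basis of a small quantitative shortfall.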
Failure time analysis focuses on when a failure event occurs rather than if an event has occurred, as in survival analysis. The risk of drug failure is the probability that a drug will fail to meet USP quality standards at any point, and this probability increases over time. Figure 5 illustrates the cumulative posterior median failure distribution of terrestrial and spaceflight drug samples uncorrected for the mean difference between control and spaceflight API strength discussed earlier. Both spaceflight and terrestrial conditions exhibit a rapid increase in failure probability during the earliest months of storage in this experiment. The risk of failure with spaceflight storage is superimposed upon, but lower in absolute value than, the baseline risk of failure observed for the terrestrial controls.
Overall, the time to failure for drugs exposed to spaceflight, based on assayed API potency, is approximately half (0.55) that of a drug under terrestrial conditions (95% CI = [0.37, 0.82], p = 0.0038) if proportional hazards are assumed. From the Bayesian model, the median estimated time to failure was 28.6 months (95% CI, 21.0–40.9 months) for terrestrial storage and 15.3 months (95% CI, 11.2–20.1 months) for spaceflight. Based on the posterior survival distributions, probabilistic failure estimates for specific storage times are provided in Table 2. Whether the probability of failure for the baseline terrestrial samples is increased due to environmental exposure as a result of repackaging or inherent chemical instability cannot be determined directly from this study since matching controls in unopened manufacturer packaging were not tested. It is important to reiterate that drug failure is related to each tested medication's potency relative to USP specifications, which does not necessarily equate to altered therapeutic efficacy.
Table 2
Probabilities of drug failure through specific storage durations (median probability ± SD)
Storage condition | 1 month | 12 months | 24 months | 36 months |
Terrestrial storage | 0.007 ± 0.005 | 0.188 ± 0.049 | 0.426 ± 0.072 | 0.627 ± 0.085 |
Spaceflight storage | 0.0017 ± 0.001 | 0.388 ± 0.066 | 0.729 ± 0.065 | 0.898 ± 0.049 |
In addition to the study by Du et al., five smaller descriptive opportunistic studies of spaceflight drug stability have been performed (Table 3).12–15 Among these studies, none include initial baseline API measurements prior to long-term spaceflight exposure or terrestrial lot-matched controls. Of these studies, only the study by Wotring (2016) is published, whereas the other four studies are non-peer-reviewed NASA reports (extracted data are provided as described in the Data Availability section). The range of APIs tested among these five studies is much more limited than that of Du et al.; however, there is a much greater focus on characterizing impurities, albeit without lot-matched terrestrial controls. Three of these studies include matched manufacturer controls for each spaceflight-exposed medication; however, these controls are from different lots with different expiration dates,12–14 and one study includes both unmatched and some lot-matched controls.15 Across the six studies (inclusive of Du et al.), a total of nine medications (Table 3, bolded) intersect with the list of medications tested by Du et al. Of these, ibuprofen is the most commonly tested drug, having been evaluated in four out of six spaceflight studies. Two medications are shared across the five studies in Table 3 that are not included in the Du et al. study (‡ superscript), with the remaining drugs having been evaluated in only a single study. The study by Wotring and Khan (2014) is distinct from the other four studies listed in Table 3 in that the three medications tested are the identical medications originally tested by Du et al. several years earlier. In this respect, these results are independent measurements on the same Du et al. samples but following a considerably longer period of post-flight terrestrial storage. Figure 6 summarizes data from all studies as scatter plots of mean API levels (± SD) for the nine drug products (eight APIs).
A trend line incorporating all available data for each medication (blue line) is plotted to illustrate the overall pattern for loss of API content with slopes provided in Table S5. For reference, the trend lines for the matching control and spaceflight medications from Du et al. are also provided as described for Fig. 3. A key observation from these composite plots is the large variability in measured API content across studies.
Across all studies, five out of the nine intersecting drugs exhibit higher amounts of API in the follow-up studies than were reported by Du et al. at similar or earlier time points. The opportunistic studies of ibuprofen yield API percentages that bracket those reported by Du et al., with lower levels of API at all time points reported by Wu et al. (2016) and higher levels reported by both Wotring et al. (2015) and Cory et al. (2016). Both the oral and injectable dosage forms of promethazine were reported by Wu et al. (2016) to have lower amounts of API than were reported by Du et al. Among the nine composite models, spaceflight degradation rates are reduced in five models (i.e., the rate of degradation is slower) when all data are considered; only phenytoin exhibits an increase in estimated rate of degradation. Rate estimates for amoxicillin, ibuprofen and injectable promethazine (the latter being the only drug maintained in its original manufacturer packaging) are relatively unchanged despite large variations in the data.