As pointed out earlier, the US FDA generally accepts evidence from a well-defined and reliable PRO instrument in appropriately designed trials to support a claim in medical product labeling. The role of a PRO endpoint (i.e., whether primary, key secondary, or exploratory) should be clearly prespecified in the trial protocol, along with the statistical methods that will be used to analyze the data. Some of the characteristics of PRO instruments that are routinely reviewed by the US FDA include: the instrument's measurement properties; the concepts being measured; number of items, medical condition, and population for intended use; data collection method; respondent burden; recall period; and translation or cultural adaptation availability, among others (FDA 2009).
Definition of an appropriate PRO endpoint may involve a fixed time point or a suitable summary statistic across time points. The defined endpoint is expected to reflect the objective of the given analysis and will determine the type of statistical procedure to be used. For example, an ordinal or continuous PRO score at a fixed time point in an RCT may be analyzed using standard parametric (e.g., two-sample t-test or analysis of variance) or nonparametric (e.g., Wilcoxon rank-sum test or Kruskal-Wallis test) analyses. In situations where it is desired to adjust for potential imbalances in baseline scores, alternative approaches may be employed, including computation of the change from baseline or the percentage change from baseline for each patient, with subsequent comparison between arms based on an analysis of covariance (ANCOVA). Similarly, binary PRO scores at a fixed time point may be analyzed using chi-squared or similar tests, or a logistic regression incorporating relevant covariates.
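The baseline-adjusted comparison described above can be sketched with ordinary least squares. The following is a minimal illustration, not a prescribed analysis: the function name, the simulated data, and the assumed effect sizes are all hypothetical, and a real trial analysis would follow the prespecified statistical analysis plan.

```python
import numpy as np

def ancova_treatment_effect(baseline, post, treated):
    """Baseline-adjusted treatment effect on a continuous PRO score.

    Fits post = b0 + b1 * treated + b2 * baseline by ordinary least squares
    and returns (b1, standard error of b1). Assumes complete data.
    """
    baseline = np.asarray(baseline, dtype=float)
    post = np.asarray(post, dtype=float)
    treated = np.asarray(treated, dtype=float)

    # Design matrix: intercept, treatment indicator (0/1), baseline score.
    X = np.column_stack([np.ones_like(baseline), treated, baseline])
    beta, _, _, _ = np.linalg.lstsq(X, post, rcond=None)

    # Residual variance and the usual OLS covariance of the coefficients.
    resid = post - X @ beta
    dof = len(post) - X.shape[1]
    sigma2 = (resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(beta[1]), float(np.sqrt(cov[1, 1]))

# Hypothetical simulated trial: 100 patients per arm, assumed true effect of -3.
rng = np.random.default_rng(0)
n = 200
treated = np.repeat([0, 1], n // 2)
baseline = rng.normal(50, 10, n)
post = 5 + 0.8 * baseline - 3.0 * treated + rng.normal(0, 4, n)

effect, se = ancova_treatment_effect(baseline, post, treated)
print(f"adjusted treatment effect: {effect:.2f} (SE {se:.2f})")
```

Using the change from baseline as the outcome, with the baseline score kept as a covariate, gives the same estimated treatment effect; only the coefficient on the baseline score changes.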
Suitably defined summary measures can serve several purposes, including facilitating interpretation, selecting analytical approaches, and reducing dimensions by combining data across scales and/or time points into a single score. However, the choice of summary measures should be made judiciously, taking into account the impact of any missing values and the potential loss of information in the process of constructing the measures. Commonly used examples include the average, maximum, minimum, or last observed postbaseline score; slope across postbaseline scores; within-subject area under the curve (AUC); and within-subject time to reach a prespecified value.
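As a rough illustration of the summary measures listed above, the sketch below computes them for a single patient. The function name and the example values are hypothetical, the data are assumed complete (no missing visits), and "reaching" the prespecified value is assumed to mean falling to or below it, i.e., lower scores are taken to be better.

```python
import numpy as np

def summarize_scores(times, scores, threshold):
    """Within-subject summaries of one patient's postbaseline PRO scores.

    times must be strictly increasing; scores must have the same length
    and contain no missing values.
    """
    t = np.asarray(times, dtype=float)
    y = np.asarray(scores, dtype=float)

    # Least-squares slope of score on time.
    slope = np.polyfit(t, y, 1)[0]
    # Area under the curve by the trapezoidal rule.
    auc = np.sum(np.diff(t) * (y[:-1] + y[1:]) / 2.0)
    # First visit at which the score is at or below the threshold, if any.
    reached = np.flatnonzero(y <= threshold)
    time_to_threshold = float(t[reached[0]]) if reached.size else None

    return {
        "mean": float(y.mean()),
        "max": float(y.max()),
        "min": float(y.min()),
        "last": float(y[-1]),
        "slope": float(slope),
        "auc": float(auc),
        "time_to_threshold": time_to_threshold,
    }

# Hypothetical patient with four postbaseline visits.
summary = summarize_scores([1, 2, 3, 4], [8, 6, 5, 3], threshold=5)
print(summary)
```

Each summary discards some information; for instance, two patients with the same AUC may have very different trajectories, which is one reason the choice of measure should match the objective of the analysis.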
In the interpretation of results on PRO endpoints, statistical significance alone may not be meaningful. Therefore, claims about treatment benefits should be accompanied by a well-justified responder definition and other data-presentation tools. To facilitate the interpretation of results from the analysis of PRO data, alternative approaches have been proposed, including the anchor-based and distribution-based approaches (see, e.g., Cappelleri et al. 2013; Marquis et al. 2004; McLeod et al. 2011). An anchor-based approach attempts to link the targeted concept that the PRO is intended to measure to an anchor measure or indicator that is interpretable itself or lends itself to interpretation. Thus, while the anchor may or may not be another PRO measure, it is required to meet at least two criteria: it must be correlated with the targeted PRO, and it must be easier to interpret than the PRO of interest. Anchor-based methods include percentages based on thresholds. For example, when using incontinence diaries that also collect the number of incontinence episodes, the mean change in PRO scores corresponding to a 50% reduction in episodes may be used to define a responder. Similarly, when patients are blinded to treatment assignment, their assessment of change recorded at different times may be used to define a responder. Specifically, the difference in PRO scores corresponding to the change in ratings (better/same vs. worse) can serve to define a responder (FDA 2009).
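The incontinence-diary example above can be sketched as follows. This is one possible reading of the approach, not the method of any cited source: the threshold is taken to be the mean PRO change among patients whose episode count fell by at least 50%, and a responder is anyone whose PRO change is at least that favorable. The function name and data are hypothetical, and negative changes are assumed to indicate improvement.

```python
import numpy as np

def anchor_based_threshold(pro_change, episodes_baseline, episodes_post,
                           reduction=0.5):
    """Mean PRO change among patients meeting the anchor criterion.

    The anchor criterion is a relative reduction in episode count of at
    least `reduction`. Baseline episode counts must be positive.
    """
    pro_change = np.asarray(pro_change, dtype=float)
    base = np.asarray(episodes_baseline, dtype=float)
    post = np.asarray(episodes_post, dtype=float)

    pct_reduction = (base - post) / base
    anchored = pct_reduction >= reduction
    if not anchored.any():
        raise ValueError("no patient meets the anchor criterion")
    return float(pro_change[anchored].mean())

# Hypothetical data for five patients.
pro_change = np.array([-6.0, -1.0, -4.0, 0.0, -5.0])
episodes_baseline = [10, 8, 12, 6, 10]
episodes_post = [4, 7, 5, 6, 5]

threshold = anchor_based_threshold(pro_change, episodes_baseline, episodes_post)
# Negative change is assumed to mean improvement, so responders are those
# whose change is at or below the threshold.
responders = pro_change <= threshold
print(threshold, responders)
```

In practice the anchor's correlation with the PRO would be checked first, since a weakly correlated anchor gives a threshold that is hard to defend.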
Distribution-based approaches, often used as supportive tools, typically relate to the magnitude of a treatment effect, both at the individual and group levels (Alemayehu and Cappelleri 2012). Examples of distribution-based approaches for a group of patients include the standard error of measurement (SEM) and cumulative distribution of response curves (FDA 2009). The US FDA encourages the use of the cumulative distribution function (CDF) of responses between treatment groups, including an application of the responder definition along the CDF curve at each level of response (FDA 2009).
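For orientation, the SEM mentioned above is commonly computed in classical test theory as the score standard deviation multiplied by the square root of one minus the reliability coefficient. That formula is not stated in the text above, and the function name and input values below are purely illustrative.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: sd * sqrt(1 - reliability).

    sd is the standard deviation of observed scores; reliability is a
    coefficient between 0 and 1 (e.g., test-retest or internal consistency).
    """
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical instrument: SD of 10 points, reliability of 0.84,
# which gives an SEM of about 4 points.
sem = standard_error_of_measurement(10.0, 0.84)
print(round(sem, 2))
```

A more reliable instrument yields a smaller SEM, so the same observed change is less likely to reflect measurement noise alone.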
Figure 4.2 illustrates a CDF plot in which the solid and dashed curves denote the distributions for the two treatment groups. For example, assuming that higher scores represent a worse condition, so that negative change scores indicate improvement, at a change score of -2 (i.e., a 2-point improvement) the difference between the two groups in the corresponding percentage of subjects is A = 25%.
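The quantity read off a CDF plot of this kind can be computed directly from the change scores. The sketch below uses made-up data, not the data behind Figure 4.2, and the function names are hypothetical; negative change scores are assumed to indicate improvement.

```python
import numpy as np

def cumulative_proportion(changes, cutoff):
    """Proportion of subjects whose change score is at or below the cutoff."""
    changes = np.asarray(changes, dtype=float)
    return float(np.mean(changes <= cutoff))

def cdf_difference(changes_a, changes_b, cutoff):
    """Difference between two groups' empirical CDFs at a given cutoff."""
    return (cumulative_proportion(changes_a, cutoff)
            - cumulative_proportion(changes_b, cutoff))

# Hypothetical change-from-baseline scores for eight subjects per arm.
active = [-5, -4, -3, -2, -2, -1, 0, 1]
control = [-3, -2, -1, -1, 0, 0, 1, 2]

# Proportion of subjects improving by at least 2 points in each arm,
# and the vertical gap between the two CDF curves at that point.
p_active = cumulative_proportion(active, -2)    # 5 of 8 subjects
p_control = cumulative_proportion(control, -2)  # 2 of 8 subjects
gap = cdf_difference(active, control, -2)
print(p_active, p_control, gap)
```

Evaluating `cdf_difference` over a grid of cutoffs shows whether the separation between arms holds across responder definitions or only at one chosen threshold.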
When composite endpoints are constructed by combining the scores from multiple items or domains, there should be clarity about the interpretation of the associated results, since the results may depend on the relative importance of the components and the corresponding effect sizes. When a composite endpoint shows favorable results, the component-wise results should be presented to indicate the relative contributions to the favorable result.
In certain situations, PRO instruments can be used to capture safety data, especially when it is deemed important to elicit the information from the patient perspective. In cancer trials, there have been ongoing efforts to streamline and harmonize the collection of safety data as a central PRO concept. These include the concept proposed by the FDA Office of Hematology and Oncology Products (OHOP) (Kluetz et al. 2016), and the National Cancer Institute's Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE). The latter in particular is considered a useful tool to standardize the assessment of symptomatic AEs from the patient perspective (NIH NCI 2019).
PROs have attracted considerable attention from regulatory agencies, payers, and pharmaceutical companies. Traditional clinical trials, which rely upon observer-reported outcomes, often fail to take into account the patients’ perspective and experience. As patients become more involved in clinical trials and in their own healthcare, they will seek a greater voice and greater access to data from other patients on trials in order to make informed decisions about their treatment. Thus, collecting reliable data that reflect the patients’ perspectives is a critical component of drug development.
From a regulatory perspective, evidence gathered using a well-defined and reliable PRO instrument in appropriately designed trials can be used to support a labeling claim. However, the development and validation of a PRO instrument must satisfy strict regulatory and psychometric requirements, which involve demonstration of the instrument’s ability to reliably measure the claimed concept in the patient population enrolled in the clinical trial. Another issue of concern, especially in oncology, is the reliability of PRO data from open-label studies. As a consequence, despite the growing focus on the importance of PRO data, regulatory agencies vary in the degree to which they accept such evidence for label claims. For example, according to a recent study, the EMA tends to be more likely than the US FDA to accept data from open-label studies and broad concepts such as health-related quality of life (Gnanasakthy et al. 2019).
One major barrier that limits the wider use of PROs by healthcare systems in general is the scarcity of best practices in the use of validated instruments, especially when the comparability of data collected from disparate sources is desired. One framework, mentioned earlier, is the Patient-Reported Outcomes Measurement Information System (PROMIS), which aims to enhance and standardize measurement of several selected PROs. While the PROMIS network is growing, and is actively developing and validating PROs in several new domains, it is still far from gaining acceptance by regulatory agencies.