. . . and when is enough proof enough?
In clinical and community practices, there is an increased emphasis on evidence-based treatments for physical, emotional, and social issues. Generally, the term “evidence” refers to data or information, relevant to a question or issue, obtained from experience or from observational or experimental trials (Jenicek, 2010). Evidence is not necessarily correct, complete, satisfactory, or useful. The strength or utility of evidence depends upon the process used to gather the data, the sample and outcome measures on which the data are based, and the context in which the data were collected. In behavioral intervention research, the outcomes of a study or evidence for a treatment can be evaluated according to different criteria: effectiveness with respect to important outcomes, relevance, feasibility, cost versus benefit, equivalence to usual care, and sustainability. Many interventions have empirical evidence from a randomized clinical trial to indicate whether they are efficacious with respect to specified outcomes. However, a question that often arises with respect to the translation and then widespread implementation of a treatment is: What type and level of evidence is sufficient, adequate, and generalizable to community settings and clinical practices?
For example, consider a skills-training intervention for spousal caregivers of persons with dementia that is compared to an educational/information provision control group. Assume that 120 caregivers were randomly assigned to either the skills-training group (n = 60) or the education/information group (n = 60). After the 6-month intervention period, caregivers in the skills-training group experienced, on average, a 2-point drop in depression, as measured by the Center for Epidemiologic Studies Depression scale (CES-D), relative to those who received education/information, a difference that was statistically significant (p < .05). The data provide reliable evidence that the groups differed in their change in depression and suggest that providing caregivers with skills training is efficacious in reducing depressive symptoms. Statistical significance, however, does not necessarily imply that the caregivers experienced improvements in mood that are meaningful to their everyday lives. Similar comments can be made for the effect size statistic, which is a measure of the strength of the relationship between a treatment and an outcome. Even large effect sizes do not necessarily mean that the results are clinically meaningful or of practical importance with respect to everyday living. Thus, overall, there are shortcomings associated with implementing treatments on the basis of statistical differences between treatment groups alone.
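The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not report the variability of the CES-D change scores, so the standard deviation used here (`ASSUMED_SD = 5.0`) is hypothetical, the helper name `two_group_summary` is invented for this sketch, and the p value uses a normal approximation rather than an exact t distribution. The sketch shows that the same 2-point difference yields a smaller p value as the sample grows, while the effect size and the size of the change itself stay the same.

```python
import math
from statistics import NormalDist


def two_group_summary(mean_diff, sd, n_per_group):
    """Return (t statistic, approximate two-sided p, Cohen's d)
    for two equal-sized groups that share a common standard deviation."""
    # Standard error of a difference between two independent group means.
    se = sd * math.sqrt(2.0 / n_per_group)
    t = mean_diff / se
    # Normal approximation to the t distribution (reasonable for large samples).
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    # Cohen's d: the mean difference expressed in standard deviation units.
    d = mean_diff / sd
    return t, p, d


# Hypothetical value: the standard deviation is NOT given in the text.
ASSUMED_SD = 5.0

# 60 per group matches the example; 240 per group is an invented comparison.
for n in (60, 240):
    t, p, d = two_group_summary(2.0, ASSUMED_SD, n)
    print(f"n per group = {n}: t = {t:.2f}, p ~ {p:.4f}, d = {d:.2f}")
```

Under these assumptions, neither the p value nor the effect size says whether a 2-point change on the CES-D matters in a caregiver's daily life; that judgment requires a separate criterion of clinical significance.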
In response to the limitations of statistical tests, researchers have been focusing on developing methods for identifying practically or clinically meaningful outcomes (Kazdin, 2008). To date, much of the discussion of clinical significance has been within the realm of psychotherapy or clinical medicine. For example, Schulz and colleagues (2002) conducted a review of intervention studies aimed at improving the lives of caregivers of persons with dementia. They found that, although many studies reported small-to-moderate statistically significant effects on a broad range of caregiver outcomes, only a small proportion of studies reported clinically significant outcomes. The authors concluded that the assessment of clinical significance, in addition to statistical significance, is needed in the domain of caregiver intervention research.
In this regard, the topic of clinical significance is receiving more attention in the broader intervention literature, given the increased emphasis on evidence-based treatments and the higher bar that is being established for evidence: achieving outcomes that are not only statistically significant but also meaningful and practically relevant. As discussed by Kazdin (2008), apart from statistical issues, other concerns related to the choice of outcome measures include the extent to which the measures are sensitive to change and capture functioning in everyday life. We also discuss this issue in Chapter 15 as it relates to the objective measurement of cognition and daily function. The essential question is: To what extent are changes in standardized measures of cognition related to changes in an individual’s ability to perform everyday activities such as managing medications and finances?
There is increasing recognition of the importance of evaluating the clinical meaning of statistically significant changes brought about by an intervention. It is no longer good enough, so to speak, to find statistically significant group differences and attribute them to a treatment. Thus, the objectives of this chapter are to (a) define the construct of clinical significance, (b) review the currently available methods for measuring clinical significance, and (c) discuss strategies for maximizing clinical significance. Our overall goal is to extend our understanding of methods for evaluating the effectiveness of interventions in order to advance the quality of intervention research; enhance the likelihood that treatments are implemented in community and clinical settings; and, perhaps most importantly, ensure that intervention programs are improving the lives of individuals, families, and communities in meaningful ways.