Volume 86, Issue 12, Supplement, Pages 8-15, December 2005
Another Look at Observational Studies in Rehabilitation Research: Going Beyond the Holy Grail of the Randomized Controlled Trial
Abstract
Horn SD, DeJong G, Ryser DK, Veazie PJ, Teraoka J. Another look at observational studies in rehabilitation research: going beyond the holy grail of the randomized controlled trial.
This commentary compares randomized controlled trials (RCTs) and clinical practice improvement (CPI) approaches to study design, evaluates their relative advantages and disadvantages, and discusses their implications for rehabilitation research and evidence-based practice. Many argue that observational cohort studies are not sufficient as scientific evidence for practice change. We challenge this assertion by introducing the concept of a CPI study: a comprehensive observational paradigm structured to decrease biases generally associated with observational research. One strength of CPI studies is their attention to defining and characterizing the “black box” of clinical practice. CPI studies require demanding data collection, but by using bivariate and multivariate associations among patient characteristics, process steps, and outcomes, they can uncover best practices more quickly while achieving many of the presumed advantages of RCTs.
Key Words: Cerebrovascular accident, Clinical practice variations, Rehabilitation, Treatment outcomes
A RECURRING CRITICISM in medical rehabilitation is the lack of adequate high-level research evidence with which to establish evidence-based practice. This criticism is not unique to rehabilitation and is echoed throughout the health care system. Tunis et al write,
The current clinical research enterprise in the U.S. is not consistently producing an adequate supply of information to meet the needs of clinical and health policy decision makers … [due to] a systematic problem in the production of clinical research… . A consistent finding of [systematic literature] reviews is that the quality of evidence available to answer the critical questions identified by experts is suboptimal… . These gaps in evidence undermine efforts to improve the scientific basis of health care decisions… . [such that] clinical practice guidelines may not be able to develop clear, specific recommendations. [Typical] observational and other non-experimental methods may not provide sufficiently robust information regarding the comparative effectiveness of alternative clinical interventions, primarily because of their high susceptibility to selection bias and confounding variables.1(p1625-6)
Tunis et al call for new research methods to fill these gaps. Berguer2 discusses problems with the evidence in evidence-based medicine (EBM). The main tools of EBM are randomized trials and meta-analysis, but Berguer believes that these methods are unlikely to lead to the discovery of new and best treatments for specific types of patients. “[Rigorous] observational and inductive clinical intelligence should be stimulated and published because a therapy needs to be invented before it is proven effective. Biomathematicians need to improve nonrandomized methodology as they did for randomized studies.”2(p265) To paraphrase Berguer, randomized controlled trials (RCTs) are important to the confirmation of new and/or current interventions and practices, not to the discovery of more effective and efficient interventions and practices.
There are additional calls for new approaches to EBM and performance in quality and costs of the health care system. Porter and Teisberg write, “The U.S. health care system has registered unsatisfactory performance in both costs and quality over many years.”3(p64) They observe that medical services are restricted or rationed, many patients receive poor care, and high rates of preventable medical errors persist. There are wide and inexplicable differences in costs and quality among providers and across geographic areas. In well-functioning competitive markets, Porter and Teisberg argue, such outcomes would be inconceivable; in health care these results are intolerable. Competition in health care is operating at the wrong level: payers, health plans, providers, physicians, and others in the system wrangle over the wrong things. “System participants divide value instead of increasing it.”3(p65) This form of zero-sum competition must be replaced by competition at the level of preventing, diagnosing, and treating individual conditions and diseases and determining the best treatments for specific types of patients. Encouraging competition at the level of treatments for specific diseases or co-occurring conditions and types of patients will speed the development of the right kind of information and improve value (quality of health outcomes per dollar expended). Value should be measured and improved at the disease and treatment level.3
Tunis et al call for the next phase in the evolution of clinical trials, namely pragmatic or practical clinical trials (PCTs), for which the hypothesis and study design are developed specifically to answer the questions faced by decision makers in practice or as payers. “Characteristic features of PCTs are (1) select clinically relevant alternative interventions to compare, (2) include a diverse population of study participants, (3) recruit participants from heterogeneous practice settings, and (4) collect data on a broad range of health outcomes.”1(p1624) “PCTs address practical questions about the risks, benefits, and costs of an intervention as they would occur in routine clinical practice”1(p1626) and address questions such as the following: Does the treatment work in the real world of everyday practice? For whom does the intervention work best? The PCT approach contrasts with explanatory clinical trials or efficacy studies (RCTs), which are concerned with questions such as the following: Does the investigational treatment cause an effect? How and why does the intervention work? Explanatory trials are designed to maximize the chance that some effect of a new or existing treatment will be revealed by the study. They are a form of confirmatory analysis where relations have been vetted already in previous research.
This commentary presents the clinical practice improvement (CPI) research method as a variant of the PCT called for by Tunis et al. As a clinical research method, CPI embraces all 4 elements of PCTs outlined above and, thus, is one way in which the PCT concept can be operationalized effectively.4 The purpose of this commentary is to juxtapose RCT and CPI research methods by evaluating their relative strengths and weaknesses. We argue that PCT methods such as CPI can liberate us from the straitjacket that has constrained rehabilitation’s ability to discover and establish standards for best practice.
RCTs: features and challenges
The intellectual origins of RCTs come from agriculture. In agricultural hothouses, the environment can be reasonably controlled and various interventions tested. The RCT represents a research paradigm that had its origins in a simpler time when we did not have powerful multivariate statistical tools, and even when we had them, we lacked the computational power that readily accessible computer-based statistical packages have brought us over the last 30 years. As a research model, the RCT allowed one to make relatively simple computations using fairly small sample sizes; it was well suited to the computational constraints of an earlier era. RCTs do not harness the full power of multivariate statistics, in which many variables can be considered simultaneously and covariates can be identified and neutralized to evaluate intervention effects.
A hallmark of an RCT is the random assignment of study participants into a treatment arm and a control arm to neutralize participant differences that might otherwise affect the outcome. By neutralizing participant differences through randomization, RCTs help isolate the effect of the treatment under review. Nonrandomized comparison groups present the risk that some nontreatment effects remain unaccounted for and thus compromise one’s ability to have full confidence that the outcomes are truly a consequence of the treatment or intervention under study.
When designed and conducted properly, RCTs are considered the criterion standard for establishing causality in scientific research. Clinical and health services research communities have come to accept hierarchies of evidence where RCTs are considered the highest level of evidence and anything less than RCT-level evidence is considered somewhat suspect. Using RCTs in rehabilitation presents several major challenges that are not easily overcome. We mention a few of them here and later discuss how a CPI approach is not bound by many of the same constraints.
Standardization and Artificiality
RCTs require that one use standardized treatment protocols and that one hold all other variables constant to isolate the effects of the intervention and to reduce noise in the data. One result is that the intervention setting can become artificial and may not reflect what would otherwise transpire under less-controlled circumstances in a real-world clinical environment. Standardized treatment protocols require extensive quality control to decrease error rates about the treatment. Treatment purity is difficult to maintain over time, across centers, and across clinicians; if compromised, an intention-to-treat analysis—which keeps everyone in the study and in their assigned groups even if the treatment protocol or control is not followed as prescribed—may be the best remaining analysis option. Unfortunately, intention-to-treat analyses no longer reflect efficacy.
Selection Criteria, Patient Recruitment, and Generalizability
Selection criteria for participation in a study are often quite restrictive to reduce variation stemming from differences among study participants. Restrictive selection criteria limit the generalizability of a study’s findings (external validity) to the types of people represented in the study. The study’s findings may not apply to the types of people excluded from the study. For example, many studies exclude people with comorbidities, although significant comorbidities are common in many rehabilitation populations and may affect or alter outcomes. Clinicians may be prone to dismiss RCT findings, because they deem their patients to be quite different from those seen in a clinical trial. Restrictive selection criteria can also result in studies with very small numbers, drawn from a much larger pool of otherwise eligible participants. Typically, only a small percentage of patients—usually 10% to 15%—are eligible for a trial. Enormous resources must then be expended to recruit large pools of potential participants to locate people who meet the selection criteria and thus achieve the sample sizes needed to power the analyses.
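The recruitment burden implied by these eligibility rates can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name `patients_to_screen` and the target of 200 enrolled participants are assumptions made for the example, not figures from any study. It also ignores refusals and dropouts, which would raise the numbers further.

```python
def patients_to_screen(target_enrolled: int, eligible_percent: int) -> int:
    """Smallest number of patients to screen so that the expected number
    of eligible patients reaches target_enrolled.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    if not 0 < eligible_percent <= 100:
        raise ValueError("eligible_percent must be in (0, 100]")
    return -((-target_enrolled * 100) // eligible_percent)


# Hypothetical target of 200 participants at the 10% and 15% eligibility rates.
for pct in (10, 15):
    print(f"{pct}% eligible: screen at least {patients_to_screen(200, pct)} patients")
```

Under these assumptions, each enrolled participant requires screening roughly 7 to 10 candidates before any consent or retention losses are counted.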
Blinding
RCTs assume some degree of blinding. Ideally, all 3 actors—the study participant, the clinician, and the researcher or observer—are unaware as to whether the participant is in the treatment or control arm. Double blinding means that 2 of the 3 are blinded—both the participant and 1 of the other 2 actors—lest their knowledge about participant assignment affect their level of effort, their outlook, and the participant’s willingness to continue with the nontreatment arm of the study. Rehabilitation interventions, including sham interventions, are not easily disguised and, in many cases, are impossible to disguise.
RCTs present other challenges, including ethical challenges to randomization and lengthy planning and approval processes that can sabotage even the best-designed studies. Most formidable is cost: an RCT can be very expensive, even prohibitively expensive, because it may require an elaborate protocol to screen patients, coordinate care, and collect data. For example, the Medical Outcomes Study and the Health Insurance Experiment conducted by Rand in the 1980s cost sponsor organizations more than $35 million and over $60 million in 2005 dollars. Other large RCTs of practice effects cost about the same.
All of this leaves rehabilitation in a real bind. On one hand, rehabilitation practice needs the validation that sound scientific evidence can provide. On the other hand, its highly customized multifactorial approach does not lend itself well to RCTs, which require a more limited set of interventions and selection criteria that can make participant recruitment difficult and expensive and make study findings less generalizable. We quickly could exhaust a good portion of the world’s entire biomedical research budget in a given year to study all the rehabilitation interventions and combinations of interventions used around the world. Over the years, many variants of the RCT have evolved to address 1 or more of the challenges identified but cannot overcome limitations that are inherent in an RCT.
Observational data and causal inferences
It is generally accepted that a decision is unwarranted if the supporting evidence is based on accidental associations. Confidence in an action depends on confidence that the supporting evidence implies a causal connection. As mentioned above, randomization underlying the RCT provides a relatively high degree of confidence in this regard, but the resulting evidence can be costly to obtain, not germane to the relevant clinical context, and easily compromised by small deviations from the design. An alternative is to use data with a naturalistic genesis representing the population and circumstances of interest. In such data, however, subjects are not randomized into the various treatment groups; consequently, analyses often cannot discern whether differences in outcomes are due to different treatments or to other differences between subject groups.
This problem has generated considerable effort to create methods that identify treatment effects using observational data. Since the last quarter of the twentieth century, a large literature has developed on causal inferences and observational data.5, 6, 7, 8, 9, 10, 11 Methods have been created that allow for unbiased estimates of treatment effects by controlling for unmeasured confounders.8, 11 Unfortunately, these methods cannot identify all treatment effects of interest and are often sensitive to assumptions that are not testable. Also, they require considerable knowledge in statistics to understand and adjust sufficiently for nuisances, making them less useful to researchers and less understandable to decision-makers who do not have the requisite statistical background.
Alternatively, methods that sidestep the issue of unobserved confounding have been developed as well. Specifically, the method of instrumental variables allows for estimation of treatment effects in the presence of otherwise unobserved confounding.12 However, the treatment effect is instrument-specific and may not be of interest. In addition, it can be difficult to identify and measure the required variables, and—similar to the preceding methods—the necessary assumptions are not testable. As another alternative, the observed data can be analyzed as if there are no unmeasured confounders and then subjected to a sensitivity analysis of potential confounding.7 To be useful, however, this approach requires assumptions regarding the unknown confounding, and little is gained if results are determined to be sensitive to assumptions.
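The logic of the instrumental variable estimator can be sketched with a small simulation. Everything here is hypothetical: the data-generating model, the variable names, and the effect sizes are assumptions chosen for illustration, and the instrument is valid only because the simulation constructs it that way (it influences the outcome solely through the treatment). With real data that condition cannot be tested, which is the limitation noted above.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def simulate_iv(n=20000, true_effect=1.0, seed=0):
    """Return (naive, iv) estimates of the treatment effect.

    naive: least-squares slope of outcome on treatment, ignoring confounding.
    iv:    ratio estimator cov(z, y) / cov(z, t) using the instrument z.
    """
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)            # instrument: shifts treatment only
        ui = rng.gauss(0, 1)              # unmeasured confounder
        ti = zi + ui + rng.gauss(0, 1)    # treatment intensity
        yi = true_effect * ti + ui + rng.gauss(0, 1)  # outcome
        z.append(zi)
        t.append(ti)
        y.append(yi)
    naive = cov(t, y) / cov(t, t)
    iv = cov(z, y) / cov(z, t)
    return naive, iv


naive, iv = simulate_iv()
print(f"true effect 1.0 | naive estimate {naive:.2f} | IV estimate {iv:.2f}")
```

In this constructed setting the naive slope overstates the effect because the confounder raises both treatment and outcome, whereas the instrument-based ratio is expected to land near the true value, at the price of a wider sampling error.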
With enough data, if all factors influencing the distribution of both the interventions of interest and the outcomes of interest are measured and controlled for in analysis, then treatment effects can be identified from observational data without the need for sophisticated statistical models and untestable assumptions. Unfortunately, when confounding factors are not controlled statistically, the treatment effect may not be distinguishable from spurious correlations. It has been shown that under some circumstances controlling for only a subset of confounders can generate greater bias than controlling for none.13, 14, 15 Heckman and Navarro-Lozano13 provide a formal development of the point. Intuition suggests that if a set of factors have counterbalancing correlations with the outcomes and treatments (ie, some positive and some negative), then controlling for a select few can throw off the balance and generate greater bias.
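The counterbalancing intuition can be illustrated with a simulation. This is a minimal sketch under assumed conditions, not a reproduction of the cited analyses: two hypothetical confounders, `u1` and `u2`, both raise the chance of treatment but push the outcome in opposite directions with equal strength, so their biases cancel when neither is controlled.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys on xs (model includes an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def residuals(xs, ys):
    """Residuals of ys after a least-squares fit on xs (with intercept)."""
    b = slope(xs, ys)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]


def simulate_partial_control(n=20000, true_effect=1.0, seed=0):
    """Return treatment-effect estimates with no, partial, and full control."""
    rng = random.Random(seed)
    u1 = [rng.gauss(0, 1) for _ in range(n)]
    u2 = [rng.gauss(0, 1) for _ in range(n)]
    t = [a + b + rng.gauss(0, 1) for a, b in zip(u1, u2)]
    # u1 raises the outcome, u2 lowers it by the same amount.
    y = [true_effect * ti + a - b + rng.gauss(0, 1)
         for ti, a, b in zip(t, u1, u2)]

    unadjusted = slope(t, y)

    # Control for u1 only: partial u1 out of treatment and outcome.
    t_r1, y_r1 = residuals(u1, t), residuals(u1, y)
    adjusted_u1 = slope(t_r1, y_r1)

    # Control for both: also partial out the part of u2 not explained by u1.
    u2_r1 = residuals(u1, u2)
    adjusted_both = slope(residuals(u2_r1, t_r1), residuals(u2_r1, y_r1))
    return unadjusted, adjusted_u1, adjusted_both


none, partial, full = simulate_partial_control()
print(f"true 1.0 | no control {none:.2f} | u1 only {partial:.2f} | both {full:.2f}")
```

In this setup the unadjusted and fully adjusted estimates are both expected to sit near the true effect, while controlling for `u1` alone removes one side of the balance and pulls the estimate toward 0.5.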
Because in real-world settings it is not likely that all confounders can be identified and measured, a researcher is faced with 3 options: (1) pursue a costly RCT that may not address the clinical context of interest, (2) embark on statistically sophisticated methods that trade one set of untestable assumptions (ie, the identification of all confounders) for another set of untestable assumptions (the necessary distributional or correlation assumptions underlying selection and instrumental variable models), or (3) report an analysis that does not account for confounding, mention the deficit as a limitation, and let the user beware.
However, if the goal is to produce useful information and reduce uncertainty for decision-makers, the situation may not be so constrained. We suggest a paradigm shift toward the pragmatic. As stated at the beginning of this section, it is generally accepted that the decision to pursue a course of action is unwarranted if based on evidence of an accidental association; consequently, structuring research to minimize the potential for accidental association will improve its usefulness.
Rather than focus on meeting conditions for statistically unbiased causal effect estimates, we propose designing observational studies that focus on minimizing the plausibility of alternative explanations while estimating the complex associations between treatments and outcomes within a specific context of care. The identified associations are not equated with causal parameters but nonetheless inform such judgments to the extent that the design minimizes alternative explanations. This is a process-oriented approach: the goal is to structure the design carefully to capture the salient information bearing on the research question. The proposed design trades uncertainty regarding generalizability in the case of the RCT, or uncertainty in necessary assumptions underlying the statistical methods mentioned above, for uncertainty regarding the potential for alternative explanations while explicitly minimizing the plausibility of such explanations. Also, the proposed CPI method is available for use by most researchers with access to the standard computational power of today’s personal computers and a knowledge of basic multiple regression techniques.
CPI: features and challenges
CPI harnesses the complexity presented by patient and treatment differences, offering a naturalistic view of treatment by examining what actually happens in the care process.4 It does not alter the treatment regimen to evaluate the efficacy of a particular intervention as one does in an RCT. The CPI approach offers the advantage of large numbers of patients—numbers that often cannot be attained in an RCT constrained by stringent selection criteria.
CPI is an observational study design whose measurement encompasses a comprehensive view of the care management process: (1) key patient characteristics, (2) all treatment and care processes, and (3) outcomes. All 3 classes of data are considered simultaneously (fig 1). This comprehensive measurement framework provides a basis for meaningful analyses of significant associations between process and outcome, controlling for patient differences.
CPI designs include detailed measures of patient factors (physiologic severity of illness and psychosocial abnormalities presented at each visit or each admission), care process factors (eg, medications, treatments, interventions), and outcome factors. A CPI study presents the resulting associations to clinicians, so they can evaluate objectively the effects of the treatments they give to similar patients. Without all 3 types of data (eg, if one has only process and outcome data, but not detailed patient data), clinicians cannot tell if the outcomes achieved are due to the process steps or to differences in patients’ illness severity levels.
Patient Factors
Patient factors are the key characteristics of the study population: demographic characteristics, specific indications for treatment, severity of illness, initial functional status, psychosocial factors, and others. A CPI design addresses a central feature in RCT design—namely, the need for randomization to neutralize the effect of patient differences. Randomization is used when patient differences cannot be taken into account adequately. On the other hand, CPI studies incorporate detailed information about patients and their needs and account for these differences through statistical analyses to control for patient differences. Detailed patient profile data include condition-specific physiologic data, such as those contained in the Comprehensive Severity Index (CSI).4, 16, 17, 18, 19, 20, 21 The CSI is described in detail in the article22 outlining the study’s methods and is a unique severity-of-illness measure used in CPI studies.
Care Process Factors
A process of care is a sequence of linked, usually sequential, steps designed to cause a set of desired outcomes to occur. The goal is to find a measurable factor that describes each major process step. Examples include which drugs are dispensed, what dose is used, and what rehabilitation therapies are performed and for how long. A data collection instrument records the process steps in detail, including timing and dates. Thus, CPI studies require that clinicians and researchers characterize fully and accurately the actual interventions used. The level of detail about processes and interventions contained in CPI studies is unique.
Outcome Factors
Processes of care are designed to achieve specific outcomes. Among the outcomes commonly assessed are condition-specific complications, condition-specific long-term medical outcomes (based on clinician assessment or patient self-report), patient functional status, patient participation in society, patient satisfaction, and cost. Outcome factors may be thought of as analogs to the assessment endpoints in an RCT.
To capture all of these factors, CPI studies entail the creation of a large study database that includes all the patient, process, and outcome variables of interest. Multivariate statistical methods are then used to compare alternative treatments while controlling for other variables that may be driving observed differences between treatments and outcomes. These statistical methods allow the researcher to examine relations far more complex than those using only 1 explanatory or treatment variable at a time. The coefficients of the significant independent variables in regression equations identify key process steps that, when controlling for patient factors, are associated with better outcomes.
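A toy example shows why the patient-factor terms matter in such a regression. The data below are entirely hypothetical and noise-free: length of stay (LOS) is generated as 5 + 2 × severity − 1.5 × treated, and the treated patients are deliberately the sicker ones. The helper names `solve` and `ols` are illustrative and do not refer to any particular statistical package.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, ys):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical patients: (severity score, treated = 1 / not treated = 0).
patients = [(3, 1), (4, 1), (5, 1), (4, 1), (1, 0), (2, 0), (1, 0), (2, 0)]
los = [5 + 2 * s - 1.5 * t for s, t in patients]  # assumed outcome model

treated = [d for (s, t), d in zip(patients, los) if t == 1]
untreated = [d for (s, t), d in zip(patients, los) if t == 0]
naive_diff = sum(treated) / len(treated) - sum(untreated) / len(untreated)

# Columns of the design matrix: intercept, severity, treatment indicator.
design = [[1.0, s, t] for s, t in patients]
intercept, b_severity, b_treated = ols(design, los)

print(f"unadjusted difference in mean LOS: {naive_diff:+.1f} days")
print(f"treatment coefficient, controlling for severity: {b_treated:+.1f} days")
```

The unadjusted comparison makes the treatment look harmful because treated patients started out sicker; once severity enters the model, the treatment coefficient recovers the shorter stay built into the example. Real CPI data are noisy and involve many more variables, so this sketch shows only the direction of the adjustment.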
The CPI focuses on application—that is, on actionable findings that can be implemented to improve the process of care and treatment outcomes. The focus on implementation also governs who is involved in the study design, what data are collected, what questions are answered during analyses, and who designs the protocols or improvements in routine practice. Thus, CPI studies place a premium on the participation of clinicians in the study design, study execution, analyses of data, and implementation of study findings. Those actually providing the care are involved in all phases of the project, and their involvement also facilitates the buy-in needed to implement the findings and the care improvement processes.
RCT and CPI studies compared
Table 1. RCT and CPI Studies Compared
| Variables | RCT | CPI |
|---|---|---|
| Patient variables | Patient eligibility and stratification factors | Patient eligibility and stratification factors |
| | Eliminate patients who could bias results: comorbidities, more serious disease, etc | Use severity of illness to measure comorbidities and disease severity |
| | About 10%–15% of patients qualify | All patients qualify by measuring patient differences; none excluded |
| Process variables | Treatment protocol | Measure or record all treatments and interventions |
| | Specify explicitly every important element of the process of care for both treatment and control arms | Abstract information from charts based on existing practice |
| | Informed consent | Informed consent often not needed⁎ |
| Outcome variables | Powered for primary outcome | Many outcomes assessed |
| | Change based on evidence | Improvement based on evidence |
| Measurements/documentation | Limited number of patient variables, treatments, outcomes measured | Comprehensive holistic framework |
| | Variables specified precisely for all patient, treatment, and outcome measures | Variables specified precisely for all patient, treatment, and outcome measures |
| Database | Limited to the variables needed | Comprehensive and detailed |
| Result | Efficacy | Effectiveness |
| | Assigned causality | Association and assumed causality |
| Hypotheses | Typically 1 hypothesis | Typically many hypotheses |
| | Clearly defined at the start | Many and broad at the start |
| | Narrow and focused | Refined and new hypotheses generated by analytic findings |
| Local knowledge | Not dependent on local knowledge | Depends on local knowledge; entails participation by practicing clinicians |
| Confounders | Assumed not relevant to study or outcome | Affect outcomes and are relevant to include |

⁎ Informed consent may not be required if there is no experimental intervention and if there are no data collected beyond what is ascertained from medical records and from reports prepared by clinician in the course of usual care.

Table 1 compares RCT and CPI studies across several dimensions. We argue that CPI-like observational studies can help overcome some of the limitations that are inherent in RCTs. The conventional wisdom, however, is that RCT studies provide superior evidence relative to observational studies, yet there is growing empirical evidence that supports the use of well-designed observational studies akin to CPI studies relative to RCTs to discover what works best in medicine. Two studies23, 24 found that treatment effects from observational studies and RCTs were remarkably similar. Both studies concluded that they found little evidence that estimates of treatment effects in well-designed observational studies were either consistently larger than or qualitatively different from those obtained in RCTs. A third article found the same thing: comparing results on 45 topics with binary outcomes there was “very good correlation … between summary odds ratios of randomized and non-randomized studies (r=0.75, P<.001 for all studies, r=0.83, P<.001 for prospective studies).”25(p821)
These studies concluded that well-designed observational studies do not systematically overestimate the magnitude of the effects of treatment as compared with those in RCTs on the same topic. In addition, “the popular belief that only RCTs produce trustworthy results and that all observational studies are misleading does a disservice to patient care, clinical investigation, and the education of health care professionals.”24(p1892)
CPI has the ability to identify important associations in many diagnostic groups. Table 2 gives examples of CPI studies and selected treatments that were associated with better patient outcomes, their positive impacts on patients, and their positive impacts on health care systems (eg, reduced length of stay and/or costs).26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37
Table 2. Examples of CPI Studies, Selected Findings, and Their Effects
| CPI Project | Selected Significant Findings | Associations | Implications |
|---|---|---|---|
| Abdominal surgery32 | Early feeding (start within 48h after surgery); sufficient feeding (>60% of protein and calorie needs) | Shorter LOS; lower hospital cost | Even though they had higher average severity of illness, patients fed early and sufficiently had between 1.4 and 2.9 days shorter average LOS and between $1940 and $5281 lower average cost per case than patients fed either not early and/or not sufficiently |
| Abdominal surgery29 | Use of PCA pump | Higher rate of postoperative surgical wound infection | 10.7% infections for PCA users vs 4.0% for non-PCA users |
| National Pressure Ulcer Long-Term Care Study31 | Disposable briefs; supplement use; combination medications | Fewer pressure ulcers | Less suffering and lower cost to treat in nursing homes |
| Formulary limitations in the elderly35 | Greater formulary limitations | Higher health care resource utilization—more doctor office visits, more ED visits, and more hospitalizations per year | Common cost-containment strategies are associated with higher health care resource utilization |
| Asthma drugs36 | Use of newer asthma drugs | Lower overall drug costs and fewer PCP visits per year | Common cost-containment strategies are associated with higher health care resource utilization |
| Diabetes study37 | Self-monitoring of blood glucose along with consistent provider discussion | Better serum glucose control and fewer hospitalizations | Monitoring alone is not sufficient; discussion of results with providers is essential |
| Infants hospitalized with RSV30 | 33–35wk gestational age infants hospitalized with RSV | Higher intubation and longer ICU and hospital LOS | Consider prophylaxis for 33–35wk gestational age infants |
Discussion
A key advantage of a CPI study is the naturalistic view of medical treatment that is provided by retrospective data recorded routinely by medical providers. This view is critical to determine implications of treatment alternatives. In everyday practice, patients are assigned to different treatments based on the provider’s medical judgment, patient compliance is not artificially influenced, and monitoring of results is based on the provider’s need for information about how a patient is doing. All these factors can affect the effectiveness of medical treatment. CPI analyses help the team evaluate current practices and use the results to develop evidence-based improvements. Changes to the process of care rest on clinical data rather than on clinical opinion.
This approach contrasts directly with that of traditional RCTs. Because their participants are screened, selected, and subjected to scrutiny and intervention control beyond that occurring in everyday treatment, RCTs sometimes report results that are not broadly applicable in everyday medical treatment. For example, a recent study described a little-used 40-year-old drug, spironolactone, which was shown in a landmark clinical trial in 1999 to significantly reduce death and hospitalization for patients with congestive heart failure.38 There was a 4-fold jump over 18 months in prescriptions for the generic drug. That surge in use was accompanied by a tripling of hospital admissions and of deaths resulting from dangerous elevations of potassium. Many patients given the medicine likely would have been excluded from participating in the original clinical trial. The researchers noted that the new findings offer a provocative look at the difference between clinical trials and real-world medicine—and the potential dangers of applying trial results too widely. Patients in clinical studies typically are selected carefully to maximize the chance of showing a benefit and minimize side effects. Thus, trial patients represent only a subset of the types of patients doctors treat in their offices. Patients given the medicine in the aftermath of the 1999 study were on average 13 years older than participants in the original trial and were more likely to have diabetes. Also, the average dose in actual practice was 30mg, whereas 25mg was used in the study.38 CPI studies can provide evidence to determine those medications and interventions that work best for specific types of patients in real-world practice.
Another key advantage of CPI study methods is cost. Using existing data from medical records and computerized databases is generally much less costly than implementing a prospective RCT. Retrospective data also make a much larger number of observations available for analysis and for further hypothesis generation and refinement.
Observational studies do not scientifically prove the causality of any underlying relations, but they can point to hypotheses that can be evaluated clinically. A causal interpretation of CPI findings gains support in 3 ways: (1) no added confounders cause the significant association to disappear, (2) a change in outcome follows a change in treatment as predicted by the CPI model,4 and (3) repeated studies on the same topic yield the same findings. In short, CPI studies have shown predictive validity because observations show that outcomes change as predicted when practices are changed to those associated with better outcomes in the CPI analyses.
An RCT cannot always be conducted in rehabilitation medicine: sufficient evidence of treatment efficacy may not exist to justify one, projected sample sizes may be small, or the question may not be one that an RCT can study (eg, what is the role of psychologic disturbances in outcomes?). However, safeguards and protections built around RCTs (other than randomization) can be used in research methodologies such as observational studies, thereby increasing the level of evidence provided by studies using research designs other than RCTs. CPI does this by developing a comprehensive database of patient, treatment, and outcomes variables.
Instead of being viewed as competitive or mutually exclusive, RCTs and CPI should be considered complementary. Practice effects of RCTs can be tested in CPI studies, and CPI can be a progenitor of new RCTs.
Today, data needed to conduct a CPI study typically are abstracted by hand from existing paper medical records or documented prospectively on standardized forms. In the future, hospitals will use computerized clinical information systems (CISs). Then, rather than relying on labor-intensive manual data abstraction, needed patient, process, and outcome data can be found electronically in hospitals’ CISs. The efficiency and logistics of this new data acquisition modality will make it easier and less costly to conduct iterative CPI studies to determine best practices. Also, the resulting research-based protocols can be programmed into a hospital’s CIS to alert clinicians to the most appropriate protocols or interventions needed to address specific combinations of patient signs and symptoms. This should result in more consistent implementation of clinical practice guidelines and the protocols suggested by such guidelines.
CPI studies constitute a rigorous form of quasi-experimental research. Although they are weaker than RCTs on internal validity, they are stronger on external validity. CPI studies better represent actual conditions of practice, and they usually cost less and take less time. Because they do not insist on homogeneous patient populations, they allow the inclusion of patients with comorbidities or complications. To avoid confounding the link between the interventions and outcomes, they measure relevant patient characteristics using severity assessment tools and statistically adjust for differences in patients. Further, they accommodate departures from rigid treatment protocols by carefully monitoring and measuring actual treatments; they then use these data in statistical analysis. Because this approach does not disqualify large numbers of patients, it facilitates the generation of the number of cases needed for comparisons. Using multiple regression and other statistical techniques, researchers test process steps that are associated with the quality and cost outcomes sought for different kinds of patients.
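The role of severity adjustment described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the analysis used in any CPI study: the severity score, the treatment-assignment rule, the effect size, and the function names (`solve`, `ols`, `simulate`) are all hypothetical and invented for illustration. Sicker patients are made more likely to receive the treatment, so an unadjusted comparison understates its benefit, and a multiple regression that includes severity recovers an estimate close to the simulated effect.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate(n=5000, true_effect=5.0, seed=0):
    """Simulate patients whose treatment assignment depends on severity."""
    rng = random.Random(seed)
    severity, treated, outcome = [], [], []
    for _ in range(n):
        s = rng.gauss(50, 10)  # hypothetical severity score (higher = sicker)
        p_treat = 0.8 if s > 50 else 0.2  # assumed rule: sicker patients treated more often
        t = 1.0 if rng.random() < p_treat else 0.0
        # Assumed outcome model: severity lowers the outcome, treatment raises it.
        y = 100 - 0.8 * s + true_effect * t + rng.gauss(0, 5)
        severity.append(s)
        treated.append(t)
        outcome.append(y)
    return severity, treated, outcome


severity, treated, outcome = simulate()

# Unadjusted model: outcome ~ intercept + treatment
unadjusted = ols([[1.0, t] for t in treated], outcome)[1]

# Severity-adjusted model: outcome ~ intercept + treatment + severity
adjusted = ols([[1.0, t, s] for t, s in zip(treated, severity)], outcome)[1]

print(f"unadjusted treatment coefficient: {unadjusted:.2f}")
print(f"severity-adjusted treatment coefficient: {adjusted:.2f}")
```

In this simulation the adjustment works because severity is the only confounder and it is measured. In real data, the adjusted estimate is only as good as the severity measure and the other control variables available.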
Although CPI studies tend to focus on short-term outcomes, these outcomes include effects that are noticeable and holistically important to patients rather than only those that are physiologically measurable through laboratory or other tests. CPI studies are designed to be replicated easily so they can be undertaken at multiple sites.
In a commentary on alternatives to RCTs for traumatic brain injury rehabilitation, Whyte states, “It appears nearly impossible to successfully apply observational designs when the factors leading to the applications of different treatments are strongly related to the patient’s perceived prognosis.”39(p1320) CPI adjusts for this by using condition-specific, physiologic-based measures of severity such as the CSI and other control variables.
Conclusions
The most appropriate design for a specific study depends on the nature of the research question and the type of knowledge that is needed. Methodology alternatives such as CPI do not replace RCTs, but rather provide additional sources of systematic outcomes information that improve on the anecdotal and informal knowledge base that underlies much of clinical practice. CPI studies used by clinical teams have enormous power to enable health care providers, managed care organizations, payers, and patients to evaluate current practice and improve clinical decision making. These studies answer questions in the real world, where multiple variables and factors can affect the outcomes.
Acknowledgments
We acknowledge the role and contributions of our collaborators at each of the clinical sites represented in the Post-Stroke Rehabilitation Outcomes Project: Brendan Conroy, MD (Stroke Recovery Program, National Rehabilitation Hospital, Washington, DC); Richard Zorowitz, MD (Department of Rehabilitation Medicine, University of Pennsylvania Medical Center, Philadelphia, PA); David Ryser, MD (Neuro Specialty Rehabilitation Unit, LDS Hospital, Salt Lake City, UT); Jeffrey Teraoka, MD (Division of Physical Medicine and Rehabilitation, Stanford University, Palo Alto, CA); Frank Wong, MD, and LeeAnn Sims, RN (Rehabilitation Institute of Oregon, Legacy Health Systems, Portland, OR); Murray Brandstater, MD (Loma Linda University Medical Center, Loma Linda, CA); and Harry McNaughton, MD (Wellington and Kenepuru Hospitals, Wellington, NZ). We also acknowledge the role of Alan Jette, PhD (Rehabilitation Research and Training Center on Medical Rehabilitation Outcomes, Boston University, Boston, MA).
References
- Practical clinical trials (increasing the value of clinical research for decision making in clinical and health policy). JAMA. 2003;290:1624–1632
- The evidence thing. Ann Vasc Surg. 2004;18:265–270
- Redefining competition in health care. Harv Bus Rev. 2004;82:64–76, 136
- Horn SD, editor. Clinical practice improvement methodology (implementation and evaluation). New York: Faulkner & Gray; 1997
- Causal inference in the health sciences (a conceptual introduction). Health Serv Outcomes Res Method. 2002;2:189–220
- Statistics and causal inference (a review). Test. 2003;12:281–345
- Observational studies. New York: Springer; 2002
- Micro data, heterogeneity, and the evaluation of public policy (Nobel lecture). J Political Econ. 2001;109:673–748
- Causal effects in clinical and epidemiological studies via potential outcomes (concepts and analytical approaches). Annu Rev Public Health. 2000;21:121–145
- The estimation of causal effects from observational data. Annu Rev Sociol. 1999;25:659–706
- Models for sample selection bias. Annu Rev Sociol. 1992;18:327–350
- Econometrics in outcomes research (the use of instrumental variables). Annu Rev Public Health. 1998;19:17–34
- Using matching, instrumental variables, and control functions to estimate economic choice models. Rev Econ Stat. 2004;86:30–57
- The bias due to incomplete matching. Biometrics. 1985;41:103–116
- Difficulties with regression analyses of age-adjusted rates. Biometrics. 1984;40:437–443
- A study of the relationship between severity of illness and hospital cost in New Jersey hospitals. Health Serv Res. 1992;27:587–606; discussion 607–12
- Results of a collaborative quality improvement program on outcomes and costs in a tertiary critical care unit. Crit Care Med. 1999;27:1768–1774
- Development of a pediatric age- and disease-specific severity measure. J Pediatr. 2002;141:496–503
- Severity assessment in children hospitalized with bronchiolitis using the pediatric component of the Comprehensive Severity Index. Pediatr Crit Care Med. 2000;1:127–132
- Measuring medical complexity during inpatient rehabilitation after traumatic brain injury. Arch Phys Med Rehabil. 2005;86:1108–1117
- The relationship between severity of illness and hospital length of stay and mortality. Med Care. 1991;29:305–317
- Applying the clinical practice improvement approach to stroke rehabilitation (methods used and baseline results). Arch Phys Med Rehabil. 2005;86(12 Suppl 2):S16–S33
- A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000;342:1878–1886
- Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342:1887–1892
- Comparison of evidence of treatment effects in randomized and nonrandomized studies. JAMA. 2001;286:821–830
- Treatment of depression in older primary care patients in health maintenance organizations. Int J Psychiatry Med. 1997;27:215–231
- Agitation and depression in frail nursing home elderly with dementia (treatment characteristics and service use). Am J Geriatr Psychiatry. 2003;11:231–238
- Intended and unintended consequences of HMO cost containment strategies (results from the Managed Care Outcomes Project). Am J Manag Care. 1996;2:253–264
- Association between patient-controlled analgesia pump use and post-operative surgical site infection in intestinal surgery patients. Surg Infect (Larchmt). 2002;3:109–118
- Effect of prematurity on respiratory syncytial virus hospital resource use and outcomes. J Pediatr. 2003;143(5 Suppl):S133–S141
- The National Pressure Ulcer Long-Term Care Study (pressure ulcer development in long-term care residents). J Am Geriatr Soc. 2004;52:359–367
- Early and sufficient feeding reduces length of stay and charges in surgical patients. J Surg Res. 2001;95:73–77
- The effect of practice variation on resource utilization in infants hospitalized for viral lower respiratory illness. Pediatrics. 2001;108:851–855
- Complications in infants hospitalized for bronchiolitis or respiratory syncytial virus pneumonia. J Pediatr. 2003;143(5 Suppl):S142–S149
- Formulary limitations in the elderly (results from the Managed Care Outcomes Project). Am J Manag Care. 1998;4:1105–1113
- Newness of drugs and use of HMO services by asthma patients. Ann Pharmacother. 2001;35:990–996
- Frequency of blood glucose monitoring in relation to glycemic control in patients with type 2 diabetes. Diabetes Care. 2002;25:245–246
- Treatment of heart failure with spironolactone-trial and tribulations. N Engl J Med. 2004;351:526–528
- Traumatic brain injury rehabilitation (are there alternatives to randomized clinical trials). Arch Phys Med Rehabil. 2002;83:1320–1322
Supported by the National Institute on Disability & Rehabilitation Research (grant no. H133B990005) and the U.S. Army and Materiel Command (cooperative agreement award no. DAMD17-02-2-0032). The views, opinions, and/or findings contained in this article are those of the author(s) and should not be construed as an official Department of the Army position, policy, or decision unless so designated by other documentation. No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit upon the author(s) or upon any organization with which the author(s) is/are associated.
PII: S0003-9993(05)01183-4
doi:10.1016/j.apmr.2005.08.116
© 2005 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

