Session I - Polytrauma
"Statistical Difference" Does NOT Mean Clinically Important! An Evaluation of Orthopaedic Trauma Randomized Trials
Introduction/Purpose: Randomized trials are the highest level of evidence for comparing treatment methods. However, the results of such studies are important only to the extent that they affect clinical decision-making, primarily by improving patient outcomes. The purpose of this study was to evaluate the clinical relevance of randomized trials demonstrating a statistically significant difference in their results, using the established methods of "effect size" and "relative risk reduction." Effect size is a statistical measure of how large the difference in outcome between groups is relative to the natural variation within a group, and gives a general assessment of the importance of the finding; it is used for continuous variables (time to union, SF-36 scores, etc.). Relative risk reduction describes the proportional reduction in the rate of an adverse event in one treatment group relative to the other and is used for dichotomous variables (nonunion, infection, etc.).
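For reference, the standard definitions behind these two measures are given below (notation is ours, not taken from the trials reviewed): Cohen's effect size d compares the between-group mean difference with the pooled standard deviation, and the RRR compares event rates in the control (CER) and experimental (EER) groups:

$$ d = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}}, \qquad \mathrm{RRR} = \frac{\mathrm{CER} - \mathrm{EER}}{\mathrm{CER}} = 1 - \mathrm{RR}. $$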
Methods: All randomized controlled trials (RCTs) of orthopaedic trauma in adults published between 1/1/95 and 12/31/04 that reported at least one statistically significant difference between groups were evaluated. Baseline characteristics and treatment effects were abstracted by two reviewers. For continuous outcome measures, we calculated effect sizes (mean difference/standard deviation). Dichotomous variables were summarized as absolute risk differences and relative risk reductions (RRRs). Based on accepted standards, effect sizes >0.80 (from Cohen) and RRRs >50% were defined as large effects.
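As a minimal sketch of the two calculations described above (this is not the authors' code, and all numeric inputs are hypothetical):

    # Hedged sketch of the two treatment-effect metrics used in this study.
    # All numeric inputs below are hypothetical, for illustration only.

    def effect_size(mean_a: float, mean_b: float, sd_pooled: float) -> float:
        """Cohen's d: between-group mean difference in pooled-SD units."""
        return abs(mean_a - mean_b) / sd_pooled

    def relative_risk_reduction(control_rate: float, treatment_rate: float) -> float:
        """RRR = (CER - EER) / CER, for dichotomous outcomes such as nonunion."""
        return (control_rate - treatment_rate) / control_rate

    # Continuous outcome, e.g. time to union in weeks:
    d = effect_size(mean_a=16.0, mean_b=21.0, sd_pooled=5.0)
    print(f"effect size = {d:.2f}, large (>0.80): {d > 0.80}")   # -> 1.00, True

    # Dichotomous outcome, e.g. nonunion rate:
    rrr = relative_risk_reduction(control_rate=0.20, treatment_rate=0.08)
    print(f"RRR = {rrr:.0%}, large (>50%): {rrr > 0.50}")        # -> 60%, True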
Results: Our search yielded 433 RCTs, of which 265 papers were excluded by application of our eligibility criteria, leaving 168 eligible RCTs. Of these, 92 studies did not report the basic means, standard deviations, or proportions among treatment groups needed to calculate a treatment effect. Our final analysis therefore included 76 RCTs with sufficient data on 185 outcomes (121 continuous, 64 dichotomous). The mean effect size across studies was 1.7 ± 1.6 (range, 0.1-10), with 30% of significant study findings having an effect size <0.8. For dichotomous outcomes, the median RRR was 66%, with 47% of RRRs below 50%. Additionally, we identified a strong correlation (r = -0.80, P < 0.001) between increasing treatment effect and decreasing number of study outcome events, indicating that smaller studies were more likely to produce higher treatment effects, further diminishing their importance.
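The small-study association reported above amounts to a simple correlation of per-trial event counts against treatment effects; the sketch below illustrates the idea with invented trial data, not the 76 RCTs analyzed here:

    import math

    def pearson_r(xs, ys):
        # Pearson product-moment correlation of two equal-length sequences.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    # Hypothetical trials: total outcome events and the effect size each reported.
    events  = [12, 25, 40, 80, 150, 300]
    effects = [3.1, 2.4, 1.8, 1.1, 0.7, 0.5]
    print(f"r = {pearson_r(events, effects):.2f}")  # strongly negative: smaller trials, larger effects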
Discussion/Conclusion: A large percentage of RCTs that report "statistically different" results are likely NOT "clinically important." Additionally, smaller studies were more likely to yield very large treatment effects, which may overestimate the true effect of surgical interventions. Surgeons should be aware that statistical significance does not prove clinical relevance!