OTA 1997 Posters - Foot & Ankle Fractures
Effects of Binary Decision Making on the Classification of Fractures of the Ankle
William L. Craig, III, MD, Douglas R. Dirschl, MD
Chapel Hill, North Carolina, USA
Introduction: Multiple authors have studied the interobserver reliability of various fracture classification systems, but few have investigated the decision-making process itself. The AO/ASIF comprehensive classification of fractures (CCF) is commonly used by orthopaedic traumatologists. In its original form, the system requires the observer to choose among three options at each decision point. The CCF has recently been modified to incorporate binary decision making, which requires the observer to choose between only two options at each decision point. The stated goal of binary decision making is to produce more consistent results between observers. To our knowledge, this modification was made without any prior validation of the effectiveness of binary decision making.
Purpose: The purpose of this study was to evaluate the effect of binary decision making on interobserver reliability, using the AO/ASIF classification of fractures of the malleoli (segment 44) as a model.
Methods: Radiographs of 50 ankle fractures were classified by six observers: two PGY-2 residents, two PGY-5 residents, and two orthopaedic attendings experienced in the treatment of fractures. Each observer first classified the radiographs in random order using the original AO/ASIF system. While classifying the fractures, observers were allowed to refer to a copy of the classification system but were not allowed to ask questions about it. During the same session, each observer then classified the same 50 radiographs, again in random order, using the binary form of the AO/ASIF system. While using binary decision making, each observer was required to answer each binary decision before proceeding to the next. Classification of radiographs was limited to type and group. Interobserver reliability was assessed using Kappa statistics.
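The abstract does not state which Kappa variant was used or how the mean was formed. As a minimal sketch, assuming Cohen's Kappa computed for every pair of observers and then averaged, the calculation could look like the following. The function names and the toy ratings are hypothetical and are not the study data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two observers who labeled the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of cases given the same label by both.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each observer's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both observers used one identical label for every case.
        return 1.0
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings):
    """Average Cohen's kappa over all observer pairs (an assumed scheme)."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("need at least two observers")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy ratings: three observers, six radiographs, AO types A/B/C.
toy_ratings = [
    ["A", "B", "B", "C", "B", "A"],
    ["A", "B", "C", "C", "B", "A"],
    ["A", "B", "B", "C", "B", "B"],
]
toy_mean_kappa = mean_pairwise_kappa(toy_ratings)
```

With six observers this scheme averages 15 pairwise values. A multi-rater statistic such as Fleiss' Kappa would be an alternative; the abstract does not say which approach was taken.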
Results: Using the original AO/ASIF system, the mean Kappa values for interobserver reliability were 0.77 for classification by type only and 0.61 for classification by type and group. When binary decision-making was enforced, the corresponding mean Kappa values were 0.78 and 0.62. There was no statistically significant difference between nonbinary and binary decision-making for classification by type alone (p > 0.5) or by type and group (p > 0.5). Excluding the least experienced observers (the PGY-2 residents) had no significant effect on the results.
Discussion: The interobserver reliability of ankle fracture classification has been reported previously. Neilson found the percentage agreement of the Lauge-Hansen classification of ankle fractures to be 61-65% and concluded that the system was difficult to use in a reproducible manner. Thomsen compared the Lauge-Hansen and Weber systems, reporting mean Kappa values of 0.49 for the Lauge-Hansen system and 0.58 for the Weber system. The present investigation found higher agreement than Thomsen reported, with a Kappa value of 0.77 for classification by type according to the original AO classification. In the current study, interobserver reliability for type and group classification of ankle fractures was substantial with both the original AO classification and its binary modification. However, although binary decision-making was strictly enforced in this investigation, its use did not improve interobserver reliability over that of the original AO system in the classification of ankle fractures.
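The term "substantial" matches the widely used Landis and Koch descriptive bands for Kappa, although the abstract does not name the scale it relies on. Assuming those bands, a small hypothetical helper shows where the values quoted above would fall.

```python
def landis_koch_label(kappa):
    """Descriptive label for a kappa value using the Landis and Koch bands.

    Assumption: the abstract does not say which scale the authors used;
    these bands are shown only as a common convention.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values quoted in the abstract, mapped to their descriptive bands.
quoted_values = {
    "AO original, type": 0.77,
    "AO original, type and group": 0.61,
    "AO binary, type": 0.78,
    "AO binary, type and group": 0.62,
    "Lauge-Hansen (Thomsen)": 0.49,
    "Weber (Thomsen)": 0.58,
}
labels = {name: landis_koch_label(value) for name, value in quoted_values.items()}
```

Under this assumed scale, all four AO values fall in the "substantial" band, while both values attributed to Thomsen fall in the "moderate" band.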
Conclusion: The interobserver reliability of both the original AO/ASIF classification of ankle fractures and the new modification employing binary decision-making is good. Although interobserver reliability was better for type alone than for type and group, both remained in the acceptable range. The results of this study cast doubt on the effectiveness of binary decision-making in improving interobserver reliability in the classification of fractures. To our knowledge, this is the first study comparing the original AO/ASIF fracture classification system with the more recent AO/ASIF system utilizing binary decision-making. Further study of other fractures may help elucidate the effect of binary decision-making on interobserver reliability in the classification of fractures.