Results
A total of 272 patients were included in the final analysis. The
database used by Leonardi et al. (19) and Espada et al. (18)
contained 204 patients; this was reduced to 194 after incomplete data
were identified in 10 cases. The database used by Rao et al. (20)
included 78 patients, all with complete data, so all were included.
Summary data are presented in Tables 1 and 2. Overall, the AAGL stage
assigned by the three observers correctly predicted the corresponding
AAGL surgical complexity level in 175 to 180 of the 272 cases (64.3% to
66.2%). The
overall performance of the AAGL system in terms of kappa and weighted
kappa scores, accuracy, sensitivity, specificity, positive predictive
value, negative predictive value, positive likelihood ratio, and
negative likelihood ratio to predict AAGL level of laparoscopic surgical
complexity is summarised in Tables 3, 4 and 5. The best performance
among the three observers for sensitivity, specificity, PPV and NPV
(95% CIs) for stage 1 to predict complexity level A was 98.7% (96.8 to
100.5), 64.2% (55.8 to 72.8), 77.0% (71.0 to 82.9) and 97.5% (94.1 to
100.9), respectively. For stage 2 to predict level B, the values were
30.4% (11.6 to 49.2), 95.6% (93.0 to 98.1), 35.3% (12.6 to 58.0) and
93.5% (90.5 to 96.6), respectively. For stage 3 to predict level C, the
values were 10.0% (3.4 to 16.5),
94.8% (91.6 to 97.9), 42.1% (19.9 to 64.3) and 71.5% (65.9 to 77.1),
respectively. For stage 4 to predict level D, the values were 95.0%
(85.4 to 104.5), 91.7% (88.2 to 95.1), 47.5% (32.0 to 63.0) and 99.6%
(98.7 to 100.4), respectively. The performance of score thresholds 8,
15 and 21 for
predicting corresponding skill levels (A – D) is reported in Table 6,
and corresponding ROC curves are shown in Figure 1.
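
For readers who wish to reproduce this kind of summary, the following minimal Python sketch shows how sensitivity, specificity, PPV, NPV, likelihood ratios and accuracy can be derived from a single 2x2 table, with normal-approximation (Wald) 95% confidence intervals. The function names and the example counts are illustrative assumptions: the counts were chosen only because they are arithmetically consistent with the stage 4 / level D figures quoted above, and they are not taken from the study tables. The choice of a Wald interval is also an assumption; it is not clipped to the 0 to 100% range, which would be consistent with reported upper limits slightly above 100%.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    The interval is deliberately not clipped to [0, 1], so it can
    extend below 0 or above 1 when p is close to a boundary.
    """
    half_width = z * math.sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)


def diagnostic_metrics(tp, fp, fn, tn):
    """Summary statistics for one stage-versus-level 2x2 table.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives and true negatives. Proportions are returned as
    (estimate, (lower, upper)) tuples.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "sensitivity": (sens, wald_ci(sens, tp + fn)),
        "specificity": (spec, wald_ci(spec, tn + fp)),
        "ppv": (ppv, wald_ci(ppv, tp + fp)),
        "npv": (npv, wald_ci(npv, tn + fn)),
        # Likelihood ratios are undefined when the denominator is zero.
        "lr_pos": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_neg": (1 - sens) / spec if spec > 0 else math.inf,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts (assumed, not from the study tables), n = 272.
metrics = diagnostic_metrics(tp=19, fp=21, fn=1, tn=231)
for name in ("sensitivity", "specificity", "ppv", "npv"):
    estimate, (lower, upper) = metrics[name]
    print(f"{name}: {100 * estimate:.1f}% ({100 * lower:.1f} to {100 * upper:.1f})")
```

An interval that respects the 0 to 100% bounds (for example a Wilson or exact binomial interval) could be substituted in `wald_ci` without changing the rest of the sketch.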