Introduction
Abnormal uterine bleeding is a common complaint in premenopausal women:
five percent of women experience complaints of abnormal uterine
bleeding (2). Endometrial ablation (EA) is one of the treatment
options for this common problem. Because of lower intra-operative
complication risks, shorter recovery time, lower post-operative
morbidity, and low costs, EA seems to be a less invasive and less costly
surgical treatment for menorrhagia than hysterectomy (3–7). However,
long-term follow-up shows a decrease in patient satisfaction and
treatment efficacy. Because it provides permanent relief, the more
invasive hysterectomy remains the most effective treatment of abnormal
uterine bleeding (8–15).
According to the literature, several factors present before endometrial
ablation appear to influence the success or failure rate of this
procedure. Younger age, complaints of dysmenorrhea, parity of five or
more, a thicker pre-procedural endometrium, a duration of menstruation
of more than seven days, presence of an intramural leiomyoma on
transvaginal sonography, a history of sterilization or caesarean
section, and a longer uterine depth are some of the factors that may
negatively influence the outcome (1,2,8,9,11–18).
To optimize the counselling of patients with abnormal uterine bleeding,
a prediction model based on the combined influence of the
above-mentioned predictors could provide better insight into the
individual prognosis of endometrial ablation. In times of personalised
medicine, this can create better individual care, leading to fewer
re-interventions, lower healthcare costs and greater patient
satisfaction. With the use of a prediction model, shared decision making
can be optimized (19).
For this reason, Stevens et al. (1) developed two multivariate prediction
models to help counsel patients on the risk of failure of EA and of
surgical re-intervention within two years after EA. The developed
prediction models have clinically acceptable c-indices of 0.68 and 0.71,
respectively. In addition, Stevens et al. are performing an external
validation of these two prediction models, using retrospective data from
similar patient groups in two non-university teaching hospitals in the
Netherlands. The results of this validation will follow. In the field of
gynaecology, many prediction models are developed using multivariate
logistic regression as the standard approach. These models are based on a
combination of various predictors that are significantly related to the
outcome of interest. However, this method cannot automatically estimate
the interconnection between predictors and can therefore overestimate
the influence of an individual predictor (20,21).
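To make the c-index values quoted above concrete: for a binary outcome
such as re-intervention versus no re-intervention, the c-index is the
probability that a randomly chosen patient with the event receives a
higher predicted risk than a randomly chosen patient without it (0.5 is
chance level, 1.0 is perfect discrimination). The following is a minimal
sketch of that definition only. The function name c_index and the example
numbers are illustrative assumptions and are not taken from Stevens et
al. (1).

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome.

    y_true  : list of 0/1 outcomes (1 = event, e.g. re-intervention)
    y_score : list of predicted risks, same length as y_true
    Counts every (event, non-event) pair; a pair is concordant when the
    event has the higher predicted risk, and ties count as one half.
    """
    concordant = 0.0
    pairs = 0
    for yi, si in zip(y_true, y_score):
        if yi != 1:
            continue
        for yj, sj in zip(y_true, y_score):
            if yj != 0:
                continue
            pairs += 1
            if si > sj:
                concordant += 1.0
            elif si == sj:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Made-up illustration data (not study data): 2 events, 3 non-events.
outcomes = [1, 1, 0, 0, 0]
risks = [0.8, 0.4, 0.5, 0.3, 0.2]
print(round(c_index(outcomes, risks), 3))  # 5 of the 6 pairs are concordant
```

For a binary outcome this quantity equals the area under the ROC curve,
so in practice a library routine for the AUC would normally be used
instead of the explicit double loop.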
We were also interested in other statistical techniques for developing a
prediction model. In recent years, machine learning (ML) methods have
been increasingly used in the development of clinical prediction models.
Machine learning is a scientific discipline that focuses on models that
learn directly and automatically from data (20,22). A potential advantage
of machine learning methods over traditional statistical strategies is
the possibility of capturing complex, nonlinear relationships in the
data (23,24). ML algorithms use training data with well-defined input
and output variables. This makes it possible to define a model with
predictors that can be applied to new and similar data. In contrast to
logistic regression models, this can be done without a priori
assumptions about which variables are relevant (25).
Random forest is a machine learning method used for classification and
regression that operates by constructing a large ensemble of decision
trees on training data (22,23,26). Each tree in the random forest is
built using a bootstrap sample randomly drawn from the training dataset.
This reduces variance and corrects for a single decision tree's
tendency to overfit the training set. Each tree in the
forest gives an individual prediction of the outcome measure. For a
classification problem (in this case, surgical re-intervention or no
surgical re-intervention after EA), the final random forest model
combines the predictions of all the trees in the forest, for example by
majority vote or by averaging the predicted probabilities (21,23,27).
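The procedure described above (bootstrap sampling, one tree per sample,
and combining the trees' predictions) can be sketched in a few lines.
This is a toy illustration under strong simplifying assumptions and not
the model developed in this study: each "tree" is a single-split decision
stump, the data are synthetic, and all names (fit_stump, fit_forest,
predict_proba, ROWS) are hypothetical.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap sample)."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, feature_indices):
    """Fit a one-split tree: choose the (feature, threshold) with fewest errors."""
    best = None
    for f in feature_indices:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            # Each side predicts its majority class (ties go to class 0).
            left_label = int(sum(left) * 2 > len(left))
            right_label = int(sum(right) * 2 > len(right)) if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def fit_forest(rows, n_trees=25, n_features_per_tree=1, seed=0):
    """Build n_trees stumps, each on its own bootstrap sample and feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        features = rng.sample(range(n_features), n_features_per_tree)
        forest.append(fit_stump(sample, features))
    return forest


def predict_proba(forest, x):
    """Fraction of trees that vote for class 1."""
    return sum(tree(x) for tree in forest) / len(forest)


def predict(forest, x):
    """Majority vote over all trees (class 1 when at least half vote for it)."""
    return int(predict_proba(forest, x) >= 0.5)


# Synthetic rows: ((feature_0, feature_1), outcome). Not clinical data.
ROWS = [((1, 1), 0), ((2, 2), 0), ((3, 1), 0), ((2, 3), 0),
        ((7, 8), 1), ((8, 7), 1), ((9, 9), 1), ((8, 8), 1)]

forest = fit_forest(ROWS)
print(predict_proba(forest, (2, 2)), predict(forest, (9, 9)))
```

A full implementation grows deep trees and draws a new random subset of
predictors at every split; the stump version keeps only the two
ingredients named in the text, the bootstrap sample per tree and the
aggregation over trees.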
The aim of the study was to develop a random forest prediction model to
predict the chance of surgical re-intervention within two years after
EA. Furthermore, we aimed to compare the performance of the random
forest model with that of the previously published multivariate
logistic regression model (1). In both models, surgical
re-intervention within two years after EA is used as the primary outcome
measure.