2. Methods
2.1 Patients
We included 878 patients with CRS who underwent nasal endoscopic surgery
in the Department of Otolaryngology, Head and Neck Surgery of our
hospital, from October 2016 to June 2021. The study protocol was
approved by the institutional review board, which waived the requirement
for informed consent. The inclusion criteria were as follows: diagnosis
of chronic rhinosinusitis based on the European Position Paper on
Rhinosinusitis and Nasal Polyps 2020 and a sinus CT scan performed
within 2 weeks before surgery. The exclusion criteria were as follows:
fungal sinusitis identified on pathological examination; sinus cystic
fibrosis; unclear CT images; posterior nostril obstruction; and a
history of radiotherapy to the head and neck.
2.2 Histological examination
Nasal polyps and pathological nasal mucosal tissues collected
intraoperatively were fixed, embedded, and sectioned. Sections underwent
conventional hematoxylin-eosin staining and were observed under a
high-power field (HPF) at 400× magnification. In each section, ten
fields were randomly selected, and the eosinophils and inflammatory
cells were counted. The mean counts of eosinophils and inflammatory
cells per field were then calculated. ECRS was diagnosed when the
eosinophil-to-inflammatory cell count ratio (Eos%) was ≥10%, and
NECRS was diagnosed when the Eos% was <10% [18].
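
The diagnostic rule above can be illustrated with a minimal sketch. It
assumes that Eos% is the ratio of the mean eosinophil count to the mean
inflammatory cell count across the sampled fields, and that the
inflammatory cell count includes eosinophils; the function name
classify_crs and the example counts are hypothetical and are not taken
from the study.

```python
from statistics import mean

ECRS_THRESHOLD = 0.10  # Eos% cutoff stated in the text (>= 10% is ECRS)


def classify_crs(eos_counts, inflammatory_counts, threshold=ECRS_THRESHOLD):
    """Return (eos_ratio, label) from per-field cell counts.

    Assumption: the ratio is taken between the mean counts over all
    sampled high-power fields, not averaged field by field.
    """
    if not eos_counts or len(eos_counts) != len(inflammatory_counts):
        raise ValueError("both lists need the same non-zero number of fields")
    mean_inflammatory = mean(inflammatory_counts)
    if mean_inflammatory == 0:
        raise ValueError("no inflammatory cells were counted")
    ratio = mean(eos_counts) / mean_inflammatory
    label = "ECRS" if ratio >= threshold else "NECRS"
    return ratio, label


# Hypothetical counts for ten fields of one section.
eos = [12, 8, 15, 10, 9, 11, 14, 7, 13, 11]
inflammatory = [100] * 10
ratio, label = classify_crs(eos, inflammatory)
print(f"Eos% = {ratio:.1%}, diagnosis = {label}")
```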
2.3 Image collection and Pre-processing
The patients underwent sinus scanning with 64-slice spiral CT using the
following parameters: tube voltage, 120 kV; tube current, 200 mA. Both a
soft tissue window (window width: 300–350 HU, window level: 30–50 HU)
and a bone window (window width: 1000–2000 HU, window level: 300–350
HU) were used. Exported CT images were saved in DICOM format and
converted to PNG format for training the segmentation and
classification models. The axial CT slices showing lesions were used to
build the dataset.
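
As an illustration of how a window setting maps Hounsfield units (HU) to
the 8-bit gray values stored in a PNG, the following sketch applies a
linear window to a small 2D list. The helper name apply_window and the
specific settings (width 350 HU, level 40 HU, chosen from within the
soft tissue ranges above) are assumptions; the text does not describe
the actual conversion code or the exact values used.

```python
def apply_window(hu_rows, width, level):
    """Map a 2D list of HU values to 8-bit gray values (0-255).

    Values below (level - width / 2) become 0 and values above
    (level + width / 2) become 255; values in between scale linearly.
    """
    low = level - width / 2.0
    high = level + width / 2.0
    windowed = []
    for row in hu_rows:
        out_row = []
        for hu in row:
            clipped = min(max(hu, low), high)
            out_row.append(int(round((clipped - low) / (high - low) * 255)))
        windowed.append(out_row)
    return windowed


# Hypothetical HU samples: air, soft tissue, dense bone.
slice_hu = [[-1000, 40, 1000]]
print(apply_window(slice_hu, width=350, level=40))
```

With these settings, air is mapped to black and bone to white, while
soft tissue falls in the middle of the gray scale.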
To establish the segmentation dataset, 1,365 images were randomly
selected from patients with ECRS and NECRS, and the nasal cavity and
sinus regions in each image were annotated using ITK-SNAP software. We
defined the nasal cavity and sinus regions as follows: the inferior side
is the plane where the bilateral inferior turbinates begin to appear,
while the superior side is the plane where the straight gyrus and
orbital gyrus of the frontal lobe begin to appear. At the maxillary
level, the anterior side is the anterior nostril or nasal bone, the
lateral sides are the anterolateral and posterior walls of the maxillary
sinus or the lateral wall of the sphenoid, and the posterior side is the
posterior nostril or posterior wall of the sphenoid. At the ethmoid
level, the anterior side is the nasal or frontal bone, the lateral sides
are the lateral walls of the ethmoid and sphenoid, and the posterior
side is the posterior wall of the posterior ethmoid or sphenoid (Figure
1).
The dataset of the classification model comprised 56,892 images,
including ECRS (343 patients and 22,671 images) and NECRS (535 patients
and 34,221 images). Patients in each category were allocated to the
training and validation cohorts at a ratio of 4:1. Since not all sinus
CT slices showed between-disease differences and we sought to achieve
accurate classification, we constructed two datasets, one using each
image as a unit and one using each patient as a unit. When using
individual images, each image was labeled and input into the
classification network for learning. When using patients as units, we
labeled each patient and set the average probability value of all the
images obtained from that patient as the patient's probability value,
which was then input into the model for learning.
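
The patient-level split and the averaging of image probabilities
described above can be sketched as follows. The function names, the
fixed random seed, and the 0.5 decision cutoff are assumptions made for
illustration; the text specifies only the 4:1 ratio and the use of the
per-patient mean probability.

```python
import random
from collections import defaultdict


def split_patients(patient_ids, val_fraction=0.2, seed=0):
    """Split patient IDs 4:1 so all images of a patient share one cohort."""
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    n_val = round(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]  # (training, validation)


def patient_probabilities(image_predictions):
    """Average per-image ECRS probabilities for each patient.

    image_predictions: iterable of (patient_id, probability) pairs.
    """
    grouped = defaultdict(list)
    for patient_id, probability in image_predictions:
        grouped[patient_id].append(probability)
    return {pid: sum(p) / len(p) for pid, p in grouped.items()}


def patient_labels(patient_probs, cutoff=0.5):
    """Assign a label per patient; the 0.5 cutoff is an assumption."""
    return {pid: ("ECRS" if p >= cutoff else "NECRS")
            for pid, p in patient_probs.items()}


# Hypothetical per-image outputs for two patients.
predictions = [("p1", 0.9), ("p1", 0.7), ("p2", 0.2), ("p2", 0.4)]
probs = patient_probabilities(predictions)
print(probs, patient_labels(probs))
```

Applying split_patients separately to the ECRS and NECRS patient lists
keeps the 4:1 ratio within each category.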
2.4 Network Architecture
Our development platform was based on the PyTorch library (version
1.9.0) with CUDA (version 10.0) for GPU (NVIDIA T4) acceleration on a
Windows operating system (Server 2019 Datacenter, 64-bit). We adapted
the U-Net and DeepLabv3 networks to build semantic segmentation models.
The 1,365 annotated images were randomly divided into training and
validation cohorts at a ratio of 4:1. Each model was trained for 20
epochs using the RMSprop optimizer, with the batch size and initial
learning rate set at 32 and 0.001, respectively. We selected the
better-performing model and used a rectangular segmentation method to
segment the nasal cavity and sinus areas on the CT images (Figure 2).
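
One plausible reading of the rectangular segmentation step is that each
CT image is cropped to the bounding rectangle of the predicted nasal
cavity and sinus mask. The sketch below illustrates that interpretation
only; the helper names bounding_box and crop_to_box are hypothetical,
and the text does not specify how the rectangle was actually derived.

```python
def bounding_box(mask):
    """Return inclusive (top, bottom, left, right) of nonzero mask pixels.

    Returns None when the mask is empty.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_to_box(image, box):
    """Crop a 2D list image to the inclusive bounding box."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4 x 5 mask and an image whose pixel value encodes (row, col).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
image = [[r * 10 + c for c in range(5)] for r in range(4)]
print(crop_to_box(image, bounding_box(mask)))
```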
Since different neural networks may have different preferences for the
data distribution, type, and dataset size, we built models with four
common pre-trained classification networks (efficientnet_b0, resnet50,
inception_resnet_v2, and Xception) to avoid bias toward any single
architecture. These networks were trained using the SGD optimizer with a
batch size of 32, and each model was trained for 40 epochs.
2.5 Statistical Analyses
Statistical analyses were performed using SPSS 22.0 statistical
software. Normally distributed measurement data are expressed as mean ±
standard deviation (x̄ ± s) and were analyzed using an
independent-samples t-test. Count data are expressed as frequencies and
were analyzed using the chi-square test. Statistical significance was
set at P < 0.05. Segmentation model performance was evaluated using the
Dice similarity coefficient, and classification model performance was
evaluated using the receiver operating characteristic (ROC) curve,
accuracy, and confusion matrix. Moreover, gradient-weighted class
activation maps (Grad-CAMs) were generated by extracting feature maps
from the final convolutional layers to verify the reliability of the
model.
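
For reference, the Dice similarity coefficient and the accuracy derived
from a confusion matrix can be computed as in the sketch below. It uses
the standard definitions, Dice = 2|A ∩ B| / (|A| + |B|) and accuracy =
(TP + TN) / total; the function names, the convention that two empty
masks score 1.0, and the example data are assumptions and do not come
from the study.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary 2D masks."""
    intersection = pred_area = true_area = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            p, t = int(bool(p)), int(bool(t))
            intersection += p * t
            pred_area += p
            true_area += t
    if pred_area + true_area == 0:
        return 1.0  # assumption: two empty masks count as full agreement
    return 2.0 * intersection / (pred_area + true_area)


def confusion_counts(y_true, y_pred, positive="ECRS"):
    """Count TP, FP, TN, and FN with `positive` as the positive class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Accuracy = (TP + TN) / all cases."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Hypothetical masks and labels.
pred = [[1, 1, 0], [0, 1, 0]]
true = [[1, 0, 0], [0, 1, 1]]
print(round(dice_coefficient(pred, true), 3))
y_true = ["ECRS", "ECRS", "NECRS", "NECRS", "ECRS"]
y_pred = ["ECRS", "NECRS", "NECRS", "ECRS", "ECRS"]
print(accuracy(confusion_counts(y_true, y_pred)))
```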