2. Methods
2.1 Patients
We included 878 patients with chronic rhinosinusitis (CRS) who underwent nasal endoscopic surgery in the Department of Otolaryngology, Head and Neck Surgery of our hospital from October 2016 to June 2021. The study protocol was approved by the institutional review board, which waived the requirement for informed consent. The inclusion criteria were as follows: a diagnosis of CRS based on the European Position Paper on Rhinosinusitis and Nasal Polyps 2020 and a sinus CT scan obtained within 2 weeks before surgery. The exclusion criteria were as follows: fungal sinusitis identified on pathological examination; sinus cystic fibrosis; unclear CT images; posterior nostril obstruction; and a history of radiotherapy to the head and neck.
2.2 Histological examination
Nasal polyps and pathological nasal mucosal tissues obtained intraoperatively were fixed, embedded, and sectioned. Sections underwent conventional hematoxylin-eosin staining and were observed under a high-power field (HPF) at 400× magnification. In each section, ten fields were randomly selected, and the eosinophils and inflammatory cells in each field were counted. The mean counts of eosinophils and inflammatory cells per field were then calculated. ECRS was diagnosed when the eosinophil to inflammatory cell count ratio (Eos%) was ≥10%, and NECRS was diagnosed when the Eos% was <10% [18].
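The diagnostic rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and it assumes that Eos% is the ratio of the mean eosinophil count to the mean inflammatory cell count across the ten fields, with eosinophils included in the inflammatory cell count.

```python
from statistics import mean

ECRS_THRESHOLD = 10.0  # percent; the Eos% cut-off stated in the text


def eos_percent(eos_counts, inflammatory_counts):
    """Return Eos% from per-field counts (one entry per high-power field).

    Assumption: Eos% = mean eosinophils per field / mean inflammatory
    cells per field, expressed as a percentage.
    """
    if not eos_counts or len(eos_counts) != len(inflammatory_counts):
        raise ValueError("both lists need the same non-zero number of fields")
    mean_inflammatory = mean(inflammatory_counts)
    if mean_inflammatory == 0:
        raise ValueError("no inflammatory cells were counted")
    return 100.0 * mean(eos_counts) / mean_inflammatory


def classify_crs(eos_counts, inflammatory_counts):
    """Label a specimen as ECRS (Eos% >= 10) or NECRS (Eos% < 10)."""
    if eos_percent(eos_counts, inflammatory_counts) >= ECRS_THRESHOLD:
        return "ECRS"
    return "NECRS"
```

For example, `classify_crs([12] * 10, [100] * 10)` would label a specimen with 12 eosinophils per 100 inflammatory cells in every field as ECRS.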
2.3 Image collection and Pre-processing
The patients underwent sinus scanning with 64-slice spiral CT using the following parameters: tube voltage, 120 kV; tube current, 200 mA. Images were displayed in a soft tissue window (window width: 300–350 HU, window level: 30–50 HU) and a bone window (window width: 1000–2000 HU, window level: 300–350 HU). Exported CT images were saved in the DICOM format and converted to the PNG format for training the segmentation and classification models; the axial CT slices showing lesions were used to build the dataset.
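Converting a CT slice to an 8-bit PNG requires mapping Hounsfield units to grey levels through a display window. The sketch below shows the standard linear windowing step only; it is an illustration under stated assumptions, not the conversion pipeline used in the study. The function name is hypothetical, the example width and level (350 HU and 40 HU) are simply values inside the soft tissue range given above, and reading DICOM files or writing PNG files (typically done with libraries such as pydicom and Pillow) is not shown.

```python
def apply_window(hu_rows, width, level):
    """Map a 2D list of HU values to 8-bit grey levels with a linear window.

    Values below (level - width / 2) become 0, values above
    (level + width / 2) become 255, and values in between scale linearly.
    """
    low = level - width / 2.0
    high = level + width / 2.0
    windowed = []
    for row in hu_rows:
        out_row = []
        for hu in row:
            clipped = min(max(hu, low), high)
            out_row.append(int(255 * (clipped - low) / (high - low) + 0.5))
        windowed.append(out_row)
    return windowed


# Illustrative soft tissue window (assumed values within the stated range).
soft_tissue = apply_window([[-1000, 40, 1000]], width=350, level=40)
```

A bone window would use the same function with a wider window, for example a width of 1500 HU and a level of 300 HU, both again chosen only for illustration.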
To establish the segmentation dataset, 1,365 images were randomly selected from patients with ECRS and NECRS, and the nasal cavity and sinus regions in each image were annotated using the ITK-SNAP software. We defined the boundaries of the nasal cavity and sinus regions as follows: the inferior boundary is the plane where the bilateral inferior turbinates begin to appear, while the superior boundary is the plane where the straight gyrus and orbital gyrus of the frontal lobe begin to appear. At the maxillary level, the anterior boundary is the anterior nostril or nasal bone, the lateral boundaries are the anterolateral and posterior walls of the maxillary sinus or the lateral wall of the sphenoid, and the posterior boundary is the posterior nostril or posterior wall of the sphenoid. At the ethmoid level, the anterior boundary is the nasal or frontal bone, the lateral boundaries are the lateral walls of the ethmoid and sphenoid, and the posterior boundary is the posterior wall of the posterior ethmoid or sphenoid (Figure 1).
The classification dataset comprised 56,892 images from patients with ECRS (343 patients, 22,671 images) and NECRS (535 patients, 34,221 images). Patients in each category were allocated to the training and validation cohorts at a ratio of 4:1. Since not all sinus CT slices showed between-disease differences and we sought to achieve accurate classification, we constructed two datasets, one using each image as the unit and one using each patient as the unit. When using individual images, each image was labeled and input into the classification network for learning. When using patients as units, we labeled each patient and took the average probability value of all images obtained from that patient as the patient’s probability value, which was then input into the model for learning.
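The two bookkeeping steps in this paragraph, a 4:1 split made at the patient level within each category and the averaging of image-level probabilities into one value per patient, can be sketched as follows. This is a minimal sketch under assumptions: the function names, the patient identifiers, and the fixed random seed are hypothetical, and the exact rounding of cohort sizes in the study is not stated in the text.

```python
import random
from collections import defaultdict


def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Randomly split the patients of ONE category into training/validation.

    Splitting by patient (not by image) keeps all slices from a patient in
    the same cohort. Call once per category for a stratified 4:1 split.
    """
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def patient_probabilities(image_probs):
    """Average image-level ECRS probabilities into one value per patient.

    image_probs: iterable of (patient_id, probability) pairs.
    """
    grouped = defaultdict(list)
    for patient_id, prob in image_probs:
        grouped[patient_id].append(prob)
    return {pid: sum(probs) / len(probs) for pid, probs in grouped.items()}


# Hypothetical identifiers for the 343 ECRS and 535 NECRS patients.
ecrs_train, ecrs_val = split_patients([f"E{i}" for i in range(343)])
necrs_train, necrs_val = split_patients([f"N{i}" for i in range(535)])
```

Averaging per patient means that a few uninformative slices have limited influence on the patient-level result, which matches the motivation given in the text.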
2.4 Network Architecture
Our development platform was based on the PyTorch library (version 1.9.0) with CUDA (version 10.0) for GPU (NVIDIA T4) acceleration on a Windows operating system (Server 2019 Datacenter, 64-bit). We adapted the U-Net and DeepLabv3 networks to build semantic segmentation models. The 1,365 annotated images were randomly divided into training and validation cohorts at a ratio of 4:1. Each model was trained using the RMSprop optimizer, with the batch size and initial learning rate set at 32 and 0.001, respectively. Both semantic segmentation models were trained for 20 epochs. We selected the model with the best performance and used a rectangular segmentation method to segment the nasal cavity and sinus areas on the CT images (Figure 2).
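The text does not spell out the rectangular segmentation step. One plausible reading, offered here only as an assumption, is that each CT image is cropped to the bounding rectangle of the mask predicted by the segmentation model. The sketch below illustrates that reading on plain 2D lists; the function names are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) around the non-zero mask pixels.

    bottom and right are exclusive. Returns None for an empty mask.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop_to_mask(image, mask):
    """Crop a 2D image to the bounding rectangle of its segmentation mask.

    Assumption: an image with an empty mask is returned unchanged.
    """
    box = bounding_box(mask)
    if box is None:
        return image
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]
```

Cropping to a rectangle rather than masking pixel by pixel keeps the anatomical context immediately around the sinuses while discarding most of the skull and background.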
Since different neural networks may perform differently depending on the data distribution, data type, and dataset size, we built models with four common pre-trained classification networks (efficientnet_b0, resnet50, inception_resnet_v2, and Xception) to avoid bias toward any single architecture. These networks were trained using the SGD optimizer with a batch size of 32, and each model was trained for 40 epochs.
2.5 Statistical Analyses
Statistical analyses were performed using SPSS 22.0 statistical software. Normally distributed continuous data are expressed as mean ± standard deviation (x̄ ± s) and were analyzed using an independent-samples t-test. Categorical data are expressed as frequencies and were analyzed using the chi-square test. Statistical significance was set at P < 0.05. Segmentation model performance was evaluated using the Dice similarity coefficient, and classification model performance was evaluated using the ROC curve, accuracy, and confusion matrix. In addition, Grad-CAM heatmaps were generated by extracting feature maps from the final convolutional layers to verify the reliability of the model.
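For reference, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|), and accuracy is the proportion of correct predictions in the confusion matrix. The sketch below computes both on plain Python lists. It is illustrative only: the function names are hypothetical, ECRS is assumed to be the positive class (label 1), and scoring two empty masks as a Dice value of 1.0 is a convention chosen here, not one stated in the text.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary 2D masks."""
    intersection = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            pred_sum += p
            target_sum += t
    if pred_sum + target_sum == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * intersection / (pred_sum + target_sum)


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Proportion of correct predictions: (tp + tn) / total."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)
```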