4. Discussion
Recent advances in computer vision have enabled the rapid
development of artificial intelligence techniques for image processing,
automatic recognition, classification, and segmentation. This has led to
more efficient extraction of large amounts of feature information from
medical images. Medical image evaluation is not limited to qualitative
disease diagnosis; rather, it also includes the acquisition and analysis
of rich quantitative information to provide data regarding disease
severity, optimal treatment options, and patient outcomes. In our study,
we introduced semantic segmentation and classification networks to
achieve effective classification of ECRS and NECRS based on preoperative
sinus CT images. In contrast to patients with NECRS, patients with ECRS
require repeated administration of corticosteroid therapy and multiple
revision surgeries to achieve disease control [19]. Specifically,
the therapeutic strategy involves local treatment through high-volume
corticosteroid irrigation in a widely open surgical cavity [20, 21].
Given the high postoperative recurrence rate and drug resistance of
ECRS, biologics targeting the TH2 inflammatory mediators
interleukin (IL)-5, IL-4, and IgE have recently been developed as a
potential therapeutic approach [22, 23]. Increased eosinophil
infiltration in the nasal polyps is an important biomarker for asthma
development after endoscopic sinus surgery [24]. Given the differences in
surgical modalities and therapeutic agents between ECRS and NECRS, as
well as the risk of postoperative asthma and recurrence, accurate
preoperative diagnosis of ECRS is crucial for determining the optimal
treatment plan. Our model showed satisfactory performance and could
provide valuable information for accurate diagnosis and treatment.
We introduced a semantic segmentation model that could automatically
segment the sinus and nasal area from a complex image for classification
model learning. Medical image reading requires systematic anatomical
knowledge; an expert radiologist must master both anatomical and disease
characteristics, the prior knowledge implicit in medical images.
Traditional machine learning classification models, such as the classic
cat-and-dog recognizer, simply label whole images and feed them into the
network for training. In such tasks, pixels representing the target can
appear anywhere in the image; in contrast, disease is usually
distributed within a specific anatomical region. Restricting model
learning to the corresponding region can
eliminate surrounding interference factors and prevent failure resulting
from the lack of prior knowledge. We previously confirmed that a
training method based on anatomical partitioning could effectively
improve model performance and interpretability when the dataset was
reduced [25]. Additionally, we adopted rectangular segmentation as the
segmentation method. With irregular segmentation, the region outside the
mask must be filled with "0" pixels, which appear black. Compared with
irregular segmentation, rectangular segmentation retains the structures
surrounding the sinus, which is more consistent with the real anatomy.
Previous studies on DL application in the medical field have mostly
labeled single images and input them into the network for learning.
However, medical images are unique. Specifically, for a patient with a
certain classification feature, not all image slices contain information
for classification. For example, because tumors are heterogeneous, a
patient's prognosis or risk of metastasis cannot be inferred from every
individual lesion slice. Similarly, the features in some CT image slices might not
show differences between patients with ECRS and NECRS. The
classification results of a single image may have insufficient
predictive utility. Each patient has multiple images, and the
classifications of those images may disagree for the same patient, which
affects the outcome. We therefore compared the traditional method of
labeling individual images with labeling each patient as a unit. Our findings demonstrated
that the dataset composed of each patient as a unit allowed
significantly better model performance than the dataset composed of a
single image as a unit. This confirms the hypothesis that not all slices
show between-disease differences; moreover, it demonstrates the
correctness and reliability of constructing datasets for model learning
based on each patient as a unit.
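Treating the patient as the unit means pooling all of a patient's slice-level outputs into a single decision. A minimal sketch follows; the aggregation rule here (mean probability against a 0.5 threshold) and the function name are our own assumptions for illustration, not necessarily the exact method used in this study.

```python
from collections import defaultdict
from statistics import mean

def patient_level_decision(slice_probs, threshold=0.5):
    """slice_probs: iterable of (patient_id, ecrs_probability) pairs,
    one entry per CT slice. Returns {patient_id: is_ecrs}."""
    by_patient = defaultdict(list)
    for pid, prob in slice_probs:
        by_patient[pid].append(prob)
    # Average over all of a patient's slices so that uninformative
    # slices cannot individually flip the patient-level label.
    return {pid: mean(ps) >= threshold for pid, ps in by_patient.items()}
```

Under this scheme, a few slices that show no between-disease difference merely dilute the patient-level score rather than producing contradictory labels for the same patient.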