4. Discussion
Recent advances in computer vision have enabled the rapid development of artificial intelligence techniques for image processing, automatic recognition, classification, and segmentation, allowing large amounts of feature information to be extracted efficiently from medical images. Medical image evaluation is not limited to qualitative disease diagnosis; it also includes the acquisition and analysis of rich quantitative information regarding disease severity, optimal treatment options, and patient outcomes. In our study, we introduced semantic segmentation and classification networks to achieve effective classification of ECRS and NECRS based on preoperative sinus CT images. In contrast to patients with NECRS, patients with ECRS require repeated courses of corticosteroid therapy and multiple revision surgeries to achieve disease control [19]. Specifically, the therapeutic strategy involves local treatment with high-volume corticosteroid irrigation of a widely opened surgical cavity [20, 21]. Given the high postoperative recurrence rate and drug resistance of ECRS, biologics targeting the TH2 inflammatory mediators interleukin (IL)-5, IL-4, and IgE have recently been developed as a potential therapeutic approach [22, 23]. Increased eosinophil infiltration in nasal polyps is also an important biomarker for asthma development after endoscopic sinus surgery [24]. Given the differences in surgical modalities and therapeutic agents between ECRS and NECRS, as well as the risks of postoperative asthma and recurrence, accurate preoperative diagnosis of ECRS is crucial for determining the optimal treatment plan. Our model showed satisfactory performance and could provide valuable information for accurate diagnosis and treatment.
We introduced a semantic segmentation model that automatically segments the sinus and nasal area from a complex image for classification model learning. Medical image reading requires systematic anatomical knowledge; an excellent radiologist must master anatomical and disease characteristics, including the hidden prior knowledge embedded in medical images. Traditional machine learning classification models, such as the classic cat-and-dog recognition model, label whole images and feed them into the network for training. In such tasks, the pixels representing the target can appear anywhere in the image; by contrast, disease features in medical images are usually confined to the corresponding anatomical region. Restricting model learning to that region can eliminate surrounding interference and prevent failures caused by the lack of prior knowledge. We previously confirmed that a training method based on anatomical partitioning could effectively improve model performance and interpretability when the dataset was reduced [25]. Additionally, we adopted rectangular segmentation rather than irregular segmentation. With irregular segmentation, the regions outside the mask must be filled with "0" pixels, which appear black; rectangular segmentation instead retains the structures around the sinus, which is more consistent with the real anatomy.
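The two cropping strategies can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the array `ct_slice`, the mask coordinates, and all sizes are hypothetical stand-ins for one sinus CT slice and its segmented region.

```python
import numpy as np

# Hypothetical CT slice and segmentation mask (values and sizes are illustrative).
ct_slice = np.random.randint(0, 256, size=(128, 128)).astype(np.float32)
mask = np.zeros_like(ct_slice, dtype=bool)
mask[30:90, 40:110] = True  # assumed sinus/nasal region from the segmentation model

# Irregular segmentation: everything outside the mask is filled with 0,
# which renders as black and discards the surrounding anatomy.
irregular = np.where(mask, ct_slice, 0.0)

# Rectangular segmentation: crop the mask's bounding box, so the tissue
# surrounding the sinus inside the rectangle is preserved.
rows = np.any(mask, axis=1)
cols = np.any(mask, axis=0)
r0, r1 = np.where(rows)[0][[0, -1]]
c0, c1 = np.where(cols)[0][[0, -1]]
rectangular = ct_slice[r0:r1 + 1, c0:c1 + 1]
```

The bounding-box crop keeps real intensity values everywhere inside the rectangle, which is the "more consistent with the real situation" property described above.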
Previous studies applying DL in the medical field have mostly labeled single images and fed them into the network for learning. However, medical images are unique: for a patient with a given classification feature, not all image slices contain information relevant to that classification. For example, because tumors are heterogeneous, a patient's prognosis or risk of metastasis cannot be attributed to every lesion slice. Similarly, some CT image slices may show no differences between patients with ECRS and NECRS, so the classification result of a single image may have insufficient predictive utility. Each patient has multiple images, and classifications may differ across images from the same patient, which affects the outcome. We therefore compared the traditional method of labeling individual images with labeling each patient as a unit. Our findings demonstrated that the dataset composed with each patient as a unit yielded significantly better model performance than the dataset composed with a single image as a unit. This confirms the hypothesis that not all slices show between-disease differences and demonstrates the correctness and reliability of constructing datasets for model learning with each patient as the unit.
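The difference between image-level and patient-level units can be sketched as below. The patient IDs, per-slice predictions, and the majority-vote aggregation are all illustrative assumptions; the paper does not specify its aggregation rule.

```python
from collections import defaultdict

# Hypothetical per-slice predictions, each paired with its patient ID.
slice_records = [
    ("patient_A", 1), ("patient_A", 1), ("patient_A", 0),
    ("patient_B", 0), ("patient_B", 0),
]

# Image-level scheme: every slice is an independent sample, even though
# some slices carry no disease-discriminative information.
image_level = [pred for _, pred in slice_records]

# Patient-level scheme: aggregate all slice predictions for one patient
# into a single decision (a simple majority vote, assumed here).
by_patient = defaultdict(list)
for pid, pred in slice_records:
    by_patient[pid].append(pred)
patient_level = {
    pid: int(sum(preds) >= len(preds) / 2)  # assumed vote threshold
    for pid, preds in by_patient.items()
}
```

Under the patient-level scheme, an uninformative slice (such as patient_A's third slice) is outvoted by the informative ones rather than counted as an independent error, which is consistent with the improvement reported above.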