Methods
Data resource
The data were collected from the First Hospital of Jilin University in
China from January 2018 to June 2019. Patient identifiers were removed,
and all data were anonymized. Every pregnant woman in this study gave
written informed consent. All examinations were performed and
interpreted by a team of well-trained doctors from the Center for
Prenatal Diagnosis of the First Hospital of Jilin University. GE Voluson
E8 ultrasound scanners were used for data acquisition. The study
protocol was approved by the Ethics Committee of the First Hospital of
Jilin University (Changchun, China; permit No. 2018-429).
Deep learning algorithms for training
Picking out brain images
The first step of our scheme is to pick out brain images from all stored
freeze-frame images (Figure 1). This can be done using a classification
deep learning model. Several well-known models have been proposed and
have shown good results in image classification tasks, such as the
Oxford VGG model [41], the Google Inception model [42] and the Microsoft
ResNet model [43]. Here we chose to apply transfer learning using
ResNet50, a 50-layer Residual Network, which has shown good results for
medical image classification [44].
Specifically, each image was first resized to 224×224 and then fed into
a ResNet50 backbone initialized with pretrained ImageNet weights.
GlobalAveragePooling2D was applied to the output of the last
convolutional layer, followed by a fully connected dense layer with
sigmoid activation. All layers were set as trainable, meaning they could
be updated by backpropagation at each step.
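The classifier described above can be sketched in Keras as follows. This is a minimal sketch, not the authors' original code; the function name build_brain_classifier and the choice of the Adam optimizer and binary cross-entropy loss are our assumptions (the source specifies only the architecture).

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model

def build_brain_classifier(weights="imagenet"):
    # ResNet50 backbone on 224x224 RGB inputs, pretrained on ImageNet
    base = ResNet50(weights=weights, include_top=False,
                    input_shape=(224, 224, 3))
    # Global average pooling over the last convolutional feature maps
    x = GlobalAveragePooling2D()(base.output)
    # Single sigmoid unit: probability that the image shows a fetal brain
    out = Dense(1, activation="sigmoid")(x)
    model = Model(inputs=base.input, outputs=out)
    # All layers stay trainable, so backpropagation updates every weight
    # (optimizer and loss are assumptions, not stated in the source)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

In practice the model would be fit on the labeled freeze-frame images and thresholded at 0.5 to keep brain images.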
The experiment was carried out with Jupyter Notebook, in an environment
of Keras, using TensorFlow as the backend. A workstation with four NVIDIA
GeForce GTX 1080 Ti graphics cards, two Intel Xeon E5-2620 v4 CPUs and a
Random Access Memory (RAM) of 64GB was used in this experiment. The
labels were determined manually by one trained expert.
Picking out TV and TT planes and localization of brain region
After picking out all fetal brain images, we need to further identify
the US images in the transventricular (TV) or transthalamic (TT) plane,
in which the lateral ventricle can be measured. Moreover, we need to
localize the brain region and remove the background around the fetal
skull, which would otherwise strongly affect the results. Here we used
Faster R-CNN [45], a state-of-the-art object detection algorithm,
which combines the localization and classification tasks.
Specifically, we used the fasterrcnn_resnet50_fpn model in torchvision
to perform the experiments, which uses resnet50 as the backbone. The
network parameters were initialized from a model pretrained on the COCO
dataset. To make the algorithm more robust, we augmented the dataset by
randomly cropping and flipping images and rotating them by 90, 180 or
270 degrees, to simulate various fetal positions. The network outputs
zero, one or more detected objects with corresponding confidence scores.
The object with the highest score was selected as the result.
The experiment was carried out with Jupyter Notebook, in a PyTorch
environment. The system used in this experiment was the same as in the
first step. The bounding boxes of brains were manually labeled by one trained
expert and reviewed by doctors. The US images in the TV and TT plane
were selected by doctors.
Predicting the lateral ventricular width
A regression model was applied to this task. Specifically, each brain
region image was first resized to 224×224 and then fed into a ResNet50
backbone initialized with pretrained ImageNet weights.
GlobalAveragePooling2D was applied to the output of the last
convolutional layer, followed by a fully connected dense layer with
linear activation. All layers were set as trainable, meaning they could
be updated by backpropagation at each step. mean_squared_error was
specified as the loss function when compiling the model. The experiment
settings and the system used were the same as in the first step. The
ground-truth lateral ventricular width of each image was determined
manually by doctors.
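The regression network differs from the classifier only in its output head and loss. A minimal Keras sketch (the function name build_width_regressor and the Adam optimizer are our assumptions; the mean_squared_error loss is as stated above):

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model

def build_width_regressor(weights="imagenet"):
    # Same backbone as the classification step: ResNet50 on 224x224 inputs
    base = ResNet50(weights=weights, include_top=False,
                    input_shape=(224, 224, 3))
    x = GlobalAveragePooling2D()(base.output)
    # Linear activation: a single continuous output, the predicted
    # lateral ventricular width
    out = Dense(1, activation="linear")(x)
    model = Model(inputs=base.input, outputs=out)
    # mean_squared_error loss, as specified when compiling the model;
    # the optimizer choice is an assumption
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```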
Interpretation of the results using heat maps
To provide evidence that our regression model predicted the lateral
ventricular width based on the anatomical structure of the lateral
ventricle, we implemented heat maps for visualization and
interpretation. Here we used a technique called Class Activation Mapping
(CAM) [46] to generate the heat maps. After superimposing the heat
map onto the grayscale image, we can see the key areas (the red regions)
where the algorithm was activated most strongly.
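Because the network ends in global average pooling followed by a single dense unit, the CAM is simply the dense weights applied channel-wise to the last convolutional feature maps. A minimal NumPy sketch (the function name and the convention of clipping negative contributions before normalizing are our assumptions; in practice the map is also upsampled to the input image size before overlay):

```python
import numpy as np

def class_activation_map(feature_maps, dense_weights):
    """Compute a CAM from the last conv features and the dense weights.

    feature_maps: (H, W, K) activations of the last convolutional layer.
    dense_weights: (K,) weights from the pooled features to the output.
    Returns an (H, W) map scaled to [0, 1]; high values mark the regions
    that contributed most to the prediction.
    """
    # Weighted sum over the channel axis: sum_k w_k * A_k
    cam = np.tensordot(feature_maps, dense_weights, axes=([2], [0]))
    cam = np.maximum(cam, 0.0)   # keep positive evidence only (assumption)
    if cam.max() > 0:
        cam = cam / cam.max()    # normalize to [0, 1] for display
    return cam
```

The normalized map is then color-mapped (e.g. red for values near 1) and blended with the grayscale US image.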