II. Related work
The artificial neural network (1), first introduced in the 1950s, is a
network composed of multiple interconnected artificial ‘neurons’ that is
intended to mimic the way humans learn and act. In the 1980s, the
convolutional neural network (CNN) was designed for computer vision
tasks such as handwritten digit recognition. Driven by rapid advances in
computing hardware and the growth of the Internet, multi-layer CNNs
(2, 3), also known as deep CNNs, have come to the fore over the past
decade owing to their strong performance in computer vision contests.
Several methods have been proposed to automatically identify standard
fetal anatomical scan planes in 2-D ultrasound images or videos.
In (4), Active Appearance Models are used to determine whether the
composite structure of the butterfly-shaped thalami and the falx appears
in a scan plane, and a score function is applied to evaluate the
correctness of planes in which the structure is detected. The authors of
(5) proposed a hybrid model that combines a convolutional neural network
(CNN) with a long short-term memory (LSTM) model (6) to locate fetal
standard planes in ultrasound videos. In (7), a CNN is proposed for
identifying fetal abdominal standard planes in ultrasound videos. A
Fisher vector-based model is presented in (8) for recognizing fetal
facial standard planes in ultrasound images. SonoNet (9), a CNN-based
model, is presented for the real-time detection of fetal standard
planes; its deepest variant has 13 convolutional layers. In (10), a CNN
with 16 convolutional layers is proposed to recognize three types of
fetal facial standard planes. In (11), information extracted by a CNN
from both cropped regions of fetal structures and the whole ultrasound
image is fused to identify fetal standard planes. In addition, a few
works have addressed the localization of fetal standard planes in 3-D
ultrasound volumes (12, 13).