II. Related work
The artificial neural network (1), introduced in the 1950s, is a network composed of multiple interconnected artificial ‘neurons’ that is intended to mimic the way humans learn and act. In the 1980s, the convolutional neural network (CNN) was designed for computer vision tasks such as handwritten digit recognition. Driven by the rapid development of computing hardware and the growth of the Internet, multi-layer CNNs (2, 3), also known as deep CNNs, have come to the fore in the past decade owing to their extraordinary performance in computer vision contests.
Several works have been presented for automatically distinguishing standard fetal anatomical scan planes in 2-D ultrasound images or videos. In (4), Active Appearance Models are used to identify whether the composite structure of the butterfly-shaped thalami and the falx appears in a scan plane, and a score function is applied to evaluate the correctness of the planes in which the structure is detected. In (5), a hybrid model combining a convolutional neural network (CNN) and a long short-term memory (LSTM) (6) model is proposed to locate fetal standard planes in ultrasound videos. In (7), a CNN is proposed for identifying fetal abdominal standard planes in ultrasound videos. A Fisher vector based model is presented in (8) for the recognition of fetal facial standard planes in ultrasound images. SonoNet (9), a CNN-based model, is presented for the real-time detection of fetal standard planes; its deepest variant has 13 convolutional layers. In (10), a CNN with 16 convolutional layers is proposed to recognize three types of fetal facial standard planes. In (11), information extracted via CNN from both cropped regions of fetal structures and the whole ultrasound image is fused to identify fetal standard planes. In addition, a few works address the localization of fetal standard planes in 3-D ultrasound volumes (12, 13).