Detection-segmentation convolutional neural network for autonomous
vehicle perception
Abstract
Object detection and segmentation are two core modules of an autonomous
vehicle perception system. They should be accurate and have low latency
while keeping computational complexity low. Currently, the most
commonly used algorithms are based on deep neural networks, which
achieve high accuracy but require high-performance computing
platforms. Autonomous vehicles, i.e. cars but also drones, must rely on
embedded platforms with limited computing power, which makes it
difficult to meet these requirements. The complexity of the network can
be reduced by
using an appropriate architecture, representation (reduced numerical
precision, quantisation, pruning), and computing platform. In this
paper, we focus on the first factor: the use of so-called
detection-segmentation networks as a component of a perception system.
We considered the task of segmenting the drivable area and road markings
in combination with the detection of selected objects (pedestrians,
traffic lights, and obstacles). We compared the performance of three
different architectures described in the literature: MultiTask V3,
HybridNets, and YOLOP. We conducted the experiments on a custom dataset
consisting of approximately 500 images annotated for drivable area and
lane markings, and 250 images annotated for object detection. Of the
three methods
analysed, MultiTask V3 proved to be the best, achieving 99% mAP_50 for
detection, 97% MIoU for drivable area segmentation, and 91% MIoU for
lane segmentation, while running at 124 fps on an RTX 3060 graphics card.
This architecture is a good solution for embedded perception systems for
autonomous vehicles. The code is available at:
https://github.com/vision-agh/MMAR_2023.
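
For readers unfamiliar with the segmentation metric quoted above, the
following is a minimal sketch of how a mean intersection over union
(MIoU) score is commonly computed for a pair of label masks. It is an
illustration only and is not taken from the linked repository: the
function name mean_iou, the toy masks, and the choice to skip classes
absent from both masks are assumptions, and the paper's exact averaging
protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in either mask.

    pred and target are 2D lists of integer class labels with the
    same shape (hypothetical input format chosen for this sketch).
    """
    ious = []
    for c in range(num_classes):
        intersection = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred = p == c
                in_target = t == c
                if in_pred and in_target:
                    intersection += 1
                if in_pred or in_target:
                    union += 1
        if union > 0:  # assumption: ignore classes absent from both masks
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x4 masks; labels are illustrative: 0 = background, 1 = drivable area.
pred = [[0, 1, 1, 1],
        [0, 0, 1, 1]]
target = [[0, 1, 1, 0],
          [0, 0, 1, 1]]

miou = mean_iou(pred, target, num_classes=2)
print(f"MIoU = {miou:.3f}")
```

In this toy case the background class has an IoU of 3/4 and the
drivable-area class 4/5, so the mean is 0.775. A dataset-level score is
typically obtained by accumulating intersections and unions over all
images before dividing, rather than averaging per-image values.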