SFGAN: Unsupervised Generative Adversarial Learning of 3D Scene Flow from the 3D Scene Itself
Abstract
Scene flow tracks the three-dimensional (3D) motion of each point between adjacent point cloud frames. It provides fundamental 3D motion
perception for autonomous driving and service robots. Although Red Green Blue Depth (RGB-D) cameras and Light Detection and
Ranging (LiDAR) sensors capture discrete 3D points in space, objects and their motions are usually continuous in the macroscopic world. That is,
objects remain consistent as they move from the current frame to the next. Based on this insight, a Generative
Adversarial Network (GAN) is utilized to learn 3D scene flow in a self-supervised manner, with no need for ground truth. A fake point cloud of the second
frame is synthesized from the predicted scene flow and the point cloud of the first frame. The generator
and discriminator are trained adversarially: the generator learns to synthesize a fake point cloud that is indistinguishable from the real one, while the
discriminator learns to distinguish the real point cloud of the second frame from the synthesized one. Experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI)
scene flow dataset show that our method achieves promising results without ground truth. Like a human observer, the proposed method can
identify similar local structures in two adjacent frames even without knowing the ground truth scene flow; local correspondences can then be estimated correctly, and in turn the scene flow.
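As a concrete illustration of the adversarial scheme described above, the following is a minimal PyTorch sketch. The `FlowGenerator` and `PointDiscriminator` networks, the optimizer settings, and the toy data are illustrative assumptions, not the paper's actual architectures or hyperparameters: the generator predicts per-point flow, the first frame is warped by that flow to synthesize a fake second frame, and the discriminator is trained to tell the real second frame from the synthesized one.

```python
import torch
import torch.nn as nn

class FlowGenerator(nn.Module):
    """Placeholder generator: predicts per-point 3D flow for frame 1.

    A real implementation would use a point cloud backbone that mixes
    features of both frames (e.g. a FlowNet3D-style network); here each
    point of frame 1 is processed independently for brevity.
    """
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, pc1, pc2):
        # pc1, pc2: (B, N, 3); returns per-point flow (B, N, 3)
        return self.mlp(pc1)

class PointDiscriminator(nn.Module):
    """Placeholder discriminator: scores how 'real' a point cloud looks."""
    def __init__(self):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        self.head = nn.Linear(128, 1)

    def forward(self, pc):
        # pc: (B, N, 3) -> per-point features -> max-pooled global
        # feature -> realness logit of shape (B, 1)
        feat = self.point_mlp(pc).max(dim=1).values
        return self.head(feat)

def train_step(gen, disc, opt_g, opt_d, pc1, pc2):
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(pc1.size(0), 1)
    fake = torch.zeros(pc1.size(0), 1)

    # Discriminator step: real frame-2 cloud vs. synthesized fake cloud.
    with torch.no_grad():
        fake_pc2 = pc1 + gen(pc1, pc2)   # warp frame 1 by predicted flow
    d_loss = bce(disc(pc2), real) + bce(disc(fake_pc2), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: make the warped cloud indistinguishable from real.
    fake_pc2 = pc1 + gen(pc1, pc2)
    g_loss = bce(disc(fake_pc2), real)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Toy usage with random point clouds (B=4 scenes, N=1024 points each).
gen, disc = FlowGenerator(), PointDiscriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
pc1 = torch.randn(4, 1024, 3)
pc2 = pc1 + 0.05 * torch.randn_like(pc1)  # frame 2 = frame 1 + small motion
print(train_step(gen, disc, opt_g, opt_d, pc1, pc2))
```

Note that no ground truth flow appears anywhere in the loop: the only supervision signal is the discriminator's judgment of whether the warped first frame resembles a plausible second frame, which is the self-supervised principle the abstract describes.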