CDNet: Cross-Domain Description and Detection for 2D-3D Local Feature Learning
Abstract

By integrating descriptors with keypoint detectors, comprehensive image matching and retrieval becomes possible, with significant implications for many modern applications in computer vision and image processing. Although numerous learning-based methods have been proposed for feature detection and keypoint description in 2D or 3D, matching between 2D images and 3D point clouds has not been thoroughly investigated. In this work, we propose a two-branch fully convolutional network framework that maps 2D images and 3D point clouds into a shared latent space for feature description and feature point detection. Our model uses two parallel branches, one extracting features from 2D images and the other from 3D point clouds, while facilitating information exchange through weight sharing. This design fully exploits the correlations between 2D images and 3D point clouds, enhancing the expressive power of the learned features and yielding more accurate and robust image matching and retrieval.
Additionally, we design a novel loss function that improves descriptor performance and enables more accurate keypoint detection.
Finally, we extensively evaluate our model on the SceneNN and 3DMatch datasets, demonstrating accurate and efficient 2D-3D matching. Our findings have significant implications for applications such as augmented reality, autonomous navigation, and 3D reconstruction.
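
As a rough illustration of the two-branch design, the following PyTorch sketch maps both modalities into a shared descriptor space through a weight-shared projection head. All details here are assumptions for illustration only: the module name CDNetSketch, the layer widths, the PointNet-style per-point MLP for the 3D branch, and the single shared linear head are not taken from the paper, which does not specify its architecture, weight-sharing scheme, detection head, or loss in this abstract.

# Illustrative sketch only; layer sizes, the PointNet-style 3D encoder,
# and the shared linear head are assumptions, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CDNetSketch(nn.Module):  # hypothetical name
    def __init__(self, desc_dim=128):
        super().__init__()
        # 2D branch: small fully convolutional encoder over RGB images.
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
        )
        # 3D branch: per-point MLP over xyz coordinates
        # (1x1 convolutions act as a pointwise MLP).
        self.point_branch = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
        )
        # Projection head with one set of weights applied to both
        # modalities, forcing descriptors into a common latent space.
        self.shared_head = nn.Linear(128, desc_dim)

    def forward(self, image, points):
        # image: (B, 3, H, W) -> per-pixel descriptors (B, H*W, D)
        f2d = self.image_branch(image).flatten(2).transpose(1, 2)
        # points: (B, 3, N) -> per-point descriptors (B, N, D)
        f3d = self.point_branch(points).transpose(1, 2)
        # L2-normalize so 2D and 3D descriptors are directly comparable.
        d2d = F.normalize(self.shared_head(f2d), dim=-1)
        d3d = F.normalize(self.shared_head(f3d), dim=-1)
        return d2d, d3d

For example, calling the sketch with images of shape (2, 3, 64, 64) and point clouds of shape (2, 3, 1024) yields L2-normalized per-pixel and per-point descriptors whose similarity can be scored with simple dot products; in practice such descriptors would be trained with a cross-modal metric-learning objective, standing in here for the paper's unspecified loss.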