Traditional hand-crafted features for representing local image patches are giving way to data-driven, learning-based image features, but learning a robust and discriminative descriptor capable of handling a variety of patch-level computer vision tasks remains an open problem. In this work, we propose a novel deep convolutional neural network (CNN) for learning local feature descriptors. We train on quadruplets containing positive and negative samples, together with a constraint that restricts the intra-class variance, to learn discriminative CNN representations. Compared with previous works, our model reduces the overlap in feature space between corresponding and non-corresponding patch pairs, and mitigates the margin-varying problem caused by the commonly used triplet loss. We demonstrate that our method achieves better embeddings than recent approaches such as PN-Net and TN-TG on a benchmark dataset.
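To make the quadruplet idea concrete, the following is a minimal sketch of a quadruplet-style loss with an intra-class variance penalty. This is an illustrative formulation under stated assumptions, not the paper's exact loss: the margins `margin1` and `margin2`, the weight `lam`, and the use of Euclidean distance are all assumptions introduced here for clarity.

```python
import numpy as np

def quadruplet_loss(anchor, positive, neg1, neg2,
                    margin1=1.0, margin2=0.5, lam=0.1):
    """Illustrative quadruplet-style loss (assumed formulation, not
    necessarily the paper's): a ranking term pulls the positive closer
    to the anchor than the first negative, a second term pushes the
    positive pair closer than an unrelated negative pair, and a small
    penalty on the positive distance restricts intra-class variance."""
    d_ap = np.linalg.norm(anchor - positive)   # intra-class distance
    d_an = np.linalg.norm(anchor - neg1)       # anchor-negative distance
    d_nn = np.linalg.norm(neg1 - neg2)         # negative-negative distance
    ranking = max(0.0, d_ap - d_an + margin1)  # triplet-style ranking term
    push = max(0.0, d_ap - d_nn + margin2)     # extra quadruplet term
    intra = lam * d_ap ** 2                    # intra-class variance penalty
    return ranking + push + intra
```

With well-separated classes (identical anchor/positive, distant negatives) the loss is zero; when the positive pair is farther apart than the negatives, all three terms activate and the loss is positive, which is the overlap the abstract describes reducing.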
Just Accepted