AffineGlue: Joint Matching and Robust Estimation
Abstract
We propose AffineGlue, a method for joint two-view feature matching and robust estimation that reduces the combinatorial complexity of the problem by employing single-point minimal solvers. AffineGlue selects potential matches from one-to-many correspondences to estimate minimal models. Guided matching is then used to find matches consistent with the model, suffering less from the ambiguities of one-to-one matches. Moreover, we derive a new minimal solver for homography estimation, requiring only a single affine correspondence (AC) and a gravity prior. Furthermore, we train a neural network to reject ACs that are unlikely to lead to a good model. AffineGlue is superior to the SOTA on real-world datasets, even when assuming that the gravity direction points downwards. On PhotoTourism, the AUC@10° score is improved by 6.6 points compared to the SOTA. On ScanNet, AffineGlue makes SuperPoint and SuperGlue achieve similar accuracy as the detector-free LoFTR.
Community
Proposes AffineGlue for local feature matching and pose estimation. The pipeline detects features with affine shapes (SuperPoint + AffNet), performs one-to-many matching for each point in the source image (with SuperGlue), and selects one-to-one affine correspondences (ACs) using a single-point solver: estimate a model from one AC, find the correspondences consistent with that model, score the model through its inliers, then repeat, like a RANSAC loop. Matching and estimation are therefore done simultaneously, and homography estimation needs only a single AC and a gravity prior.
Theoretical background: covers the fundamental matrix, the essential matrix (assuming the camera intrinsics are known), and their relation to local affine transforms (through epipolar lines and local planar assumptions/polynomial approximations). Each AC imposes three constraints on the essential matrix (one direct and two through the normals of the epipolar lines).
Iterative sampling, estimation, and scoring: the top-k matches are retrieved first and stored in order of priority (like PROSAC). A NeFSAC model is trained to predict the probability that an AC leads to an accurate model. The method iterates through the potential matches, selects the pairs with the lowest point-to-model distance in the first image, and computes the score from those correspondences. The residual function can be the Sampson distance or the symmetric epipolar error.
Homography from a single affine correspondence and gravity (vertical direction known in both images): the rotation matrix is decomposed into a rotation about the Y axis (the vertical/canonical axis) and two rotations that bring each camera to the canonical orientation, obtained with the Rodrigues formula. The essential matrix is expressed as a composition of these rotation matrices and the translation/baseline vector. A hidden-variable approach recovers R and t, and a least-squares method recovers the plane normal and the homography. In synthetic tests, 1 AC + gravity gives the best error histograms compared with the 2 AC and 4 point correspondence (PC) solvers.
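The gravity-based decomposition and the Sampson residual mentioned above can be illustrated with a short NumPy sketch. This is not the authors' solver: it only shows, under stated assumptions, how a relative rotation can be composed from two gravity-alignment rotations (built with the Rodrigues formula) and a rotation about the Y axis, and how a correspondence is scored against the resulting essential matrix. The function names (`align_to_y`, `compose_rotation`, `sampson_distance`), the convention that gravity maps to +Y, and the ordering `R = R2^T R_y R1` are assumptions made for illustration.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def align_to_y(gravity):
    """Rotation that maps the (normalized) gravity direction to +Y (Rodrigues formula)."""
    g = np.asarray(gravity, dtype=float)
    g = g / np.linalg.norm(g)
    y = np.array([0.0, 1.0, 0.0])
    axis = np.cross(g, y)
    s = np.linalg.norm(axis)  # sine of the angle between g and y
    c = float(g @ y)          # cosine of the angle between g and y
    if s < 1e-12:
        # Parallel: nothing to do. Anti-parallel: rotate 180 degrees about X.
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    K = skew(axis / s)
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def rot_y(theta):
    """Rotation by theta (radians) about the Y axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def compose_rotation(g1, g2, theta):
    """Relative rotation with one unknown (theta) once both gravity directions are known.

    Assumed convention: R = R2^T @ R_y(theta) @ R1, where R1 and R2 align each
    camera's gravity vector with +Y.
    """
    return align_to_y(g2).T @ rot_y(theta) @ align_to_y(g1)

def essential_from_pose(R, t):
    """Essential matrix E = [t]_x R for the pose X2 = R @ X1 + t."""
    return skew(np.asarray(t, dtype=float)) @ R

def sampson_distance(E, x1, x2):
    """Squared Sampson error of a correspondence in normalized homogeneous coordinates."""
    Ex1 = E @ x1
    Etx2 = E.T @ x2
    num = float(x2 @ Ex1) ** 2
    den = Ex1[0] ** 2 + Ex1[1] ** 2 + Etx2[0] ** 2 + Etx2[1] ** 2
    return num / den
```

With both gravity directions known, the rotation has a single remaining degree of freedom (the angle about the vertical axis), which is what makes a one-correspondence solver possible. In a RANSAC-style loop, `sampson_distance` would be evaluated for each candidate match and compared against an inlier threshold.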
Practical experiments use AffNet to obtain the affine shape and test both multi-stage detector and descriptor pipelines (KeyNet, DoG with HardNet, SOSNet) and single-stage ones (R2D2, DISK, SuperPoint and SuperGlue) on PhotoTourism, ScanNet, and HPatches. Results are better than MAGSAC++ combined with different existing solvers. Ablations on PhotoTourism and ScanNet cover WaSH and MSER for the affine shape, different matchers, etc. The proposed solver is very robust to errors in the gravity direction (ScanNet has high gravity errors). Training on affine-covariant features could be useful. From ETH Zurich (P.-E. Sarlin), CTU Prague (D. Mishkin), Microsoft, and HOVER Inc.
Links: arxiv, PapersWithCode