Dense Match Summarization for Faster Two-view Estimation

Lund University
CVPR 2025
Figure: overview of the paper's main idea.

Top: Relative pose estimation from dense matches gives small pose error, but is slow.
Bottom: Estimation from summarized matches is significantly faster, with similar pose error.

Abstract

In this paper, we speed up robust two-view relative pose estimation from dense correspondences. Previous work has shown that dense matchers can significantly improve both the accuracy and the robustness of the resulting pose. However, the large number of matches significantly increases the runtime of robust estimation in RANSAC. To avoid this, we propose an efficient match summarization scheme which provides comparable accuracy to using the full set of dense matches, while having 10-100x faster runtime. We validate our approach on standard benchmark datasets together with multiple state-of-the-art dense matchers.

Method

As input, our method takes a large number of dense matches, typically \(N = 10\ 000\). First, we cluster the matches into \(K \ll N\) clusters.
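The clustering step can be illustrated with a minimal sketch. The page does not say which clustering algorithm is used, so the plain k-means on the stacked 4-D match coordinates below is an assumption made purely for illustration, and the function name `cluster_matches` and its parameters are hypothetical.

```python
import numpy as np

def cluster_matches(x, x_bar, k, iters=10, seed=0):
    """Group N matches into k clusters (illustrative k-means, not the paper's method).

    x, x_bar: (N, 2) arrays of corresponding points in the two images.
    Returns an (N,) integer array of cluster labels in [0, k).
    """
    rng = np.random.default_rng(seed)
    pts = np.hstack([x, x_bar])  # each match becomes one 4-D point
    # Initialise the centers with k distinct matches chosen at random.
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every match to every center, shape (N, k).
        d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = pts[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels
```

Each label then selects the subset of matches that is summarized independently in the per-cluster steps.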

Then, for each cluster, we perform the following steps:

  1. Approximate the Sampson error using the cluster centroid \((\boldsymbol{c}, \boldsymbol{\bar{c}})\):
     \[ f_{\text{cluster}}(\mathbf{E}; \mathcal{C}) = \sum_{(\boldsymbol{x}, \boldsymbol{\bar{x}}) \in \mathcal{C}} \frac{(\boldsymbol{\bar{x}}^T \mathbf{E} \boldsymbol{x})^2}{\|\mathbf{E}_{12}\boldsymbol{x}\|^2 + \|(\mathbf{E}^T)_{12}\boldsymbol{\bar{x}}\|^2} \approx \frac{1}{\|\mathbf{E}_{12}{\boldsymbol{c}}\|^2 + \|(\mathbf{E}^T)_{12}{\boldsymbol{\bar{c}}}\|^2} \sum_{(\boldsymbol{x}, \boldsymbol{\bar{x}}) \in \mathcal{C}} (\boldsymbol{\bar{x}}^T \mathbf{E} \boldsymbol{x})^2 \]
  2. Rewrite the sum as a matrix product, where \(n\) is the number of matches in the cluster:
     \[ \sum_{(\boldsymbol{x}, \boldsymbol{\bar{x}}) \in \mathcal{C}} (\boldsymbol{\bar{x}}^T \mathbf{E} \boldsymbol{x})^2 = \left\| \begin{pmatrix} \boldsymbol{\bar{x}}_1^T \mathbf{E} \boldsymbol{x}_1 \\ \vdots \\ \boldsymbol{\bar{x}}_n^T \mathbf{E} \boldsymbol{x}_n \end{pmatrix} \right\|^2 = \left\| \begin{pmatrix} (\boldsymbol{x}_1 \otimes \boldsymbol{\bar{x}}_1)^T \\ \vdots \\ (\boldsymbol{x}_n \otimes \boldsymbol{\bar{x}}_n)^T \end{pmatrix} \mathrm{vec}(\mathbf{E}) \right\|^2 \]
  3. Use Cholesky factorization to replace the \(n \times 9\) matrix \(\mathbf{A}\) from the previous step with a \( 9 \times 9\) matrix \(\mathbf{M}\) satisfying \(\mathbf{M}^T \mathbf{M} = \mathbf{A}^T \mathbf{A}\). With \(\boldsymbol{e} = \mathrm{vec}(\mathbf{E})\):
     \[ \left\| \mathbf{A} \boldsymbol{e} \right\|^2 = \boldsymbol{e} ^T \mathbf{A} ^T \mathbf{A} \boldsymbol{e} = \boldsymbol{e} ^T \mathbf{M} ^T \mathbf{M} \boldsymbol{e} = \left\| \mathbf{M} \boldsymbol{e} \right\|^2 \]
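
The per-cluster steps above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: points are homogeneous 3-vectors, \(\mathbf{E}_{12}\) is read as the first two rows of \(\mathbf{E}\), \(\mathrm{vec}\) stacks columns, and all function names are hypothetical. Plain Cholesky also requires \(\mathbf{A}^T \mathbf{A}\) to be positive definite, which can fail for very small or degenerate clusters.

```python
import numpy as np

def summarize_cluster(x, x_bar):
    """Summarize one cluster of matches.

    x, x_bar: (n, 3) arrays of homogeneous points (assumed convention).
    Returns the centroids c, c_bar and a 9x9 matrix M with M^T M = A^T A.
    """
    c, c_bar = x.mean(axis=0), x_bar.mean(axis=0)
    # Row i of A is (x_i kron x_bar_i)^T, so A @ vec(E) stacks x_bar_i^T E x_i.
    A = np.einsum("ni,nj->nij", x, x_bar).reshape(len(x), 9)
    # np.linalg.cholesky returns lower-triangular L with L L^T = A^T A.
    # It raises LinAlgError if A^T A is not positive definite.
    L = np.linalg.cholesky(A.T @ A)
    return c, c_bar, L.T

def approx_cluster_error(E, c, c_bar, M):
    """Approximate cluster Sampson error using only the summary (c, c_bar, M)."""
    e = E.flatten(order="F")  # column-stacking vec(E)
    denom = np.sum((E @ c)[:2] ** 2) + np.sum((E.T @ c_bar)[:2] ** 2)
    return np.sum((M @ e) ** 2) / denom

def exact_cluster_error(E, x, x_bar):
    """Exact sum of Sampson errors over all matches, for comparison."""
    num = np.einsum("ni,ij,nj->n", x_bar, E, x) ** 2
    den = (np.sum((x @ E.T)[:, :2] ** 2, axis=1)
           + np.sum((x_bar @ E)[:, :2] ** 2, axis=1))
    return np.sum(num / den)
```

After summarization, scoring a candidate \(\mathbf{E}\) against a cluster costs one \(9 \times 9\) matrix-vector product and one centroid evaluation, independent of the number of matches in the cluster.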

BibTeX

@inproceedings{astermark2025dense,
  author    = {Astermark, Jonathan and
               Heyden, Anders and
               Larsson, Viktor},
  title     = {Dense Match Summarization for Faster Two-view Estimation},
  booktitle = {Computer Vision and Pattern Recognition (CVPR)},
  year      = {2025}
}