Classical and Computational Algebraic Geometry in Computer Vision

Abstract

A camera is a linear projective map $\PP^3\to\PP^2$ which can be represented by a full-rank matrix $A\in\PP(\RR^{3\times 4})$. Within the context of computer vision, we refer to points $q\in\PP^3$ as world points and points $p\in\PP^2$ as image points. A multi-view arrangement is a collection of cameras $A_i$, world points $q_j$, and image points $p_{ij}$ satisfying $A_iq_j=p_{ij}$. Within computer vision we are often concerned with the problem of reconstructability. That is, given partial information for a multi-view arrangement, can the rest of the arrangement be reconstructed and, if so, is that reconstruction unique? The $7$-point algorithm is a classical $3$-D image reconstruction algorithm for two-view geometry, using $7$ paired points $(x_i,y_i)\in\PP^2\times\PP^2$ in two images to reconstruct the original cameras which produced them. This is done by constructing the $7\times9$ matrix $Z$ with rows $(x_i^\top\otimes y_i^\top)$ and producing the rank $2$ matrices in its nullspace. Each choice of rank $2$ matrix determines a possible arrangement of cameras, unique up to change of coordinates in $\PP^3$. Generically, there will be exactly three rank $2$ matrices in this nullspace, but this will not always be the case. In particular, this algorithm will be ill-posed if the $7$ tensors $(x_i^\top\otimes y_i^\top)\in\PP^8$ are linearly dependent. We fully characterize the geometric conditions on $\{(x_i,y_i)\}_{i=1}^k$ under which $k$ tensors $(x_i^\top\otimes y_i^\top)\in\PP^8$ will be linearly dependent for $2\leq k\leq 9$. For low values of $k$ we use computational software to analyze the conditions under which linear dependence occurs. For $k=6$, the answer is in terms of the geometry of cubic surfaces and the blowing up of $\PP^2$ in $6$ points.
For $k=7$ and $8$ the answer is in terms of Cremona transformations and cubic curves; this utilizes a special correspondence we discover between possible $3$-D reconstructions of our images, lines normal to the span of the tensors $\{(x_i^\top\otimes y_i^\top)\}_{i=1}^k$, and Cremona transformations sending $x_i\mapsto y_i$ for all $i$. For all values of $k$ barring $k=6$, the geometry can be characterized as the existence of some special morphism sending $x_i\mapsto y_i$ for all $i$. For $k=6$ no morphism exists, but the two point sets $\{x_i\}_{i=1}^6$ and $\{y_i\}_{i=1}^6$ can be seen as duals in a highly geometric sense via the blowups. This thesis also considers the related problem of resectioning: that of reconstructing the original cameras $A_i:\PP^3\to\PP^2$ given the world points $q_j\in\PP^3$ and the image points $p_{ij}\in\PP^2$. This problem is in some sense dual to the better-studied problem of triangulation: reconstructing the world points $q_j\in\PP^3$ given the cameras $A_i:\PP^3\to\PP^2$ and the image points $p_{ij}$. We make use of Carlsson-Weinshall duality, a framework for interchanging the roles of cameras $A_i$ and world points $q_j$, to adapt the methodology for studying triangulation in \cite{agarwal2022atlas} and prove similar results for the dual problem of resectioning. In particular, we find a universal Gr\"obner basis for the vanishing ideal of the associated resectioning variety. We also use Carlsson-Weinshall duality to produce a coordinate-free view of the atlas for the pinhole camera in \cite{agarwal2022atlas}. This atlas is reduced in the formal sense of taking quotients, and we find that the reduced resectioning and reduced triangulation varieties are isomorphic.
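
The $7$-point construction described in the abstract can be illustrated with a minimal numerical sketch. It is not taken from the thesis. It assumes NumPy, row-major vectorization of a $3\times3$ matrix $F$ (so that $(x_i^\top\otimes y_i^\top)\,\mathrm{vec}(F)=x_i^\top F y_i$), and real floating-point arithmetic. The function name `seven_point_candidates` and the tolerance `tol` are hypothetical choices made for this illustration. The sketch builds $Z$, takes a basis $F_1,F_2$ of its nullspace, and finds the rank $2$ matrices on the line through $F_1$ and $F_2$ as the roots of the cubic $\det(tF_1+(1-t)F_2)=0$. Only real roots are returned, so it may report one solution where the complex count is three.

```python
import numpy as np

def seven_point_candidates(xs, ys, tol=1e-9):
    """Return the real rank-2 matrices F (Frobenius norm 1) with x_i^T F y_i = 0.

    xs, ys: arrays of shape (7, 3) holding homogeneous image points.
    Raises ValueError when the 7 tensors x_i (x) y_i are numerically
    linearly dependent, i.e. the ill-posed case named in the abstract.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    if xs.shape != (7, 3) or ys.shape != (7, 3):
        raise ValueError("expected seven homogeneous point pairs")

    # Row i of Z is x_i^T (x) y_i^T, so Z @ F.ravel() stacks x_i^T F y_i.
    Z = np.stack([np.kron(x, y) for x, y in zip(xs, ys)])  # shape (7, 9)

    _, s, vt = np.linalg.svd(Z)  # vt is 9x9; s holds 7 singular values
    if s[-1] <= tol * s[0]:
        raise ValueError("tensors are linearly dependent; problem is ill-posed")

    # The last two right singular vectors span the 2-dimensional nullspace.
    F1 = vt[7].reshape(3, 3)
    F2 = vt[8].reshape(3, 3)

    # det(t*F1 + (1-t)*F2) is a cubic in t; recover it by interpolation.
    # This affine chart misses the single direction F1 - F2, which is
    # acceptable for generic data but is an assumption of this sketch.
    ts = np.array([-1.0, 0.0, 1.0, 2.0])
    dets = [np.linalg.det(t * F1 + (1.0 - t) * F2) for t in ts]
    coeffs = np.polyfit(ts, dets, 3)

    candidates = []
    for r in np.roots(coeffs):
        if abs(r.imag) > 1e-9 * max(1.0, abs(r)):
            continue  # skip complex roots; only real solutions are kept
        F = r.real * F1 + (1.0 - r.real) * F2
        candidates.append(F / np.linalg.norm(F))
    return candidates
```

For example, taking seven random world points and two random $3\times4$ cameras, and passing the resulting image points as `xs` and `ys`, should return one or three matrices, each singular and each annihilated by every row of $Z$. Each such matrix is only a candidate; recovering a pair of cameras from it is a separate step not shown here.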

Description

Thesis (Ph.D.)--University of Washington, 2024
