CGA-PoseNet: Camera Pose Regression via a 1D-Up Approach to Conformal Geometric Algebra
We introduce CGA-PoseNet, which uses the 1D-Up approach to Conformal Geometric Algebra (CGA) to represent rotations and translations with a single mathematical object, the motor, for camera pose regression. We do so starting from PoseNet, which successfully predicts camera poses from small datasets of RGB frames. State-of-the-art methods, however, require expensive tuning to balance the orientational and translational components of the camera pose.This is usually done through complex, ad-hoc loss function to be minimized, and in some cases also requires 3D points as well as images. Our approach has the advantage of unifying the camera position and orientation through the motor. Consequently, the network searches for a single object which lives in a well-behaved 4D space with a Euclidean signature. This means that we can address the case of image-only datasets and work efficiently with a simple loss function, namely the mean squared error (MSE) between the predicted and ground truth motors. We show that it is possible to achieve high accuracy camera pose regression with a significantly simpler problem formulation. This 1D-Up approach to CGA can be employed to overcome the dichotomy between translational and orientational components in camera pose regression in a compact and elegant way.
READ FULL TEXT