Department of Computer and Information Science
University of Pennsylvania
Geometry-aware deep learning: Equivariant representations for 3D recognition, pose, and motion
Traditional convolutional networks have shown unprecedented success in supervised classification, yet they remain vulnerable to 3D geometric transformations of their inputs. While data augmentation might alleviate the intrinsic lack of invariance, it leads to networks of increased model complexity. In this talk, we show that equivariance to transformations can be achieved by performing group convolution, either by using canonical coordinates or by working in the spectral domain. Experiments validate our claim of lower complexity without sacrificing performance. When we want to infer 3D pose from 2D images, annotation is hardly possible, and we have to rely on minimal supervision or geometric constraints. We show that building equivariant embeddings reduces 3D pose problems to a simple correlation operation, thus avoiding supervised regression or spatial transformers. We conclude by showing that classical Structure from Motion problems can be solved with self-supervised hybrid pipelines in which optical flow is learnt but classical geometry is used for the 3D estimation.
A pizza lunch will be served.