Learning a model of facial shape and expression from 4D scans
The field of 3D face modeling has a large gap between high-end and low-end methods. At the high end, the best facial animation is indistinguishable from real humans, but this comes at the cost of extensive manual labor. At the low end, face capture from consumer depth sensors relies on 3D face models that are not expressive enough to capture the variability in natural facial shape and expression. We seek a middle ground by learning a facial model from thousands of accurately aligned 3D scans. Our FLAME model (Faces Learned with an Articulated Model and Expressions) is designed to work with existing graphics software and be easy to fit to data. FLAME uses a linear shape space trained from 3800 scans of human heads. FLAME combines this linear shape space with an articulated jaw, neck, and eyeballs, pose-dependent corrective blendshapes, and additional global expression blendshapes. The pose- and expression-dependent articulations are learned from 4D face sequences in the D3DFACS dataset along with additional 4D sequences. We accurately register a template mesh to the scan sequences and make the D3DFACS registrations available for research purposes. In total the model is trained from over 33,000 scans. FLAME is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model. We compare FLAME to these models by fitting them to static 3D scans and 4D sequences using the same optimization method. FLAME is significantly more accurate and is available for research purposes (http://flame.is.tue.mpg.de).
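As a sketch of how the components described above might compose, the abstract's description (a linear shape space, articulated jaw, neck, and eyeballs, pose-dependent corrective blendshapes, and global expression blendshapes) suggests an SMPL-style construction: additive offsets on a template mesh, followed by linear blend skinning. The notation below (template \(\bar{\mathbf{T}}\); shape, pose, and expression parameters \(\vec{\beta}, \vec{\theta}, \vec{\psi}\); blendshape functions \(B_S, B_P, B_E\); joint regressor \(\mathbf{J}\); skinning weights \(\mathcal{W}\)) is an assumption drawn from that description, not quoted from the paper:

\[
  M(\vec{\beta}, \vec{\theta}, \vec{\psi})
    = W\!\big(T_P(\vec{\beta}, \vec{\theta}, \vec{\psi}),\,
              \mathbf{J}(\vec{\beta}),\, \vec{\theta},\, \mathcal{W}\big),
\]
\[
  T_P(\vec{\beta}, \vec{\theta}, \vec{\psi})
    = \bar{\mathbf{T}}
    + B_S(\vec{\beta}; \mathcal{S})
    + B_P(\vec{\theta}; \mathcal{P})
    + B_E(\vec{\psi}; \mathcal{E}),
\]

where \(B_S\) is the linear identity shape space, \(B_P\) the pose-dependent corrective blendshapes, \(B_E\) the global expression blendshapes, and \(W\) linear blend skinning that poses the jaw, neck, and eyeball joints. Keeping the shape, pose, and expression terms additive in the rest pose is what would make such a model compatible with standard blendshape pipelines in existing graphics software, as the abstract emphasizes.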