Joint Gaussian Deformation in Triangle-Deformed Space for High-Fidelity Head Avatars

Jiawei Lu1, Kunxin Guang2, Conghui Hao1, Kai Sun3, Jian Yang1, Jin Xie2, Beibei Wang*2
1Nankai University 2Nanjing University 3China Mobile Zijin Innovation Institute
*Corresponding Author

Eurographics Symposium on Rendering (2025)

We propose Joint Gaussian Deformation in Triangle-Deformed Space, which decouples the complex deformation of Gaussians into two deformations that are much simpler to represent and learn: a learnable displacement map-guided Gaussian-triangle binding and a neural-based deformation refinement. Together, they enable high-fidelity animation and high-frequency details in reconstructed head avatars.

Abstract

Creating 3D human heads with mesoscale details and high-fidelity animation from monocular or sparse multi-view videos is challenging. While 3D Gaussian splatting (3DGS) has brought significant benefits to this task thanks to its powerful representation ability and rendering speed, existing works still face several issues, including inaccurate and blurry deformation and a lack of detailed appearance, caused by the difficulty of representing complex deformations and by unreasonable Gaussian placement. In this paper, we propose a joint Gaussian deformation method that decouples the complex deformation into two simpler deformations, incorporating a learnable displacement map-guided Gaussian-triangle binding and a neural-based deformation refinement, which improves the fidelity of animation and the details of reconstructed head avatars. However, renderings of reconstructed head avatars at unseen views still show artifacts due to overfitting on sparse input views. To address this issue, we leverage synthesized pseudo views rendered with fitted textured 3DMMs as priors to initialize Gaussians, which helps maintain a consistent and realistic appearance across various views. As a result, our method outperforms existing state-of-the-art approaches by about 4.3 dB PSNR in novel-view synthesis and about 0.9 dB PSNR in self-reenactment on multi-view video datasets. Our method also preserves high-frequency details, exhibits more accurate deformations, and significantly reduces artifacts in unseen views.
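
To make the pseudo-view prior concrete, below is a minimal sketch, not the authors' code. It assumes the pseudo views have already been rendered from the fitted textured 3DMM as (image, camera) pairs, and render_gaussians stands in for any differentiable 3DGS renderer; both names are hypothetical placeholders.

import torch

def fit_to_pseudo_views(gaussians, params, render_gaussians, pseudo_views,
                        n_steps=500, lr=1e-3):
    """Fit initial Gaussians to pseudo views rendered from a textured 3DMM.

    pseudo_views: list of (image, camera) pairs pre-rendered from the
    fitted 3DMM. render_gaussians: a caller-supplied differentiable 3DGS
    renderer. Both are placeholders, not the paper's API.
    """
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(n_steps):
        idx = torch.randint(len(pseudo_views), (1,)).item()
        image, cam = pseudo_views[idx]
        pred = render_gaussians(gaussians, cam)  # differentiable rendering
        loss = (pred - image).abs().mean()       # L1 photometric loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return gaussians

Because the pseudo views surround the head, supervision of this kind discourages the initial Gaussians from overfitting to the sparse training viewpoints.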

Pipeline


The key to our method is a joint Gaussian deformation, which represents the 3D head deformation of Gaussians with two components: an explicit one and an implicit one. Specifically, in the explicit deformation component, Gaussians parameterized in the triangle space are bound to triangles under the guidance of displacement maps, with their attributes initialized by a synthesized pseudo-view-based Gaussian prior module. The Gaussians are then mapped to the deformed world space via a triangle-Gaussian transformation. In the implicit deformation component, the Gaussian positions, together with a spatial semantic feature encoded by a learnable triplane and the expression code, are fed into a refinement network that predicts offsets, yielding the final refined deformed Gaussians.
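
To make the two components concrete, here is a minimal, self-contained sketch. It assumes a GaussianAvatars-style triangle frame (centroid as origin, a tangent/normal basis as rotation, an edge length as scale) for the explicit binding and a plain MLP for the implicit refinement; the paper's exact parameterization, feature dimensions, and network architecture may differ, and all names below are ours.

import torch
import torch.nn.functional as F

def triangle_frames(tri_verts):
    """tri_verts: (T, 3, 3), the three vertices of each deformed triangle.
    Returns a per-triangle origin (centroid), rotation, and scalar scale."""
    v0, v1, v2 = tri_verts[:, 0], tri_verts[:, 1], tri_verts[:, 2]
    origin = tri_verts.mean(dim=1)                       # (T, 3) centroid
    e1 = F.normalize(v1 - v0, dim=-1)                    # first tangent
    n = F.normalize(torch.cross(v1 - v0, v2 - v0, dim=-1), dim=-1)
    e2 = torch.cross(n, e1, dim=-1)                      # second tangent
    rot = torch.stack([e1, e2, n], dim=-1)               # (T, 3, 3) frame
    scale = (v1 - v0).norm(dim=-1, keepdim=True)         # (T, 1) edge length
    return origin, rot, scale

def bind_gaussians(local_pos, tri_ids, tri_verts):
    """Explicit component: map Gaussian centers from triangle-local
    coordinates to the deformed world space via their bound triangles."""
    origin, rot, scale = triangle_frames(tri_verts)
    o, r, s = origin[tri_ids], rot[tri_ids], scale[tri_ids]
    return o + s * torch.einsum('nij,nj->ni', r, local_pos)

class DeformationRefiner(torch.nn.Module):
    """Implicit component (hypothetical architecture): predicts a residual
    offset from the deformed position, a triplane feature sampled at that
    position, and the expression code."""
    def __init__(self, feat_dim=32, expr_dim=50, hidden=128):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(3 + feat_dim + expr_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 3))

    def forward(self, pos, feat, expr):
        expr = expr.expand(pos.shape[0], -1)  # share one code across Gaussians
        return pos + self.mlp(torch.cat([pos, feat, expr], dim=-1))

In this reading, the learnable displacement map would enter by offsetting local_pos along the triangle normal before binding, which is one plausible way to realize the displacement map-guided binding described above.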

Results

Demo

BibTeX


@inproceedings{lu2025joint,
  title={Joint Gaussian Deformation in Triangle-Deformed Space for High-Fidelity Head Avatars},
  author={Lu, Jiawei and Guang, Kunxin and Hao, Conghui and Sun, Kai and Yang, Jian and Xie, Jin and Wang, Beibei},
  booktitle={EGSR},
  year={2025}
}