Joint Gaussian Deformation in Triangle-Deformed Space for High-Fidelity Head Avatars

Jiawei Lu1, Kunxin Guang2, Conghui Hao1, Kai Sun3, Jian Yang1, Jin Xie2, Beibei Wang*2
1Nankai University 2Nanjing University 3China Mobile Zijin Innovation Institute
*Corresponding Author

Eurographics Symposium on Rendering (2025)

We propose Joint Gaussian Deformation in Triangle-Deformed Space, which decouples the complex deformation of Gaussians into two deformations that are much simpler to represent and learn: a learnable displacement map-guided Gaussian-triangle binding and a neural-based deformation refinement. Together, they enable high-fidelity animation and high-frequency details in reconstructed head avatars.

Abstract

Creating 3D human heads with mesoscale details and high-fidelity animation from monocular or sparse multi-view videos is challenging. While 3D Gaussian splatting (3DGS) has brought significant benefits to this task thanks to its powerful representation ability and rendering speed, existing works still face several issues, including inaccurate and blurry deformation and a lack of detailed appearance, caused by the difficulty of representing complex deformations and by unreasonable Gaussian placement. In this paper, we propose a joint Gaussian deformation method that decouples the complex deformation into two simpler deformations, incorporating a learnable displacement map-guided Gaussian-triangle binding and a neural-based deformation refinement, which improves the fidelity of animation and the details of reconstructed head avatars. However, renderings of reconstructed head avatars at unseen views still show artifacts due to overfitting on sparse input views. To address this issue, we leverage synthesized pseudo views rendered with fitted textured 3DMMs as priors to initialize Gaussians, which helps maintain a consistent and realistic appearance across various views. As a result, our method outperforms existing state-of-the-art approaches by about 4.3 dB PSNR in novel-view synthesis and about 0.9 dB PSNR in self-reenactment on multi-view video datasets. Our method also preserves high-frequency details, exhibits more accurate deformations, and significantly reduces artifacts in unseen views.
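
To make the pseudo-view prior concrete, below is a minimal sketch, not the authors' code. It assumes the pseudo views have already been rendered from the fitted textured 3DMM as (image, camera) pairs, and render_gaussians stands in for any differentiable 3DGS renderer; both names are hypothetical placeholders.

import torch

def fit_to_pseudo_views(gaussians, params, render_gaussians, pseudo_views,
                        n_steps=500, lr=1e-3):
    """Fit initial Gaussians to pseudo views rendered from a textured 3DMM.

    pseudo_views: list of (image, camera) pairs pre-rendered from the
    fitted 3DMM. render_gaussians: a caller-supplied differentiable 3DGS
    renderer. Both are placeholders, not the paper's API.
    """
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(n_steps):
        idx = torch.randint(len(pseudo_views), (1,)).item()
        image, cam = pseudo_views[idx]
        pred = render_gaussians(gaussians, cam)  # differentiable rendering
        loss = (pred - image).abs().mean()       # L1 photometric loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return gaussians

Because the pseudo views surround the head, supervision of this kind discourages the initial Gaussians from overfitting to the sparse training viewpoints.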

Pipeline


The key to our method is a joint Gaussian deformation, which represents the 3D head deformation of Gaussians with two components: an explicit one and an implicit one. Specifically, in the explicit deformation component, Gaussians parameterized in the triangle space are bound to triangles under the guidance of displacement maps, with their attributes initialized by a synthesized pseudo-view-based Gaussian prior module. The Gaussians are then mapped to the deformed world space via a triangle-Gaussian transformation. In the implicit deformation component, the Gaussian positions, together with a spatial semantic feature encoded by a learnable triplane and the expression code, are fed into a refinement network that predicts offsets, yielding the final refined deformed Gaussians.
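
To make the two components concrete, here is a minimal, self-contained sketch. It assumes a GaussianAvatars-style triangle frame (centroid as origin, a tangent/normal basis as rotation, an edge length as scale) for the explicit binding and a plain MLP for the implicit refinement; the paper's exact parameterization, feature dimensions, and network architecture may differ, and all names below are ours.

import torch
import torch.nn.functional as F

def triangle_frames(tri_verts):
    """tri_verts: (T, 3, 3), the three vertices of each deformed triangle.
    Returns a per-triangle origin (centroid), rotation, and scalar scale."""
    v0, v1, v2 = tri_verts[:, 0], tri_verts[:, 1], tri_verts[:, 2]
    origin = tri_verts.mean(dim=1)                       # (T, 3) centroid
    e1 = F.normalize(v1 - v0, dim=-1)                    # first tangent
    n = F.normalize(torch.cross(v1 - v0, v2 - v0, dim=-1), dim=-1)
    e2 = torch.cross(n, e1, dim=-1)                      # second tangent
    rot = torch.stack([e1, e2, n], dim=-1)               # (T, 3, 3) frame
    scale = (v1 - v0).norm(dim=-1, keepdim=True)         # (T, 1) edge length
    return origin, rot, scale

def bind_gaussians(local_pos, tri_ids, tri_verts):
    """Explicit component: map Gaussian centers from triangle-local
    coordinates to the deformed world space via their bound triangles."""
    origin, rot, scale = triangle_frames(tri_verts)
    o, r, s = origin[tri_ids], rot[tri_ids], scale[tri_ids]
    return o + s * torch.einsum('nij,nj->ni', r, local_pos)

class DeformationRefiner(torch.nn.Module):
    """Implicit component (hypothetical architecture): predicts a residual
    offset from the deformed position, a triplane feature sampled at that
    position, and the expression code."""
    def __init__(self, feat_dim=32, expr_dim=50, hidden=128):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(3 + feat_dim + expr_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 3))

    def forward(self, pos, feat, expr):
        expr = expr.expand(pos.shape[0], -1)  # share one code across Gaussians
        return pos + self.mlp(torch.cat([pos, feat, expr], dim=-1))

In this reading, the learnable displacement map would enter by offsetting local_pos along the triangle normal before binding, which is one plausible way to realize the displacement map-guided binding described above.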

Results

Demo

BibTeX


@inproceedings{lu2025joint,
  title={Joint Gaussian Deformation in Triangle-Deformed Space for High-Fidelity Head Avatars},
  author={Lu, Jiawei and Guang, Kunxin and Hao, Conghui and Sun, Kai and Yang, Jian and Xie, Jin and Wang, Beibei},
  booktitle={EGSR},
  year={2025}
}