Identity-Preserving Face Swapping via Dual Surrogate Generative Models

1Institute of Computing Technology, Chinese Academy of Sciences, China
2AI Lab, Tencent, China
3National Cheng Kung University, Taiwan

Face-swapped images generated by our method, E4S, StyleFusion, and InfoSwap. From top to bottom and left to right: female-to-female, female-to-male, male-to-male, and male-to-female swapping conditions. Our method outperforms these high-fidelity face-swapping models in terms of identity preservation.

Abstract

In this study, we revisit the fundamental setting of face-swapping models and reveal that relying solely on implicit supervision during training makes it difficult for advanced methods to preserve the source identity. We propose a novel reverse pseudo-input generation approach that offers supplemental data for training face-swapping models, addressing the aforementioned issue. Unlike the traditional pseudo-label-based training strategy, we assume that an arbitrary real facial image can serve as the ground-truth output of the face-swapping network, and we generate the corresponding <source, target> input pair. Specifically, we employ a source-creating surrogate that alters the attributes of the real image while keeping its identity, and a target-creating surrogate that synthesizes attribute-preserved target images with different identities. Our framework, which utilizes this proxy-paired data as explicit supervision to direct the face-swapping training process, partially provides a credible and effective optimization direction that boosts the identity-preserving capability. We design explicit and implicit adaptation strategies to better approximate the explicit supervision for face swapping. Quantitative and qualitative experiments on FF++, FFHQ, and in-the-wild images show that our framework improves the performance of various face-swapping pipelines in terms of visual fidelity and identity preservation. Furthermore, we demonstrate applications of our method to re-aging, swappable attribute customization, cross-domain face swapping, and video face swapping.

Method


Comparison between different swapping training frameworks. (a) Face-swapping training with a ground-truth swapped image $y_{gt}$; (b) current swapping training with ID loss and reconstruction loss; (c) our CSCS synthesizes proxy paired data from one real image via dual creating surrogates, and adaptation is applied to obtain the required training data. Credible supervision is provided by $L_{rec}$ between $y_{out}$ and $y_{gt}$, where a real image guides the swapping. The reverse design of the dual surrogates prevents error accumulation in the multi-step synthesis process.
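
To make the data flow concrete, below is a minimal PyTorch-style sketch of the reverse pseudo-input generation described above. The module names (SourceSurrogate, TargetSurrogate) and the placeholder convolutions are our own illustrative stand-ins, not the paper's actual generators, which in practice would be pretrained attribute-editing and identity-replacing networks.

import torch
from torch import nn

class SourceSurrogate(nn.Module):
    """Hypothetical source-creating surrogate: edits the attributes
    (pose, expression, lighting) of a real face while keeping its
    identity. A single conv stands in for a pretrained generator."""
    def __init__(self):
        super().__init__()
        self.edit = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, y_real: torch.Tensor) -> torch.Tensor:
        return self.edit(y_real)

class TargetSurrogate(nn.Module):
    """Hypothetical target-creating surrogate: replaces the identity
    of a real face while keeping its attributes. Again a placeholder
    for a pretrained generator."""
    def __init__(self):
        super().__init__()
        self.reface = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, y_real: torch.Tensor) -> torch.Tensor:
        return self.reface(y_real)

def make_proxy_pair(y_real, g_src, g_tgt):
    """Reverse pseudo-input generation: treat the real image y_real as
    the ground-truth swap result and synthesize the matching
    <source, target> input pair. Both surrogates read the same real
    image, so errors do not accumulate across a synthesis chain."""
    with torch.no_grad():           # surrogates are frozen here
        x_src = g_src(y_real)       # identity of y_real, altered attributes
        x_tgt = g_tgt(y_real)       # attributes of y_real, altered identity
    return x_src, x_tgt, y_real     # inputs plus explicit ground truth

# Example: one batch of proxy training data from real images.
y_real = torch.rand(4, 3, 256, 256)
x_src, x_tgt, y_gt = make_proxy_pair(y_real, SourceSurrogate(), TargetSurrogate())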
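
The explicit supervision itself can then be sketched as one training step: the swapping network consumes the proxy pair and its output $y_{out}$ is pulled toward the real image $y_{gt}$ via a reconstruction loss. TinySwapper and the plain L1 loss are hypothetical simplifications; the actual $L_{rec}$ and swapping backbone depend on the pipeline being trained, and real pipelines typically add perceptual and identity terms on top.

import torch
import torch.nn.functional as F
from torch import nn

class TinySwapper(nn.Module):
    """Minimal stand-in for a face-swapping network F(source, target)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(6, 3, kernel_size=3, padding=1)

    def forward(self, x_src, x_tgt):
        # Channel-wise concatenation is only a toy fusion scheme.
        return self.net(torch.cat([x_src, x_tgt], dim=1))

def swap_training_step(swapper, x_src, x_tgt, y_gt, optimizer, lambda_rec=1.0):
    """One step of explicitly supervised swapping: pull the swapped
    output y_out toward the real image y_gt that the proxy pair was
    generated from. L1 stands in for L_rec here."""
    optimizer.zero_grad()
    y_out = swapper(x_src, x_tgt)
    loss = lambda_rec * F.l1_loss(y_out, y_gt)   # L_rec(y_out, y_gt)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random tensors in place of proxy pairs.
swapper = TinySwapper()
opt = torch.optim.Adam(swapper.parameters(), lr=1e-4)
x_src, x_tgt, y_gt = (torch.rand(2, 3, 256, 256) for _ in range(3))
print(swap_training_step(swapper, x_src, x_tgt, y_gt, opt))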

Image Results

Comparisons on FF++


Comparisons on More Data


Swappable Attribute Customization


Cross-domain Results


Re-aging


Video Results

BibTeX

@article{cscs,
  title={Identity-Preserving Face Swapping via Dual Surrogate Generative Models},
  author={Huang, Ziyao and Tang, Fan and Zhang, Yong and Cao, Juan and Li, Chengyu and Tang, Sheng and Li, Jintao and Lee, Tong-Yee},
  journal={ACM Transactions on Graphics (TOG)}
}