Abstract
Accurate head pose estimation from 2D image data is an essential component of applications
such as driver monitoring systems, virtual reality technology, and human-computer interaction. It enables
a better determination of user engagement and attentiveness. The most accurate head pose estimators
are based on Deep Neural Networks that are trained with the supervised approach and rely primarily
on the accuracy of training data. The acquisition of real head pose data with a wide variation of yaw,
pitch and roll is a challenging task. Publicly available head pose datasets have limitations with respect to
size, resolution, annotation accuracy and diversity. In this work, a methodology is proposed to generate
pixel-perfect synthetic 2D headshot images rendered from high-quality 3D synthetic facial models with
accurate head pose annotations. A diverse range of variations in age, race, and gender is also provided. The
resulting dataset includes more than 300k pairs of RGB images with corresponding head pose annotations.
A wide range of variations in pose, illumination and background is included. The dataset is evaluated
by training a state-of-the-art head pose estimation model and testing it against the popular Biwi evaluation
dataset. The results show that a model trained purely on synthetic data generated with the proposed methodology
achieves results close to those of state-of-the-art head pose estimators, which are trained on real human
facial datasets. As there is a domain gap between the synthetic images and real-world images in the feature
space, initial experimental results fall short of the current state of the art. To reduce the domain gap, a
semi-supervised visual domain adaptation approach is proposed, which simultaneously trains with the labelled
synthetic data and the unlabelled real data. When domain adaptation is applied, a significant improvement in
model performance is achieved. Additionally, by applying a data fusion-based transfer learning approach,
better results are achieved than in previously published work on this topic.
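
The abstract does not name the metric used in the Biwi evaluation. As an illustration only, the sketch below assumes the mean absolute error (MAE) over yaw, pitch and roll in degrees, a common choice for head pose benchmarks. The function names `angular_error_deg` and `head_pose_mae` are hypothetical and are not taken from the paper.

```python
import numpy as np


def angular_error_deg(pred, true):
    """Absolute angular difference in degrees, wrapped so that
    179 and -179 are treated as 2 degrees apart rather than 358."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    diff = (pred - true + 180.0) % 360.0 - 180.0
    return np.abs(diff)


def head_pose_mae(pred, true):
    """Per-axis and overall MAE for (N, 3) arrays of
    [yaw, pitch, roll] angles in degrees (assumed column order)."""
    err = angular_error_deg(pred, true)
    per_axis = err.mean(axis=0)
    return {
        "yaw": float(per_axis[0]),
        "pitch": float(per_axis[1]),
        "roll": float(per_axis[2]),
        "mean": float(per_axis.mean()),
    }


if __name__ == "__main__":
    # Made-up angles for illustration; not results from the paper.
    predictions = [[10.0, -5.0, 0.0], [20.0, 5.0, 179.0]]
    ground_truth = [[12.0, -5.0, 0.0], [18.0, 8.0, -179.0]]
    print(head_pose_mae(predictions, ground_truth))
```

Under this assumed metric, a model trained on the synthetic data and one trained on real data could be compared by calling `head_pose_mae` on each model's predictions for the same test images.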
| Original language | English (Ireland) |
|---|---|
| Pages (from-to) | 37557-37573 |
| Number of pages | 17 |
| Journal | IEEE Access |
| Volume | 9 |
| DOIs | |
| Publication status | Published - 1 Mar 2021 |
Keywords
- Face dataset
- Head pose estimation
- Synthetic face
- Visual domain adaptation
Authors
- Shubhajit Basak, Peter Corcoran, Faisal Khan, Rachel McDonnell, Michael Schukat