
Dataset Cleaning - A Cross Validation Methodology for Large Facial Datasets using Face Recognition

  • University of Galway

Research output: Chapter in Book or Conference Publication/Proceeding › Conference Publication › peer-review

5 Citations (Scopus)

Abstract

In recent years, large 'in the wild' face datasets have been released to facilitate progress in tasks such as face detection and face recognition. Most of these datasets are collected from webpages using automatic procedures, and as a consequence they often contain noisy data. The identity annotations in these datasets are particularly important because they are used to train face recognition algorithms. However, because the datasets are gathered automatically and are very large, many identity folders contain mislabeled samples, which degrades dataset quality. This work presents a semi-automatic method for cleaning large, noisy face datasets using face recognition. The methodology is applied to the CelebA dataset to demonstrate its effectiveness. In addition, the list of mislabeled samples in the CelebA dataset is made available.

Original language: English
Title of host publication: 2020 12th International Conference on Quality of Multimedia Experience, QoMEX 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728159652
Publication status: Published - May 2020
Event: 12th International Conference on Quality of Multimedia Experience, QoMEX 2020 - Athlone, Ireland
Duration: 26 May 2020 to 28 May 2020

Publication series

Name: 2020 12th International Conference on Quality of Multimedia Experience, QoMEX 2020

Conference

Conference: 12th International Conference on Quality of Multimedia Experience, QoMEX 2020
Country/Territory: Ireland
City: Athlone
Period: 26/05/20 to 28/05/20

Keywords

  • CelebA
  • clean face dataset
  • face datasets
  • mislabeled identities
  • noisy samples
  • semi-automatic cleaning
