IBM Diversity in Faces Dataset
Diversity in Faces (DiF), the first dataset of its kind available to the global research community, provides annotations of 1 million human facial images.
Face recognition is a long-standing challenge in the field of Artificial Intelligence (AI). The goal is to create systems that detect, recognize, verify and understand characteristics of human faces. There are significant technical hurdles in making these systems accurate, particularly in unconstrained settings, due to confounding factors related to pose, resolution, illumination, occlusion and viewpoint. However, with recent advances in neural networks, face recognition has achieved unprecedented accuracy, built largely on data-driven deep learning methods.
To advance the study of fairness and accuracy in facial recognition technology, IBM Research has released Diversity in Faces (DiF), a large and diverse dataset intended to accelerate research on the diversity and coverage of data used by AI facial recognition systems. The dataset provides:
- 1 million images of human faces from the publicly available YFCC-100M Creative Commons dataset.
- Annotations of the faces using 10 well-established and independent coding schemes from the scientific literature [1-10]. The coding schemes principally include objective measures of human faces, such as craniofacial features (e.g., head length, nose length, forehead height).
Studying diversity in faces is complex. The dataset provides a jumping-off point for the global research community to further our collective knowledge.
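To make the idea of an objective craniofacial measure concrete, here is a minimal sketch of how a measure such as nose length or forehead height could be derived as the distance between two facial landmarks. It does not use the DiF annotation format, which is not described here: the landmark names, the pixel coordinates, and the choice of which landmarks define each measure are all assumptions made for illustration.

```python
import math

# Hypothetical 2D landmark coordinates in pixels for one face.
# These values are invented for illustration and are not taken from DiF.
landmarks = {
    "trichion": (120.0, 40.0),    # assumed: midpoint of the hairline
    "nasion": (120.0, 95.0),      # assumed: bridge of the nose
    "subnasale": (120.0, 140.0),  # assumed: base of the nose
}


def landmark_distance(points, a, b):
    """Euclidean distance in pixels between two named landmarks."""
    return math.dist(points[a], points[b])


# Assumed definitions: nose length as nasion-to-subnasale,
# forehead height as trichion-to-nasion.
nose_length = landmark_distance(landmarks, "nasion", "subnasale")
forehead_height = landmark_distance(landmarks, "trichion", "nasion")

# A ratio of two distances does not depend on image scale,
# which makes it easier to compare across photos of different sizes.
nose_to_forehead_ratio = nose_length / forehead_height

print(f"nose length: {nose_length:.1f} px")
print(f"forehead height: {forehead_height:.1f} px")
print(f"ratio: {nose_to_forehead_ratio:.3f}")
```

Raw pixel distances depend on image resolution and on how far the subject is from the camera, so ratios between measures are often a more comparable quantity than the distances themselves.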