Improving Face Detectors using Data initiatives

We improve fairness and reduce bias in face detection algorithms by improving the quality of training datasets through a decentralized data collection initiative with incentives for users and verifiers.

Created At

ETHIndia 2022

Winner of

🏊 StackOS — Pool Prize

Project Description

Recent developments in machine learning have shown that successful models rely not only on huge amounts of data but on the right kind of data. In this project we show how this data-centric approach can be carried out in a decentralized manner to enable efficient data collection for algorithms. Face recognition algorithms and detectors are a class of models that suffer heavily from bias, because they have to work across a wide variety of genders, age groups and ethnicities.

We propose a face detection and anonymization approach using a hybrid Multi-task Cascaded CNN (MTCNN), packaged as a model in a Docker container and run on Bacalhau to generate detection boxes along with masks marking the faces in the input images. Our main goal is to improve fairness through a decentralized system of data labeling, correction, and verification by users, creating a robust pipeline for model retraining. This serves the public good, since most datasets are not anonymized in the first place, which causes privacy and security issues.

Data collection is incentivized: only data for underrepresented classes, such as African and South Asian ethnicities, is collected, with a fixed bounty for contributing it. Verifiers then check that the uploaded data has the right annotations, so that the quality of the datasets does not degrade. Finally, the models built from this dataset are indexed and stored using IPFS, so that every time a model is used in an application, the revenue generated flows back to the ecosystem of annotators, verifiers and scientists. This also addresses the problem of unbalanced, single-class-biased datasets by building fairness into data collection. We also try to deploy the trained Docker container using StackOS to help serve the UI tool for uploading the data.
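The bounty and verification flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the class list, the minimum number of verifiers, the majority rule, the 70/30 payout split, and all function and field names are hypothetical assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field

# Hypothetical settings; the project description does not specify these values.
BOUNTY_CLASSES = {"african", "south_asian"}  # underrepresented classes with a bounty
MIN_VERIFIERS = 3                            # assumed minimum number of votes
UPLOADER_SHARE_PERCENT = 70                  # assumed uploader/verifier split


@dataclass
class Submission:
    uploader: str
    class_label: str
    boxes: list                                 # face boxes as (x, y, width, height)
    votes: dict = field(default_factory=dict)   # verifier name -> approved (bool)


def record_vote(sub, verifier, approved):
    """Store one verifier's judgment on whether the annotations are correct."""
    if verifier == sub.uploader:
        raise ValueError("uploaders cannot verify their own submission")
    sub.votes[verifier] = approved


def is_verified(sub):
    """Accept a submission once enough verifiers voted and a strict majority approved."""
    if len(sub.votes) < MIN_VERIFIERS:
        return False
    approvals = sum(1 for ok in sub.votes.values() if ok)
    return approvals * 2 > len(sub.votes)


def payout(sub, bounty):
    """Split an integer bounty between the uploader and the approving verifiers.

    Returns an empty dict if the class carries no bounty or verification failed.
    """
    if sub.class_label not in BOUNTY_CLASSES or not is_verified(sub):
        return {}
    uploader_amount = bounty * UPLOADER_SHARE_PERCENT // 100
    approvers = [v for v, ok in sub.votes.items() if ok]
    per_verifier = (bounty - uploader_amount) // len(approvers)
    result = {sub.uploader: uploader_amount}
    for verifier in approvers:
        result[verifier] = per_verifier
    return result
```

In a real deployment these rules would presumably live in a smart contract rather than in application code; the sketch only shows the order of checks: class eligibility first, then verifier agreement, then payment.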

How it's Made
