A Model Behaviour Mutation Approach to Benchmarking Bias Mitigation Methods
The increasingly wide adoption of Machine Learning (ML) has heightened the importance of tackling bias (i.e., unfairness), making it a primary software engineering concern. In this paper, we introduce Fairea, a model behaviour mutation approach to benchmarking ML bias mitigation methods. We also report on a large-scale empirical study of the effectiveness of 12 widely studied bias mitigation methods. Our results reveal that, surprisingly, the bias mitigation methods show poor effectiveness in 49% of the cases. In particular, 15% of the mitigation cases yield worse fairness-accuracy trade-offs than the baseline established by Fairea, and 34% of the cases show both a decrease in accuracy and an increase in bias.
We make Fairea publicly available so that software engineers and researchers can evaluate their own bias mitigation methods.
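The abstract does not spell out how the behaviour-mutation baseline is constructed. As a minimal illustrative sketch of the general idea, and not Fairea's exact procedure, the code below assumes the baseline is traced by replacing a growing fraction of a trained model's predictions with a fixed label and recording the resulting accuracy and bias; statistical parity difference is used here as one example bias metric, and the function names, mutation schedule, and metric choice are all hypothetical.

```python
import numpy as np

def statistical_parity_difference(y_pred, sensitive):
    """Absolute difference in favourable-outcome (label 1) rates between
    the two groups; assumes both groups are present in `sensitive`."""
    rate_priv = y_pred[sensitive == 1].mean()
    rate_unpriv = y_pred[sensitive == 0].mean()
    return abs(rate_priv - rate_unpriv)

def mutation_baseline(y_pred, y_true, sensitive,
                      mutation_label=0,
                      degrees=np.linspace(0.0, 1.0, 11),
                      seed=0):
    """Hypothetical sketch of a behaviour-mutation baseline: for each
    mutation degree d, replace a random fraction d of the original
    model's predictions with a fixed label, then record the mutated
    model's accuracy and bias. The resulting (degree, accuracy, bias)
    points trace a trade-off curve against which a bias mitigation
    method's fairness-accuracy trade-off can be compared."""
    rng = np.random.default_rng(seed)
    points = []
    for d in degrees:
        mutated = y_pred.copy()
        n_mutate = int(d * len(mutated))
        idx = rng.choice(len(mutated), size=n_mutate, replace=False)
        mutated[idx] = mutation_label
        accuracy = (mutated == y_true).mean()
        bias = statistical_parity_difference(mutated, sensitive)
        points.append((d, accuracy, bias))
    return points
```

Under these assumptions, a mitigation method whose (accuracy, bias) point falls below the curve returned by mutation_baseline achieves a worse trade-off than naive prediction mutation, which is one way to read the abstract's claim that 15% of mitigation cases fall short of the Fairea baseline.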