Using Self-Supervised Learning Can Improve Model Fairness

Abstract

Self-supervised learning (SSL) has emerged as the de facto training paradigm for large models, where pre-training is followed by supervised fine-tuning using domain-specific data and labels. Although SSL achieves performance comparable to supervised methods, comprehensive efforts to assess its impact on machine learning fairness (i.e., performing equally across different demographic groups) are lacking. Hypothesizing that SSL models learn more generic, and hence less biased, representations, this work explores the impact of pre-training and fine-tuning strategies on fairness.

We introduce a fairness assessment framework for SSL, comprising five stages: defining dataset requirements, pre-training, fine-tuning with gradual unfreezing, assessing representation similarity conditioned on demographics, and establishing domain-specific evaluation processes. To evaluate our method's generalizability, we systematically compare hundreds of SSL and fine-tuned models across various dimensions, spanning from intermediate representations to appropriate evaluation metrics, on three real-world human-centric datasets (MIMIC, MESA, and GLOBEM).
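
To make the "fine-tuning with gradual unfreezing" stage concrete, here is a minimal, framework-free sketch of how a family of fine-tuning strategies could be enumerated. It assumes that layers are unfrozen from the output side toward the input and that any task-specific head is always trained; both are illustrative assumptions, and the function name `unfreezing_strategies` and the layer names are hypothetical rather than taken from the paper.

```python
def unfreezing_strategies(layer_names):
    """Enumerate fine-tuning strategies from 'all frozen' to 'all trainable'.

    Assumption (not from the paper): encoder layers are unfrozen starting
    from the one closest to the output, one extra layer per strategy.
    Returns a list of dicts mapping layer name -> trainable flag.
    """
    n = len(layer_names)
    strategies = []
    for k in range(n + 1):
        # The last k layers are trainable; the first n - k stay frozen.
        trainable = set(layer_names[n - k:])
        strategies.append({name: name in trainable for name in layer_names})
    return strategies


# Hypothetical three-block encoder.
layers = ["block1", "block2", "block3"]
strategies = unfreezing_strategies(layers)
for flags in strategies:
    print(flags)
```

Each resulting dictionary could then drive whichever mechanism the training framework offers for freezing parameters, so that one pre-trained encoder yields a whole range of fine-tuned models to compare.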

Our findings demonstrate that SSL can significantly improve model fairness while maintaining performance on par with supervised methods: self-supervision yields up to a 30% increase in fairness with minimal loss in performance. We posit that such differences can be attributed to representation dissimilarities between the best- and the worst-performing demographics across models, which are up to 13 times greater for protected attributes with larger performance discrepancies between segments.

What is the optimal level of fine-tuning for fairness?

Relationship between fairness (deviation from parity) and fine-tuning strategy, as a function of model size. The purely supervised model (dashed line) deviates more from parity, i.e., is more biased, than the best-performing fine-tuned model (i.e., 1 (•◦•)), which was pre-trained with SSL first. The “U-shape” patterns observed in the MIMIC and GLOBEM datasets suggest an optimal level of fine-tuning.
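
As a rough illustration of what "deviation from parity" can mean in practice, the sketch below computes a per-group score and then measures how far the worst group falls behind the best one. The definition used here, one minus the ratio of worst to best group accuracy, is an assumption chosen for simplicity; the exact metric in the paper may differ, and the helper names and toy data are hypothetical.

```python
import numpy as np


def group_accuracies(y_true, y_pred, groups):
    """Accuracy computed separately for each demographic group."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    return {
        g: float(np.mean(y_true[groups == g] == y_pred[groups == g]))
        for g in np.unique(groups).tolist()
    }


def deviation_from_parity(per_group):
    """Assumed definition: 1 - (worst group score / best group score).

    0.0 means all groups score the same; larger values mean more bias.
    """
    scores = list(per_group.values())
    best, worst = max(scores), min(scores)
    return 0.0 if best == 0 else 1.0 - worst / best


# Toy labels and predictions for two hypothetical groups "a" and "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

acc = group_accuracies(y_true, y_pred, groups)
dev = deviation_from_parity(acc)
print(acc, dev)  # group "a" is always right, group "b" half the time
```

Computing this value for every fine-tuning strategy and model size would give the kind of curve described in the caption above, with the supervised baseline as a reference line.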

Do we trade off accuracy for fairness?

AUC-ROC curves across datasets and fine-tuning strategies. The purely supervised models show superior performance, but are closely followed by fine-tuned ones, e.g., 1 (•◦•). The level of post-SSL fine-tuning greatly affects the observed performance.

Conditioned Representation Similarity for layer-level comparisons across subgroups

Language performance gap (b-c) = 1%

Gender performance gap (f-g) = 3%

Race performance gap (j-k) = 18%

Insurance performance gap (n-o) = 16%

We developed "Conditioned Representation Similarity" to compare subgroups on the representation/layer level by extending Centered Kernel Alignment (CKA). Here we show results from MIMIC comparing the best supervised and fine-tuned SSL models, with one figure each for the language, gender, race, and insurance groups. In each figure, the first heatmap shows a balanced random subset, the second and third show the worst- and best-performing groups respectively, and the last shows their delta.
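
The idea of conditioning a representation-similarity measure on a demographic group can be sketched with linear CKA, computed as ||Y^T X||_F^2 / (||X^T X||_F ||Y^T Y||_F) on column-centered activations. The snippet below is a minimal sketch and not the authors' implementation: it assumes linear (rather than kernel) CKA, a single layer per model, and conditioning by simply restricting the samples to one group. The function `conditioned_cka` and the synthetic activations are hypothetical.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_samples, n_features)."""
    # Center each feature so the measure ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def conditioned_cka(acts_a, acts_b, groups, group):
    """CKA between two models' activations, using only samples of one group."""
    mask = np.asarray(groups) == group
    return linear_cka(acts_a[mask], acts_b[mask])


rng = np.random.default_rng(0)
groups = np.array(["a", "b"] * 100)          # 100 samples per group
acts_sup = rng.normal(size=(200, 16))        # stand-in for a supervised layer

# A rotated and rescaled copy: linear CKA should treat it as identical.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
acts_ssl = 3.0 * acts_sup @ q
same = conditioned_cka(acts_sup, acts_ssl, groups, "a")

# Unrelated activations: similarity should be clearly lower.
acts_other = rng.normal(size=(200, 16))
different = conditioned_cka(acts_sup, acts_other, groups, "a")
print(same, different)
```

Repeating this for every pair of layers and for each group would produce per-group heatmaps, and subtracting the worst-group heatmap from the best-group one would give a delta of the kind described above.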

Across the board, for the best-performing groups, both the SSL and supervised models not only excel in performance but also exhibit strikingly similar representations. For the worst-performing groups, on the other hand, the representations diverge notably. The higher the performance gap between segments, the larger the representation gap. To paraphrase Tolstoy's Anna Karenina principle ("All happy families are alike; each unhappy family is unhappy in its own way"): representations of accurate subgroups are alike; each underperforming one is different in its own way.

Discussion

The focus of this work is on evaluating how design choices in SSL impact fairness, rather than proposing new bias mitigation algorithms. Nevertheless, our SSL framework parallels implicit bias mitigation methods. For instance, the pre-training phase acts like a pre-processing method, practically removing discriminatory associations by learning from unlabeled data that are not paired with outcomes. The subsequent fine-tuning phase operates like an in-processing method, controlling the regularization effect on the model's accuracy. Still, it is essential to acknowledge that SSL alone may not eliminate all disparities, especially when trained on poor-quality or biased data.

BibTeX

@inproceedings{yfantidou24kdd,
      title =      {Using Self-supervised Learning can Improve Model Fairness},
      author =     {Sofia Yfantidou and Dimitris Spathis and Marios Constantinides and Athena Vakali and Daniele Quercia and Fahim Kawsar},
      booktitle =  {Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
      year =       {2024}
      }


KDD 2024

Starting from appropriate benchmark selection, we systematically study the impact of pre-training and fine-tuning on algorithmic fairness using a new combination of evaluation and representation learning metrics.