Using Self-Supervised Learning Can Improve Model Fairness

KDD 2024

Nokia Bell Labs · Aristotle University of Thessaloniki

Starting from appropriate benchmark selection, we systematically study the impact of pre-training and fine-tuning on algorithmic fairness using a new combination of evaluation and representation learning metrics.


Self-supervised learning (SSL) has emerged as the de facto training paradigm for large models, where pre-training is followed by supervised fine-tuning using domain-specific data and labels. Despite achieving comparable performance to supervised methods, comprehensive efforts to assess SSL's impact on machine learning fairness (i.e., performing equally across different demographic groups) are lacking. With the hypothesis that SSL models would learn more generic, and hence less biased representations, this work explores the impact of pre-training and fine-tuning strategies on fairness.

We introduce a fairness assessment framework for SSL, comprising five stages: defining dataset requirements, pre-training, fine-tuning with gradual unfreezing, assessing representation similarity conditioned on demographics, and establishing domain-specific evaluation processes. To evaluate our method's generalizability, we systematically compare hundreds of SSL and fine-tuned models across various dimensions, spanning from intermediate representations to appropriate evaluation metrics, on three real-world human-centric datasets (MIMIC, MESA, and GLOBEM).
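The gradual-unfreezing stage can be sketched as follows; this is a framework-agnostic sketch where the layer names and the encoder/head split are illustrative, not the paper's exact architecture:

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    trainable: bool = False

def unfreeze_top(layers, k):
    """Freeze everything, then unfreeze the k layers closest to the head.
    k = 0 recovers linear probing on a frozen encoder;
    k = len(layers) recovers full end-to-end fine-tuning."""
    n = len(layers)
    for i, layer in enumerate(layers):
        layer.trainable = i >= n - k
    return layers

# Fine-tuning strategy 1: only the final block is trainable.
model = [Layer("block1"), Layer("block2"), Layer("block3"), Layer("head")]
unfreeze_top(model, 1)
```

Sweeping k from 0 to the full depth produces the family of fine-tuning strategies compared throughout the figures below.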

Our findings demonstrate that SSL can significantly improve model fairness while maintaining performance on par with supervised methods, exhibiting up to a 30% increase in fairness with minimal loss in performance through self-supervision. We posit that such differences can be attributed to representation dissimilarities found between the best- and the worst-performing demographics across models: up to 13 times greater for protected attributes with larger performance discrepancies between segments.

What is the optimal level of fine-tuning for fairness?

Relationship between fairness (deviation from parity) and fine-tuning strategies, as a function of model size. The purely supervised model shows a greater deviation from parity (dashed line), i.e., increased bias, than the best-performing fine-tuned model (strategy 1, •◦•), which was pre-trained first. The observed “U-shape” patterns in the MIMIC and GLOBEM datasets suggest an optimal level of fine-tuning.
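Deviation from parity can be operationalized in several ways; a minimal sketch, assuming it is measured as the worst-case gap in a per-group metric (which may differ from the paper's exact definition):

```python
def deviation_from_parity(metric_by_group):
    """Worst-case gap between any two demographic groups' scores on a
    metric such as per-group AUC-ROC. 0.0 means perfect parity;
    larger values indicate more bias."""
    values = list(metric_by_group.values())
    return max(values) - min(values)

# Hypothetical per-group scores for illustration only.
deviation_from_parity({"group_a": 0.91, "group_b": 0.84})  # ≈ 0.07
```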

Do we trade off accuracy for fairness?

AUC-ROC curves across datasets and fine-tuning strategies. The purely supervised models show superior performance, but are closely followed by the fine-tuned ones, e.g., strategy 1 (•◦•). The level of post-SSL fine-tuning greatly affects the observed performance.

SSL is fairer with minority segments

The left figure shows the relationship between segment size and performance (AUC-ROC) across datasets. The smaller the segment, the larger the performance discrepancies. Fitted LOWESS curves show that SSL lies closer to the “fair” (dashed) line.
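Per-segment performance of this kind can be computed with a dependency-free, rank-based AUC-ROC estimator; a minimal sketch (in practice one would use `sklearn.metrics.roc_auc_score`, and the segment labels here are illustrative):

```python
def auc_roc(y_true, y_score):
    """Probability that a random positive is scored above a random
    negative (ties count half) -- the rank-based AUC-ROC estimator."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def per_segment_auc(y_true, y_score, segments):
    """AUC-ROC computed separately within each demographic segment,
    assuming every segment contains both classes."""
    out = {}
    for seg in set(segments):
        idx = [i for i, s in enumerate(segments) if s == seg]
        out[seg] = auc_roc([y_true[i] for i in idx],
                           [y_score[i] for i in idx])
    return out
```

Comparing the resulting per-segment scores against segment sizes reproduces the kind of discrepancy analysis shown in the figure.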

SSL is fairer when tuned with more labels

The right figure shows fairness as a function of the amount of labelled data used for fine-tuning in the MIMIC dataset. The SSL model achieves higher fairness both in the low-label regime and with access to more labels for fine-tuning.

Conditioned Representation Similarity for layer-level comparisons across subgroups

We developed "Conditioned Representation Similarity" to compare subgroups at the representation/layer level by extending Centered Kernel Alignment (CKA). Here we show results from MIMIC comparing the best supervised and fine-tuned SSL models, with each figure corresponding to one of the language, gender, race, and insurance groups. The first heatmap shows a balanced random subset, while the second and third show the worst and best groups respectively, followed by their delta.
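The core computation can be sketched as linear CKA (Kornblith et al., 2019) restricted to one subgroup's examples; this is a minimal NumPy sketch, and the paper's exact kernel choice and conditioning protocol may differ:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices
    of shape (n_examples, n_features); 1.0 means identical representational
    structure, 0.0 means none shared."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    numerator = np.linalg.norm(Y.T @ X, "fro") ** 2
    denominator = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return numerator / denominator

def conditioned_cka(X, Y, groups, group):
    """CKA between two models' layer activations, conditioned on the
    examples belonging to one demographic subgroup."""
    mask = np.asarray(groups) == group
    return linear_cka(X[mask], Y[mask])
```

Computing this per layer pair and per subgroup yields heatmaps like those above, whose deltas expose where the best- and worst-performing groups' representations diverge.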

Across the board, for the best-performing groups, both the SSL and supervised models not only excel in performance but also exhibit strikingly similar representations. On the other hand, for the worst-performing groups, the representations diverge notably. The higher the performance gap between segments, the larger the representation gap. To paraphrase Tolstoy's Anna Karenina principle (All happy families are alike; each unhappy family is unhappy in its own way): Representations of accurate subgroups are alike; each underperforming one is different in its own way.


The focus of this work is on evaluating how design choices in SSL impact fairness, rather than proposing new bias mitigation algorithms. However, our SSL framework parallels implicit bias mitigation methods. For instance, the pre-training phase acts akin to pre-processing, practically removing discriminatory associations by learning from unlabeled data that are not paired with outcomes. The subsequent fine-tuning phase operates like an in-processing method, controlling the regularization effect on the model's accuracy. However, it is essential to acknowledge that SSL alone may not eliminate all disparities, especially when trained on poor-quality or biased data.


@inproceedings{yfantidou2024using,
  title     = {Using Self-Supervised Learning Can Improve Model Fairness},
  author    = {Yfantidou, Sofia and Spathis, Dimitris and Constantinides, Marios and Vakali, Athena and Quercia, Daniele and Kawsar, Fahim},
  booktitle = {Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year      = {2024}
}