SFHarmony: Source Free Domain Adaptation for Distributed Neuroimaging Analysis

ICCV 2023
Nicola Dinsdale1
Mark Jenkinson2,3,4
Ana Namburete1,2

1Oxford Machine Learning in NeuroImaging (OMNI) Lab, University of Oxford
2Wellcome Centre for Integrative Neuroimaging (WIN), University of Oxford
3Australian Institute for Machine Learning (AIML), University of Adelaide
4South Australian Health and Medical Research Institute (SAHMRI)
[Paper]
[GitHub]

Abstract

To represent the biological variability of clinical neuroimaging populations, it is vital to be able to combine data across scanners and studies. However, different MRI scanners produce images with different characteristics, resulting in a domain shift known as the 'harmonisation problem'. Additionally, neuroimaging data is inherently personal in nature, leading to data privacy concerns when sharing the data. To overcome these barriers, we propose an Unsupervised Source-Free Domain Adaptation (SFDA) method, SFHarmony. Through modelling the imaging features as a Gaussian Mixture Model and minimising an adapted Bhattacharyya distance between the source and target features, we can create a model that performs well for the target data whilst having a shared feature representation across the data domains, without needing access to the source data for adaptation or target labels. We demonstrate the performance of our method on simulated and real domain shifts, showing that the approach is applicable to classification, segmentation and regression tasks, requiring no changes to the algorithm. Our method outperforms existing SFDA approaches across a range of realistic data scenarios, demonstrating the potential utility of our approach for MRI harmonisation and general SFDA problems.

SFHarmony

We explore an unsupervised DA setting where only the source model, instead of the source data, is provided to the unlabelled target domain for harmonisation, known as Source Free Domain Adaptation (SFDA). This setting inherently protects individual privacy, whilst allowing the efficient incorporation of new sites without requiring target labels. We propose a simple yet effective solution, termed SFHarmony, which aims to match feature embeddings from the source and target, through characterising the embeddings as a Gaussian Mixture Model (GMM) and the use of a modified Bhattacharyya distance. We demonstrate the approach for classification, segmentation and regression tasks.
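As a rough illustration of what characterising the embeddings as a GMM can look like, the sketch below fits an independent one-dimensional GMM to each feature using plain expectation-maximisation. The function names (`fit_gmm_1d`, `fit_feature_gmms`), the number of components, the initialisation and the iteration count are assumptions made for this example and are not taken from the SFHarmony code. Under this view, a site could share the fitted parameters rather than the images themselves.

```python
import math


def fit_gmm_1d(xs, k=2, n_iter=50, eps=1e-6):
    """Fit a k-component 1D GMM to the samples xs with plain EM.

    Returns a list of (weight, mean, variance) tuples, one per component.
    """
    xs = sorted(xs)
    n = len(xs)
    mu_all = sum(xs) / n
    var_all = sum((x - mu_all) ** 2 for x in xs) / n + eps
    # Initialise: means at evenly spaced quantiles, shared variance, equal weights.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(p) + 1e-300  # guard against division by zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        counts = [sum(r[j] for r in resp) + eps for j in range(k)]
        weights = [c / sum(counts) for c in counts]
        means = [
            sum(r[j] * x for r, x in zip(resp, xs)) / counts[j] for j in range(k)
        ]
        variances = [
            sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / counts[j] + eps
            for j in range(k)
        ]
    return list(zip(weights, means, variances))


def fit_feature_gmms(features, k=2):
    """Fit one independent GMM per column of a list of feature rows."""
    return [fit_gmm_1d(column, k=k) for column in zip(*features)]
```

For example, `fit_feature_gmms(rows, k=2)` on a list of per-sample feature vectors returns one list of `(weight, mean, variance)` tuples per feature. A real pipeline would more likely use an array library and batched updates; the pure-Python form is used here only to keep the sketch self-contained.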
Our contributions are as follows:
- We propose a new method for SFDA, SFHarmony, based on aligning feature embeddings, utilising a modified Bhattacharyya distance, requiring no changes to source training;
- We demonstrate the method’s applicability to classification, segmentation and regression tasks, and show that the approach outperforms existing SFDA methods for domain shifts experienced when working with neuroimaging data;
- We demonstrate the robustness of the method to additional challenges likely to be faced when working with real world imaging data: differential privacy and label imbalance.
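To make the feature-alignment idea more concrete, the sketch below computes the standard closed-form Bhattacharyya distance between two univariate Gaussians and then combines it across mixture components by weighting every source-target component pair. The pairwise weighting in `gmm_distance` is an assumption chosen for illustration; it is not claimed to be the modified distance used by SFHarmony, and both function names are hypothetical.

```python
import math


def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Closed-form Bhattacharyya distance between two univariate Gaussians."""
    var_sum = var1 + var2
    mean_term = 0.25 * (mu1 - mu2) ** 2 / var_sum
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def gmm_distance(gmm_a, gmm_b):
    """Weight-averaged pairwise Bhattacharyya distance between two 1D GMMs.

    Each GMM is a list of (weight, mean, variance) tuples. The pairwise
    weighting is an illustrative assumption, not the paper's definition.
    """
    return sum(
        wa * wb * bhattacharyya_gaussian(ma, va, mb, vb)
        for wa, ma, va in gmm_a
        for wb, mb, vb in gmm_b
    )
```

With `source = [(0.5, -2.0, 0.25), (0.5, 3.0, 0.25)]`, calling `gmm_distance(source, target)` gives a larger value when the target component means are shifted away from the source than when they are not. This pairwise form is only an approximation: it is zero for two identical single-component mixtures, but not in general for two identical multi-component mixtures. To be used as an adaptation loss, the same expression would need to be written in an automatic-differentiation framework.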


Results

Classification

We applied synthetic shifts to the OrganMNIST dataset to simulate the shifts expected when working with MRI data. We compared against state-of-the-art methods and showed that our approach outperformed them across a range of batch sizes.


We also considered two additional challenges: differential privacy and label imbalance. We found that our approach was robust to both.

Segmentation

We considered two segmentation tasks using multisite datasets: brain extraction (CC359) and tissue segmentation (ABIDE). Our approach outperformed the existing methods for segmentation, including domain-specific approaches.

Regression

We considered age prediction with the ABIDE data for the regression task. No existing SFDA approaches can be used for regression.


Conclusion

We have presented SFHarmony, a method for SFDA, motivated by the need to harmonise MRI data across imaging sites while relaxing assumptions about the availability of source data. We have demonstrated the applicability of the method to classification, regression, and segmentation tasks, and have shown that it outperforms existing SFDA approaches when applied to MR imaging data. The approach is general, allowing it to be applied across architectures and tasks. Issues may arise due to the increase in the number of features when applying the approach to 3D volumes. Currently, the approach models each feature as an independent GMM, but features will be highly related within a filter, and approaches that utilise these relations should be explored.

Acknowledgements

ND is supported by an Academy of Medical Sciences Springboard Award. MJ is supported by the National Institute for Health Research, Oxford Biomedical Research Centre, and this research was funded by the Wellcome Trust [215573/Z/19/Z]. WIN is supported by core funding from the Wellcome Trust [203139/Z/16/Z]. AN is grateful for support from the Academy of Medical Sciences under the Springboard Awards scheme (SBF005/1136), and the Bill and Melinda Gates Foundation. This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project.