FedHarmony: Unlearning Scanner Bias with Distributed Data

MICCAI 2022 [To be presented]

Nicola Dinsdale¹ Mark Jenkinson²,³,⁴ Ana Namburete²

¹Oxford Machine Learning in NeuroImaging (OMNI) Lab, University of Oxford

²Wellcome Centre for Integrative Neuroimaging (WIN), University of Oxford

³Australian Institute for Machine Learning (AIML), University of Adelaide

⁴South Australian Health and Medical Research Institute (SAHMRI)

[Paper] [GitHub]

Abstract

The ability to combine data across scanners and studies is vital for neuroimaging, to increase both statistical power and the representation of biological variability. However, combining datasets across sites leads to two challenges: first, an increase in undesirable non-biological variance due to scanner and acquisition differences - the harmonisation problem - and second, data privacy concerns due to the inherently personal nature of medical imaging data, meaning that sharing the data across sites may risk violating privacy laws. To overcome these restrictions, we propose FedHarmony: a harmonisation framework operating in the federated learning paradigm. We show that to remove the scanner-specific effects, we only need to share the mean and standard deviation of the learned features, helping to protect individual subjects' privacy. We demonstrate our approach across a range of realistic data scenarios, using real multi-site data from the ABIDE dataset, thus showing the potential utility of our method for MRI harmonisation across studies.

Pipeline

We present an iterative framework for federated harmonisation, building upon our previous approach ("Deep learning-based unlearning of dataset bias for MRI harmonisation and confound removal", NeuroImage 2021). By alternating between optimising for the main task and removing the scanner information, the network learns to perform the task of interest while remaining invariant to the scanner. By modelling the features from each site as Gaussian distributions, we can remove scanner effects without moving the data, and so use our harmonisation framework in a federated setting.
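The idea of sharing only Gaussian summaries of the features can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `summarise_features` and `sample_features`, the toy data, and the assumption that feature dimensions are treated as independent Gaussians are all hypothetical choices made for illustration. Each site reduces its learned features to a per-dimension mean and standard deviation, and synthetic features drawn from those summaries stand in for the real ones when training a scanner classifier.

```python
import random
import statistics


def summarise_features(features):
    """Return per-dimension (means, stds) for one site's feature vectors.

    `features` is a list of equal-length lists of floats, one per subject.
    Only these summaries would leave the site, never the features themselves.
    """
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def sample_features(means, stds, n_samples, rng):
    """Draw synthetic feature vectors from independent per-dimension Gaussians.

    Independence between dimensions is an assumption of this sketch.
    """
    return [
        [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]
        for _ in range(n_samples)
    ]


# Hypothetical toy data: two sites whose features differ by a scanner offset.
rng = random.Random(0)
site_a = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
site_b = [[rng.gauss(2.0, 1.5) for _ in range(4)] for _ in range(200)]

# Each site shares only its summary statistics.
shared = {
    "site_a": summarise_features(site_a),
    "site_b": summarise_features(site_b),
}

# A labelled set of synthetic features, usable to train a scanner classifier
# without any site seeing another site's real features.
synthetic = []
for label, (means, stds) in enumerate(shared.values()):
    for vector in sample_features(means, stds, 100, rng):
        synthetic.append((vector, label))
```

In this sketch the scanner classifier would be trained on `synthetic`, and the feature extractor would then be updated so that the classifier can no longer tell the sites apart, alternating with updates for the main task as described above.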

Results
Age Prediction Task - Fully Supervised: FedHarmony gives the best average performance across sites, while removing most scanner information from the features.
Semisupervised: The framework can also be applied in semisupervised settings, outperforming the oracle thanks to the domain adaptation approach.
