Privacy-preserving generative deep neural networks support clinical data sharing

Brett K. Beaulieu-Jones, Zhiwei Steven Wu, Chris Williams, Ran Lee, Sanjeev P. Bhavnani, James Brian Byrd, Casey S. Greene

Research output: Contribution to journal; Article; peer-reviewed

25 Scopus citations

Abstract

Background: Data sharing accelerates scientific progress, but sharing individual-level data while preserving patient privacy presents a barrier. Methods and Results: Using pairs of deep neural networks, we generated simulated, synthetic participants that closely resemble participants of the SPRINT trial (Systolic Blood Pressure Intervention Trial). We showed that such paired networks can be trained with differential privacy, a formal privacy framework that limits the likelihood that queries of the synthetic participants' data could identify a real participant in the trial. Machine learning predictors built on the synthetic population generalize to the original data set. This finding suggests that the synthetic data can be shared with others, enabling them to perform hypothesis-generating analyses as though they had the original trial data. Conclusions: Deep neural networks that generate synthetic participants facilitate secondary analyses and reproducible investigation of clinical data sets by enhancing data sharing while preserving participant privacy.
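
For readers unfamiliar with the term, the differential privacy guarantee mentioned in the abstract is usually stated as below. This is the standard textbook (ε, δ) definition, not a formula quoted from the article. The symbols M (the randomized training procedure), D and D' (data sets differing in one participant's record), S (a set of possible outputs), ε, and δ are notation assumed here for illustration; the abstract does not give the privacy parameters used in the study.

```latex
% Standard (epsilon, delta)-differential privacy; notation is assumed, not taken from the article.
% M  : randomized mechanism (e.g., the procedure that trains the generative model)
% D, D' : neighboring data sets that differ in a single participant's record
% S  : any measurable set of possible outputs of M
\[
  \Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr] + \delta
  \qquad \text{for all neighboring } D, D' \text{ and all } S .
\]
```

Smaller values of ε and δ mean that the output distribution changes less when any single participant is added or removed, which is the sense in which the framework limits what queries of the synthetic data can reveal about a real participant.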

Original language: English (US)
Article number: e005122
Journal: Circulation: Cardiovascular Quality and Outcomes
Volume: 12
Issue number: 7
State: Published - Jul 1 2019

Bibliographical note

Funding Information:
This study was supported by the Gordon and Betty Moore Foundation under a Data-Driven Discovery Investigator Award to Dr Greene (GBMF 4552). Dr Beaulieu-Jones was supported by a Commonwealth Universal Research Enhancement Program grant from the Pennsylvania Department of Health and by US National Institutes of Health grants AI116794, LM010098, and T15LM007092. Dr Wu is funded in part by a subcontract on the Defense Advanced Research Projects Agency Brandeis project and a grant from the Sloan Foundation. Dr Byrd is funded by US National Institutes of Health grant K23-HL128909.

Publisher Copyright:
© 2019 Lippincott Williams and Wilkins. All rights reserved.

Keywords

  • blood pressure
  • deep learning
  • machine learning
  • privacy
  • propensity score
