Presentation Title
Degree Name
Master of Computer Science and Systems (MCSS)
Department
Institute of Technology
Location
UW Y Center
Start Date
21-5-2015 5:15 PM
End Date
21-5-2015 5:20 PM
Abstract
The predictive potential of the many large datasets held by separate entities in healthcare, financial markets, social media, etc. is locked behind privacy constraints. These entities either cannot share their data with one another or find it against their interests to do so. As a result, powerful predictive models that leverage knowledge from these different data sources cannot be built unless there is a way to do so without revealing the data.
In my talk, I will outline our proposed protocol, in which two different entities jointly build a linear regression model, one of the most popular machine learning models and a technique used throughout both industry and research communities. The resulting model leverages knowledge from both datasets without revealing either party's confidential data.
I will demonstrate how we plan to protect both parties' data throughout our protocols while building a more powerful model than either party could compute in isolation. More data generally leads to better models, which produces benefits in many areas. A relevant example application is healthcare: many hospitals hold a wide variety of data on patients, but privacy restrictions limit these institutions’ ability to leverage this data to build accurate predictive models that could aid the financial and medical well-being of a patient.
We’re training models on a variety of public and private healthcare datasets to simulate this exact scenario and move toward unlocking the power behind these large datasets without compromising the privacy of individuals or the institutions that hold the data.
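As a rough illustration of the general idea only, and not the protocol proposed in the talk, the sketch below assumes that both parties hold different records with the same features, and that each party shares only the aggregate statistics X^T X and X^T y rather than its rows. All function names and the synthetic data are hypothetical. A real privacy-preserving protocol would also have to protect these aggregates (for example with cryptographic techniques), which this sketch does not attempt.

```python
import random


def local_statistics(rows, targets):
    """Compute (X^T X, X^T y) for one party. Only these aggregates are shared."""
    d = len(rows[0])
    xtx = [[0.0] * d for _ in range(d)]
    xty = [0.0] * d
    for x, y in zip(rows, targets):
        for i in range(d):
            xty[i] += x[i] * y
            for j in range(d):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def combine(stats_a, stats_b):
    """Add the two parties' aggregates element-wise."""
    (xtx_a, xty_a), (xtx_b, xty_b) = stats_a, stats_b
    xtx = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(xtx_a, xtx_b)]
    xty = [a + b for a, b in zip(xty_a, xty_b)]
    return xtx, xty


def solve(matrix, rhs):
    """Solve matrix * w = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - tail) / aug[i][i]
    return w


def make_party_data(n, seed):
    """Hypothetical synthetic data: y = 2 + 3 * x, with a constant intercept column."""
    rng = random.Random(seed)
    rows = [[1.0, rng.uniform(-1.0, 1.0)] for _ in range(n)]
    targets = [2.0 + 3.0 * x[1] for x in rows]
    return rows, targets


# Each party keeps its rows private and computes its aggregates locally.
rows_a, y_a = make_party_data(20, seed=1)
rows_b, y_b = make_party_data(30, seed=2)

# Solving the normal equations on the summed aggregates gives the joint model.
joint_stats = combine(local_statistics(rows_a, y_a), local_statistics(rows_b, y_b))
weights = solve(*joint_stats)

# Reference fit on the pooled rows, which the parties could not compute in practice.
pooled_weights = solve(*local_statistics(rows_a + rows_b, y_a + y_b))
```

Because the normal equations depend on the data only through these sums, the jointly fitted `weights` match `pooled_weights` up to floating-point error, even though neither party sees the other's rows.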
Private Predictive Modeling Power