Model Inversion Attacks that Exploit Confidence …

Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures

Matt Fredrikson, Carnegie Mellon University
Somesh Jha, University of Wisconsin-Madison
Thomas Ristenpart, Cornell Tech

ABSTRACT

Machine-learning (ML) algorithms are increasingly utilized in privacy-sensitive applications such as predicting lifestyle choices, making medical diagnoses, and facial recognition. In a model inversion attack, recently introduced in a case study of linear classifiers in personalized medicine by Fredrikson et al. [13], adversarial access to an ML model is abused to learn sensitive genomic information about individuals. Whether model inversion attacks apply to settings outside theirs, however, is unknown. We develop a new class of model inversion attack that exploits confidence values revealed along with predictions. Our new attacks are applicable in a variety of settings, and we explore two in depth: decision trees for lifestyle surveys as used on machine-learning-as-a-service systems and neural networks for facial recognition. In both cases confidence values are revealed to those with the ability to make prediction queries to models.

We experimentally show attacks that are able to estimate whether a respondent in a lifestyle survey admitted to cheating on their significant other and, in the other context, show how to recover recognizable images of people's faces given only their name and access to the ML model. We also initiate experimental exploration of natural countermeasures, investigating a privacy-aware decision tree training algorithm that is a simple variant of CART learning, as well as revealing only rounded confidence values.

The lesson that emerges is that one can avoid these kinds of MI attacks with negligible degradation to utility.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. CCS'15, October 12-16, 2015, Denver, Colorado, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-3832-5/15/10.

INTRODUCTION

Computing systems increasingly incorporate machine learning (ML) algorithms in order to provide predictions of lifestyle choices [6], medical diagnoses [20], facial recognition [1], and more. The need for easy push-button ML has even prompted a number of companies to build ML-as-a-service cloud systems, wherein customers can upload data sets, train classifiers or regression models, and then obtain access to perform prediction queries using the trained model, all over easy-to-use public HTTP interfaces. The features used by these models, and queried via APIs to make predictions, often represent sensitive information. In facial recognition, the features are the individual pixels of a picture of a person's face. In lifestyle surveys, features may contain sensitive information, such as the sexual habits of respondents.

In the context of these services, a clear threat is that providers might be poor stewards of sensitive data, allowing training data or query logs to fall prey to insider attacks or exposure via system compromises.

A number of works have focused on attacks that result from access to (even anonymized) data [18,29,32,38]. A perhaps more subtle concern is that the ability to make prediction queries might enable adversarial clients to back out sensitive data. Recent work by Fredrikson et al. [13] in the context of genomic privacy shows a model inversion attack that is able to use black-box access to prediction models in order to estimate aspects of someone's genotype. Their attack works for any setting in which the sensitive feature being inferred is drawn from a small set.

They only evaluated it in a single setting, and it remains unclear if inversion attacks pose a broader risk. In this paper we investigate commercial ML-as-a-service APIs. We start by showing that the Fredrikson et al. attack, even when it is computationally tractable to mount, is not particularly effective in our new settings. We therefore introduce new attacks that infer sensitive features used as inputs to decision tree models, as well as attacks that recover images from API access to facial recognition services. The key enabling insight across both situations is that we can build attack algorithms that exploit confidence values exposed by the APIs.
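To make the role of confidence values concrete, the following is a minimal sketch and not the estimators evaluated in this paper. It assumes a hypothetical black-box `predict` function that returns one confidence per class label, an adversary who already knows the target's non-sensitive features and associated label, and a sensitive feature drawn from a small candidate set. The names `invert_sensitive_feature` and `toy_predict`, the feature names, and the toy confidence numbers are all illustrative assumptions.

```python
from typing import Callable, Dict, Hashable, Optional, Sequence, Tuple

Features = Dict[str, Hashable]
Confidences = Dict[str, float]


def invert_sensitive_feature(
    predict: Callable[[Features], Confidences],  # hypothetical black-box API
    known: Features,                             # features the adversary knows
    sensitive_name: str,                         # feature to be inferred
    candidates: Sequence[Hashable],              # small set of possible values
    observed_label: str,                         # label known for the target
    prior: Optional[Dict[Hashable, float]] = None,
) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Score each candidate by (prior weight) x (confidence in the observed label)."""
    scores: Dict[Hashable, float] = {}
    for value in candidates:
        query = dict(known)
        query[sensitive_name] = value            # plug in one candidate value
        confidence = predict(query).get(observed_label, 0.0)
        weight = 1.0 if prior is None else prior.get(value, 0.0)
        scores[value] = weight * confidence
    best = max(scores, key=lambda v: scores[v])  # highest-scoring candidate
    return best, scores


def toy_predict(features: Features) -> Confidences:
    # Made-up stand-in for a deployed model: confidence in label "A"
    # is higher when the sensitive feature equals "yes".
    p = 0.8 if features["sensitive"] == "yes" else 0.3
    return {"A": p, "B": 1.0 - p}


guess, scores = invert_sensitive_feature(
    toy_predict, {"age": 40}, "sensitive", ["no", "yes"], "A"
)
print(guess, scores)
```

In this toy setting the candidate "yes" receives the higher score, because the stand-in model is more confident in the observed label when that value is plugged in. The optional `prior` argument shows how marginal knowledge about the sensitive feature could shift the estimate.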

One example from our facial recognition attacks is depicted in Figure 1: an attacker can produce a recognizable image of a person, given only API access to a facial recognition system and the name of the person whose face is recognized by it.

ML APIs and model inversion. We provide an overview of contemporary ML services in Section 2, but for the purposes of discussion here we roughly classify client-side access as being either black-box or white-box. In a black-box setting, an adversarial client can make prediction queries against a model, but not actually download the model description. In a white-box setting, clients are allowed to download a description of the model.

The new generation of ML-as-a-service systems, including general-purpose ones such as BigML [4] and Microsoft Azure Learning [31], allow data owners to specify whether APIs should allow white-box or black-box access to their models.

Figure 1: An image recovered using a new model inversion attack (left) and a training set image of the victim (right). The attacker is given only the person's name and access to a facial recognition system that returns a class confidence score.

Consider a model defining a function f that takes as input a feature vector x1, …
