Knockoff Nets: Stealing Functionality of Black-Box Models, CVPR '19

Tribhuvanesh Orekondy, Bernt Schiele, Mario Fritz

Max Planck Institute for Informatics

Project page: https://resources.mpi-inf.mpg.de/d2/orekondy/knockoff/

License: GNU Lesser General Public License v3.0


Machine Learning (ML) models are increasingly deployed in the wild to perform a wide range of tasks. In this work, we ask to what extent can an adversary steal functionality of such "victim" models based solely on blackbox interactions: image in, predictions out. In contrast to prior work, we present an adversary lacking knowledge of train/test data used by the model, its internals, and semantics over model outputs. We formulate model functionality stealing as a two-step approach: (i) querying a set of input images to the blackbox model to obtain predictions; and (ii) training a "knockoff" with queried image-prediction pairs. We make multiple remarkable observations: (a) querying random images from a different distribution than that of the blackbox training data results in a well-performing knockoff; (b) this is possible even when the knockoff is represented using a different architecture; and (c) our reinforcement learning approach additionally improves query sample efficiency in certain settings and provides performance gains. We validate model functionality stealing on a range of datasets and tasks, as well as on a popular image analysis API where we create a reasonable knockoff for as little as $30.

tl;dr: We highlight the threat that the functionality of black-box CNN models can easily be 'knocked off' under minimal assumptions.
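The two-step approach can be sketched with toy stand-ins for the victim and the knockoff. The scalar "images", function names, and the nearest-neighbour "training" below are purely illustrative assumptions, not the repo's API:

```python
import random

def victim_predict(x):
    """Toy black-box victim: scalar 'image' in, soft labels out."""
    return [0.9, 0.1] if x < 0.5 else [0.1, 0.9]

# Step 1: query the black box with inputs drawn from a surrogate
# distribution (the attacker never sees the victim's training data)
random.seed(0)
transfer_set = [(x, victim_predict(x)) for x in (random.random() for _ in range(200))]

# Step 2: train a "knockoff" on the image-prediction pairs;
# a 1-nearest-neighbour lookup stands in for gradient-based training here
def knockoff_predict(x):
    _, prediction = min(transfer_set, key=lambda pair: abs(pair[0] - x))
    return prediction
```

With enough queries, the knockoff mimics the victim on new inputs despite never touching the victim's internals or training data.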

Installation

Environment

  • Python 3.6
  • PyTorch 1.1

Can be set up as:

$ conda env create -f environment.yml   # anaconda; or
$ pip install -r requirements.txt       # pip

Datasets

You will need six datasets to perform all experiments in the paper, all extracted into the data/ directory.

  • Victim datasets
    • Caltech256 (Link. Images in data/256_ObjectCategories/<classname>/*.jpg)
    • CUB-200-2011 (Link. Images in data/CUB_200_2011/images/<classname>/*.jpg)
    • Indoor Scenes (Link. Images in data/indoor/Images/<classname>/*.jpg)
    • Diabetic Retinopathy (Link. Images in data/diabetic_retinopathy/training_imgs/<classname>/*.jpg)
  • Adversarial datasets
    • ImageNet ILSVRC 2012 (Link. Images in data/ILSVRC2012/training_imgs/<classname>/*.jpg)
    • OpenImages (Link. Images in data/openimages/<classname>/*.jpg)
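A quick sanity check that the datasets are extracted where the paths above expect them (this helper is hypothetical, not part of the repo):

```python
import os

# Expected extraction paths, taken from the dataset list above
EXPECTED_DIRS = [
    "data/256_ObjectCategories",
    "data/CUB_200_2011/images",
    "data/indoor/Images",
    "data/diabetic_retinopathy/training_imgs",
    "data/ILSVRC2012/training_imgs",
    "data/openimages",
]

def missing_datasets(root="."):
    """Return the expected dataset directories not present under root."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(root, d))]

for d in missing_datasets():
    print("missing:", d)
```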

Attack: Overview

The commands/steps below will guide you to:

  1. Train victim models (or download pretrained models)
  2. Train knockoff models (or download pretrained models)
    1. Constructing transfer sets
    2. Training knockoff models

Victim Models

We follow the convention of storing victim models and related data (e.g., logs) under models/victim/P_V-F_V/ (e.g., cubs200-resnet34).
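The P_V-F_V convention can be captured by a small helper (hypothetical; the scripts below take the directory explicitly via -o):

```python
def victim_dir(victim_dataset, victim_arch):
    """Hypothetical helper mirroring the models/victim/P_V-F_V convention,
    where P_V is the victim dataset and F_V the victim architecture."""
    return "models/victim/{}-{}".format(victim_dataset, victim_arch)
```

For example, `victim_dir("cubs200", "resnet34")` yields `models/victim/cubs200-resnet34`.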

Option A: Download Pretrained Victim Models

Zip files (containing resnet-34 pytorch checkpoint .pth.tar, hyperparameters and training logs):

Option B: Train Victim Models

# Format:
$ python knockoff/victim/train.py DS_NAME ARCH -d DEV_ID \
        -o models/victim/VIC_DIR -e EPOCHS --pretrained
# where DS_NAME = {cubs200, caltech256, ...}, ARCH = {resnet18, vgg16, densenet161, ...}
# if the machine contains multiple GPUs, DEV_ID specifies which GPU to use

# More details:
$ python knockoff/victim/train.py --help

# Example (CUB-200):
$ python knockoff/victim/train.py CUBS200 resnet34 -d 1 \
        -o models/victim/cubs200-resnet34 -e 10 --log-interval 25 \
        --pretrained imagenet

Training Knockoff Models

We store the knockoff models and related data (e.g., transfer set, logs) under models/adversary/P_V-F_A-pi/ (e.g., cubs200-resnet50-random).

Transfer Set Construction

# Format
$ python knockoff/adversary/transfer.py random models/victim/VIC_DIR \
        --out_dir models/adversary/ADV_DIR --budget BUDGET \
        --queryset QUERY_SET --batch_size 8 -d DEV_ID
# where QUERY_SET = {ImageNet1k, ...}

# More details
$ python knockoff/adversary/transfer.py --help

# Examples (CUB-200):
# Random
$ python knockoff/adversary/transfer.py random models/victim/cubs200-resnet34 \
        --out_dir models/adversary/cubs200-resnet34-random --budget 80000 \
        --queryset ImageNet1k --batch_size 8 -d 2
# Adaptive
$ python knockoff/adversary/transfer.py adaptive models/victim/cubs200-resnet34 \
        --out_dir models/adversary/cubs200-resnet34-adaptive --budget 80000 \
        --queryset ImageNet1k --batch_size 8 -d 2
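Conceptually, the random policy just samples a budget-sized subset of the query set and records the black box's outputs. A toy sketch (the function and dummy black box below are illustrative, not the repo's RandomAdversary):

```python
import random

def random_transfer_set(queryset, blackbox, budget, seed=0):
    """Toy sketch of the 'random' policy: sample `budget` inputs uniformly
    (without replacement) from the query set, query the black box on each,
    and keep the (input, prediction) pairs as the transfer set."""
    rng = random.Random(seed)
    chosen = rng.sample(queryset, budget)
    return [(x, blackbox(x)) for x in chosen]

# Usage with a dummy query set and black box
pairs = random_transfer_set(list(range(1000)), lambda x: x % 2, budget=10)
```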

Training Knock-offs

# Format:
$ python knockoff/adversary/train.py models/adversary/ADV_DIR ARCH DS_NAME \
        --budgets BUDGET1,BUDGET2,.. -d DEV_ID --pretrained --epochs EPOCHS \
        --lr LR
# DS_NAME refers to the dataset used to train the victim model; it is used
# only to evaluate the knockoff on the test set during training

# More details:
$ python knockoff/adversary/train.py --help

# Example (CUB-200)
$ python knockoff/adversary/train.py models/adversary/cubs200-resnet34-random \
        resnet34 CUBS200 --budgets 60000 -d 0 --pretrained imagenet \
        --log-interval 100 --epochs 200 --lr 0.01 

Pretrained knock-off models

Zip files (containing pytorch checkpoint, transferset pickle file, hyperparameters and logs) can be downloaded using the links below. Specifically, the knockoffs are resnet34s at B=60k using imagenet as the query set ($P_A$).

$F_V$        Random        Adaptive
Caltech256   zip (76.0%)   zip (%)
CUBS200      zip (67.7%)   zip (%)
Indoor67     zip (68.2%)   zip (%)
Diabetic5    zip (43.6%)   zip (%)

Note

Since the current publicly available code uses an updated PyTorch version and has been significantly refactored from the initially published version, expect minor differences in results. Please contact me (see below) if you need the exact pretrained models used in the paper.

Citation

If you found this work or code useful, please cite us:

@inproceedings{orekondy19knockoff,
    TITLE = {Knockoff Nets: Stealing Functionality of Black-Box Models},
    AUTHOR = {Orekondy, Tribhuvanesh and Schiele, Bernt and Fritz, Mario},
    YEAR = {2019},
    BOOKTITLE = {CVPR},
}

Contact

In case of feedback, suggestions, or issues, please contact Tribhuvanesh Orekondy


knockoffnets's Issues

[invalid] Batch training is flawed

When training with multiple budgets in adversary/train.py, the model is not reset between runs. So if you train with a budget of 100 and then 200, the second run continues from a model that has already seen the first 100 samples; the budgets add up, and you effectively train on 100 and then 300 samples.

model = zoo.get_net(model_name, modelfamily, pretrained, num_classes=num_classes)
model = model.to(device)
for b in budgets:
    # train model

Solution: Reset or redefine the model each iteration.
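The issue and the fix can be reproduced with a toy stand-in for the model, where a sample counter substitutes for actual training (zoo.get_net and the training loop are mocked here):

```python
def get_net():
    """Toy stand-in for zoo.get_net(...): returns a fresh 'model'."""
    return {"samples_seen": 0}

def train(model, n_samples):
    """Toy stand-in for training: just count samples consumed."""
    model["samples_seen"] += n_samples

budgets = [100, 200]

# Buggy pattern: one model shared across budgets -- sample counts accumulate
shared = get_net()
for b in budgets:
    train(shared, b)

# Fixed pattern: re-create the model each iteration so every budget
# starts training from scratch
seen_per_budget = []
for b in budgets:
    model = get_net()
    train(model, b)
    seen_per_budget.append(model["samples_seen"])
```

In the buggy version the shared model ends up having seen 300 samples after the "200" run; in the fixed version each budget sees exactly its own number of samples.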

NotImplementedError for adaptive policy?

The adaptive policy is one of the main contributions of the paper; why is it NotImplemented?

if params['policy'] == 'random':
    adversary = RandomAdversary(blackbox, queryset, batch_size=batch_size)
elif params['policy'] == 'adaptive':
    raise NotImplementedError()
else:
    raise ValueError("Unrecognized policy")

Budget Selection

Hi, thanks for sharing the code! My victim model is the Caltech256-pretrained resnet34 you provided, and my adversary dataset is ImageNet. How should I select the budget? Is 10k enough, or should I choose 60k as in the CUB-200 demo? Do you have any advice?

Thanks

Problem about 'adaptive'

Hello, I have read your paper, and I find the 'adaptive' strategy interesting. However, it does not appear to be implemented in the code. I would appreciate it if you could explain how to reproduce it.
