uptake / autofocus

Deep learning computer vision for classifying wildlife in camera trap images

License: BSD 3-Clause "New" or "Revised" License

Python 1.27% Dockerfile 0.03% Shell 0.04% R 0.17% Jupyter Notebook 98.48%
computer-vision deep-learning fastai pytorch conservation-bio camera-traps

autofocus's Introduction


Autofocus

(Image: coyote)

THIS PROJECT IS INACTIVE, AND ITS APP AND DATASETS ARE NO LONGER AVAILABLE.

This project uses deep learning computer vision to label images taken by motion-activated "camera traps" according to the animals they contain. Accurate models for this labeling task can address a major bottleneck for wildlife conservation efforts.


Getting the App

If you just want to get labels for your images, you can use the following steps to run a service that passes images through a trained model.

  1. Make sure Docker is installed and running.
  2. Run docker pull gsganden/autofocus_serve:1.2.3 to download the app image. (Note that it takes up about 4GB of disk space.) This image is no longer available.
  3. Run docker run -p 8000:8000 gsganden/autofocus_serve:1.2.3 to start the app.
  4. Make POST requests against the app to get predictions.

For instance, with the base of this repo as the working directory, you can send the image fawn.JPG to the app with this command:

curl -F "file=@./gallery/fawn.JPG" -X POST http://localhost:8000/predict

Or send the zipped gallery directory to the app with this command:

curl -F "[email protected]" -X POST http://localhost:8000/predict_zip

See autofocus/predict/example_post.py and autofocus/predict/example_post.R for example scripts that make requests using Python and R, respectively.
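
For instance, a minimal Python version of such a request (a sketch along the lines of those scripts, not necessarily their exact contents) might look like this:

import requests

# Sketch: POST one image to the locally running app and print the label
# probabilities. Mirrors the curl command above.
with open("gallery/fawn.JPG", "rb") as f:
    response = requests.post("http://localhost:8000/predict", files={"file": f})
print(response.json())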

For a single image, the app will respond with a JSON object that indicates the model's probability that the image contains an animal in each of the categories that it has been trained on. For instance, it might give the following response for an image containing raccoons:

{
  "beaver": 7.996849172335282e-16,
  "bird": 6.235780460883689e-07,
  "cat": 9.127776934292342e-07,
  "chipmunk": 4.231552441780195e-09,
  "coyote": 2.1184381694183685e-05,
  "deer": 3.6601684314518934e-06,
  "dog": 1.4745426142326323e-06,
  "empty": 0.0026697132270783186,
  "fox": 2.7905798602890358e-14,
  "human": 1.064212392520858e-05,
  "mink": 2.7622977689933936e-13,
  "mouse": 4.847318102463305e-09,
  "muskrat": 6.164089044252078e-16,
  "opossum": 9.763967682374641e-05,
  "rabbit": 2.873173616535496e-05,
  "raccoon": 0.9986177682876587,
  "rat": 4.3888848111350853e-10,
  "skunk": 4.078452775502228e-07,
  "squirrel": 1.2888597211713204e-06,
  "unknown": 0.0004612557531800121,
  "woodchuck": 1.2980818033154779e-14
}

The model generates each of these probabilities separately to allow for the possibility that an image contains, e.g., both a human and a dog, so in general the probabilities will not sum to 1.
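
To turn a response into a set of detected labels, it therefore makes sense to threshold each probability independently rather than take a single argmax. A small helper (hypothetical, not part of the repo) might look like this:

def labels_present(probs, threshold=0.5):
    # Several labels can clear the threshold for a single image (e.g. both
    # "human" and "dog"), because each probability is produced independently.
    return [label for label, p in probs.items() if p >= threshold]

# For the raccoon response above, this returns ["raccoon"].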

The /predict_zip endpoint returns a JSON object mapping file paths to model probabilities formatted as above.
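
Continuing the hypothetical Python sketch above, you could summarize such a response by each file's most probable label:

import requests

# Send the zipped gallery and print each file's top label and probability.
with open("gallery.zip", "rb") as f:
    batch = requests.post("http://localhost:8000/predict_zip", files={"file": f}).json()
for path, probs in batch.items():
    top = max(probs, key=probs.get)  # label with the highest probability
    print(path, top, round(probs[top], 3))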

During development, it is convenient to run the app in debug mode with the local directory mounted to the Docker container so that changes you make locally are reflected in the service immediately:

docker run \
    -it \
    -v "${PWD}/autofocus/predict:/image_api" \
    -p 8000:8000 \
    gsganden/autofocus_serve python app/app.py

Getting the Model

The app described above uses a multilabel fast.ai model. A command here previously downloaded that model directly (it was written to run from the repo root), but the model file is no longer available.

autofocus/train_model/train_multilabel_model.ipynb contains the code that was used to train and evaluate this model.

Getting the Data

The model described above was trained on a set of images provided by the Lincoln Park Zoo's Urban Wildlife Institute that were taken in the Chicago area in mid-summer 2016 and 2017. If you wish to train your own model, you can use the instructions below to download that dataset and other related datasets.

If necessary, create an AWS account, install the AWS CLI tool (pip install awscli), and set up your AWS config and credentials (aws configure). All of the commands below are written to run from the repo root.

A command here previously downloaded a preprocessed version of the Lincoln Park Zoo 2016-2017 dataset to autofocus/data/ (the destination directory was configurable), but that download is no longer available. The command set FILENAME=lpz_2016_2017_processed.tar.gz, which the steps below still reference.

Unpack the tarfile:

mkdir $(pwd)/data/lpz_2016_2017/
tar -xvf $(pwd)/data/${FILENAME} -C $(pwd)/data/lpz_2016_2017/

Delete the tarfile:

rm $(pwd)/data/${FILENAME}

This dataset contains approximately 80,000 images and a CSV of labels and image metadata. It occupies 17.1GB uncompressed, so you will need about 40GB free for the downloading and untarring process. The images have been preprocessed by trimming off the bottom 198 pixels (which often contain a metadata footer that could only mislead a machine learning model) and resizing them to be 512 pixels along their shorter dimension. In addition, the labels have been cleaned up and organized.
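
For illustration, the trimming and resizing could be done with PIL roughly as follows. This is only a sketch of the preprocessing described above; the project's actual implementation is in autofocus/build_dataset/lpz_2016_2017/process_raw.py and may differ in detail.

from PIL import Image

def preprocess(path, footer_px=198, short_side=512):
    img = Image.open(path)
    width, height = img.size
    img = img.crop((0, 0, width, height - footer_px))  # trim the metadata footer
    scale = short_side / min(img.size)  # make the shorter dimension 512 pixels
    return img.resize((round(img.width * scale), round(img.height * scale)))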

If you would like to work with data that has not been preprocessed as described above, replace FILENAME=lpz_2016_2017_processed.tar.gz with FILENAME=data_2016_2017.tar.gz; you will need about 100GB free to download and untar the raw data. This dataset is likewise no longer available. autofocus/build_dataset/lpz_2016_2017/process_raw.py contains the code that was used to generate the processed data from the raw data.

A second dataset from the Lincoln Park Zoo's Urban Wildlife Institute contains approximately 75,000 images (227 x 227 pixels) and a CSV of labels and image metadata from the Chicago area in 2012-2014. It takes up 7.9GB uncompressed. This dataset is also no longer available; the original steps were the same as for the 2016-2017 dataset, except with FILENAME=lpz_2012-2014.tar.gz and this command to unpack the tarfile:

tar -xvf $(pwd)/data/${FILENAME} -C $(pwd)/data/

A third dataset from the Lincoln Park Zoo's Urban Wildlife Institute contains unlabeled three-image bursts from 2018. It takes up 5.7GB uncompressed. This dataset is also no longer available; the original steps were the same as for the 2012-2014 dataset, except with FILENAME=lpz_2018.tar.gz.

Running Tests

To test the app, run pip install -r requirements-dev.txt and then pytest. The tests assume that the app is running locally on port 8000 according to the instructions above.

Example Images

(Images: buck, fawn, raccoons)

autofocus's People

Contributors

bburns632, chronocook, davidwilby, gsganden, jameslamb, jayqi, manusreekumar, mfidino, parsing-science, rahulgurnani, sourcery-ai-bot


autofocus's Issues

Log git diff instead of source file

mlflow is currently logging the entire retrain.py file. It would be sufficient instead to log the git hash of the most recent commit and the diff relative to that commit.
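
A sketch of what that could look like (illustrative code, not the repo's current implementation):

import subprocess

import mlflow

# Record the commit hash and the uncommitted diff instead of the whole file.
commit = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()
diff = subprocess.check_output(["git", "diff", "HEAD"], text=True)
with mlflow.start_run():
    mlflow.log_param("git_commit", commit)
    mlflow.log_text(diff, "uncommitted_changes.diff")  # requires a recent MLflow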

CI is not set up

We should set up Travis CI to run tests on PRs and merges to master.

Incorporate a background subtraction step

E.g., cluster empty images; then, for each new image, identify the closest cluster and apply standard background subtraction techniques. This could be used as a preprocessing step for a classifier or to highlight areas for manual inspection in a GUI.
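
A minimal sketch of the background-subtraction piece using OpenCV, with the clustering step omitted (the function and its arguments are hypothetical):

import cv2

def foreground_mask(empty_frames, new_frame):
    # Fit a background model on "empty" frames from the closest cluster, then
    # flag foreground pixels in the new frame as candidate animal regions.
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    for frame in empty_frames:
        subtractor.apply(frame)
    return subtractor.apply(new_frame)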

Provide option to split by time for testing and validation

Splitting without regard to time can lead to data leakage and give an upwardly biased estimate of how the model will perform in the future. We should use timestamps to avoid splitting within a sequence of images that are close together in time, which might show the same animals in only slightly different positions and poses.
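
A sketch of such a split with pandas (the CSV path and column name here are hypothetical):

import pandas as pd

labels = pd.read_csv("autofocus/data/lpz_2016_2017/labels.csv", parse_dates=["datetime"])
labels = labels.sort_values("datetime")
cutoff = labels["datetime"].quantile(0.8)  # earliest 80% of the timeline
train = labels[labels["datetime"] <= cutoff]
valid = labels[labels["datetime"] > cutoff]
# Splitting at a single timestamp keeps temporally adjacent images (and hence
# near-duplicate frames of the same animals) on the same side of the split.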

Add description in README

It's hard to tell what this project actually does when landing on the front page. Ideally, the first paragraph should explain what the package does.

Try pretraining on Zamba dataset

We are generally using models pretrained on ImageNet. We could use models trained on https://github.com/drivendataorg/zamba (another camera traps project) either instead of those ImageNet-trained models or as an intermediate step after the ImageNet training and before training on our own images. It's plausible that this approach could help because Zamba data is more similar to ours in some way than ImageNet images. One challenge is that the Zamba data takes the form of video clips rather than single images.

Do more augmentations

Given that the animals often occupy only a small part of the image, we may get better performance with heavy augmentation involving high zoom. This kind of approach is often used with medical images, where, e.g., a tumor might occupy a small part of the image.
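
With the fastai v1 API this project is built on, that might look like the following (illustrative values, not tuned):

from fastai.vision import get_transforms

# Emphasize zoom-heavy augmentation so small, distant animals sometimes fill
# more of the crop during training.
tfms = get_transforms(max_zoom=2.0, p_affine=0.9)
# Pass tfms to the data pipeline used in the training notebook.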

Try converting convolutional kernels of pretrained network to grayscale and applying to night images

Camera trap images taken at night use infrared light to create a grayscale image. CNNs pretrained on ImageNet may not handle these images well. One possible approach is to convert the convolutional kernels in one of these networks to grayscale and then finetune on the night images (using a separate network for daytime images).

If straight conversion to grayscale doesn't work, we might try style transfer.

I am not aware of research on applying ImageNet-pretrained models to grayscale or more specifically IR images, so I am not sure that this approach is best.
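
A sketch of the straight grayscale conversion in PyTorch, using a torchvision ResNet as the pretrained network (illustrative, not code from this repo):

import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)
# Collapse the RGB filters of the first conv layer to one channel using
# standard luminance weights, so the network accepts grayscale/IR input.
rgb_weights = torch.tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)
gray_kernels = (model.conv1.weight.data * rgb_weights).sum(dim=1, keepdim=True)
model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.conv1.weight.data = gray_kernels
# Finetune on the night images from here.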

Provide results from a metadata-only model as a baseline

A computer vision model should be able to outperform a model based only on, e.g., time of day and location. If it does not, then it may just be learning to identify those features rather than learning to recognize the animals themselves.
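
A sketch of such a baseline with scikit-learn (the CSV path and column names are hypothetical):

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

labels = pd.read_csv("autofocus/data/lpz_2016_2017/labels.csv", parse_dates=["datetime"])
# Features: hour of day and a one-hot encoding of the camera location.
X = pd.get_dummies(labels[["camera_id"]].assign(hour=labels["datetime"].dt.hour),
                   columns=["camera_id"])
y = labels["species"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(baseline.score(X_test, y_test))  # the vision model should beat this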
