uptake / autofocus

Deep learning computer vision for classifying wildlife in camera trap images

License: BSD 3-Clause "New" or "Revised" License

Python 1.27% Dockerfile 0.03% Shell 0.04% R 0.17% Jupyter Notebook 98.48%
computer-vision deep-learning fastai pytorch conservation-bio camera-traps

autofocus's Introduction


Autofocus

(Image: coyote)

THIS PROJECT IS INACTIVE, AND ITS APP AND DATASETS ARE NO LONGER AVAILABLE.

This project uses deep learning computer vision to label images taken by motion-activated "camera traps" according to the animals they contain. Accurate models for this labeling task can address a major bottleneck for wildlife conservation efforts.


Getting the App

If you just want to get labels for your images, you can use the following steps to run a service that passes images through a trained model.

  1. Make sure Docker is installed and running.
  2. Run docker pull gsganden/autofocus_serve:1.2.3 to download the app image. (Note that it takes up about 4GB of disk space.) This image is no longer available.
  3. Run docker run -p 8000:8000 gsganden/autofocus_serve:1.2.3 to start the app.
  4. Make POST requests against the app to get predictions.

For instance, with the base of this repo as the working directory, you can send the image fawn.JPG to the app with this command:

curl -F "file=@./gallery/fawn.JPG" -X POST http://localhost:8000/predict

Or send the zipped gallery directory to the app with this command:

curl -F "[email protected]" -X POST http://localhost:8000/predict_zip

See autofocus/predict/example_post.py and autofocus/predict/example_post.R for example scripts that make requests using Python and R, respectively.
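
For instance, a minimal Python version of such a request (a sketch along the lines of those scripts, not necessarily their exact contents) might look like this:

import requests

# Sketch: POST one image to the locally running app and print the label
# probabilities. Mirrors the curl command above.
with open("gallery/fawn.JPG", "rb") as f:
    response = requests.post("http://localhost:8000/predict", files={"file": f})
print(response.json())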

For a single image, the app will respond with a JSON object that indicates the model's probability that the image contains an animal in each of the categories that it has been trained on. For instance, it might give the following response for an image containing raccoons:

{
  "beaver": 7.996849172335282e-16,
  "bird": 6.235780460883689e-07,
  "cat": 9.127776934292342e-07,
  "chipmunk": 4.231552441780195e-09,
  "coyote": 2.1184381694183685e-05,
  "deer": 3.6601684314518934e-06,
  "dog": 1.4745426142326323e-06,
  "empty": 0.0026697132270783186,
  "fox": 2.7905798602890358e-14,
  "human": 1.064212392520858e-05,
  "mink": 2.7622977689933936e-13,
  "mouse": 4.847318102463305e-09,
  "muskrat": 6.164089044252078e-16,
  "opossum": 9.763967682374641e-05,
  "rabbit": 2.873173616535496e-05,
  "raccoon": 0.9986177682876587,
  "rat": 4.3888848111350853e-10,
  "skunk": 4.078452775502228e-07,
  "squirrel": 1.2888597211713204e-06,
  "unknown": 0.0004612557531800121,
  "woodchuck": 1.2980818033154779e-14
}

The model generates each of these probabilities separately to allow for the possibility that an image contains, e.g., both a human and a dog, so in general the probabilities will not sum to 1.
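
To turn a response into a set of detected labels, it therefore makes sense to threshold each probability independently rather than take a single argmax. A small helper (hypothetical, not part of the repo) might look like this:

def labels_present(probs, threshold=0.5):
    # Several labels can clear the threshold for a single image (e.g. both
    # "human" and "dog"), because each probability is produced independently.
    return [label for label, p in probs.items() if p >= threshold]

# For the raccoon response above, this returns ["raccoon"].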

The /predict_zip endpoint returns a JSON object mapping file paths to model probabilities formatted as above.
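
Continuing the hypothetical Python sketch above, you could summarize such a response by each file's most probable label:

import requests

# Send the zipped gallery and print each file's top label and probability.
with open("gallery.zip", "rb") as f:
    batch = requests.post("http://localhost:8000/predict_zip", files={"file": f}).json()
for path, probs in batch.items():
    top = max(probs, key=probs.get)  # label with the highest probability
    print(path, top, round(probs[top], 3))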

During development, it is convenient to run the app in debug mode with the local directory mounted to the Docker container so that changes you make locally are reflected in the service immediately:

docker run \
    -it \
    -v "${PWD}/autofocus/predict:/image_api" \
    -p 8000:8000 \
    gsganden/autofocus_serve python app/app.py

Getting the Model

The app described above uses a multilabel fast.ai model. A command here previously downloaded that model directly (it was written to run from the repo root), but the model file is no longer available.

autofocus/train_model/train_multilabel_model.ipynb contains the code that was used to train and evaluate this model.

Getting the Data

The model described above was trained on a set of images provided by the Lincoln Park Zoo's Urban Wildlife Institute that were taken in the Chicago area in mid-summer 2016 and 2017. If you wish to train your own model, you can use the instructions below to download that dataset and other related datasets.

If necessary, create an AWS account, install the AWS CLI tool (pip install awscli), and set up your AWS config and credentials (aws configure). All of the commands below are written to run from the repo root.

A command here previously downloaded a preprocessed version of the Lincoln Park Zoo 2016-2017 dataset to autofocus/data/ (the destination directory was configurable), but that download is no longer available. The command set FILENAME=lpz_2016_2017_processed.tar.gz, which the steps below still reference.

Unpack the tarfile:

mkdir $(pwd)/data/lpz_2016_2017/
tar -xvf $(pwd)/data/${FILENAME} -C $(pwd)/data/lpz_2016_2017/

Delete the tarfile:

rm $(pwd)/data/${FILENAME}

This dataset contains approximately 80,000 images and a CSV of labels and image metadata. It occupies 17.1GB uncompressed, so you will need about 40GB free for the downloading and untarring process. The images have been preprocessed by trimming off the bottom 198 pixels (which often contain a metadata footer that could only mislead a machine learning model) and resizing them to be 512 pixels along their shorter dimension. In addition, the labels have been cleaned up and organized.
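
For illustration, the trimming and resizing could be done with PIL roughly as follows. This is only a sketch of the preprocessing described above; the project's actual implementation is in autofocus/build_dataset/lpz_2016_2017/process_raw.py and may differ in detail.

from PIL import Image

def preprocess(path, footer_px=198, short_side=512):
    img = Image.open(path)
    width, height = img.size
    img = img.crop((0, 0, width, height - footer_px))  # trim the metadata footer
    scale = short_side / min(img.size)  # make the shorter dimension 512 pixels
    return img.resize((round(img.width * scale), round(img.height * scale)))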

If you would like to work with data that has not been preprocessed as described above, replace FILENAME=lpz_2016_2017_processed.tar.gz with FILENAME=data_2016_2017.tar.gz; you will need about 100GB free to download and untar the raw data. This dataset is likewise no longer available. autofocus/build_dataset/lpz_2016_2017/process_raw.py contains the code that was used to generate the processed data from the raw data.

A second dataset from the Lincoln Park Zoo's Urban Wildlife Institute contains approximately 75,000 images (227 x 227 pixels) and a CSV of labels and image metadata from the Chicago area in 2012-2014. It takes up 7.9GB uncompressed. This dataset is also no longer available; the original steps were the same as for the 2016-2017 dataset, except with FILENAME=lpz_2012-2014.tar.gz and this command to unpack the tarfile:

tar -xvf $(pwd)/data/${FILENAME} -C $(pwd)/data/

A third dataset from the Lincoln Park Zoo's Urban Wildlife Institute contains unlabeled three-image bursts from 2018. It takes up 5.7GB uncompressed. This dataset is also no longer available; the original steps were the same as for the 2012-2014 dataset, except with FILENAME=lpz_2018.tar.gz.

Running Tests

To test the app, run pip install -r requirements-dev.txt and then pytest. The tests assume that the app is running locally on port 8000 according to the instructions above.

Example Images

(Images: buck, fawn, raccoons)

autofocus's People

Contributors

bburns632, chronocook, davidwilby, gsganden, jameslamb, jayqi, manusreekumar, mfidino, parsing-science, rahulgurnani, sourcery-ai-bot


autofocus's Issues

Log git diff instead of source file

mlflow is currently logging the entire retrain.py file. It would be sufficient instead to log the git hash of the most recent commit and the diff relative to that commit.
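
A sketch of what that could look like (illustrative code, not the repo's current implementation):

import subprocess

import mlflow

# Record the commit hash and the uncommitted diff instead of the whole file.
commit = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()
diff = subprocess.check_output(["git", "diff", "HEAD"], text=True)
with mlflow.start_run():
    mlflow.log_param("git_commit", commit)
    mlflow.log_text(diff, "uncommitted_changes.diff")  # requires a recent MLflow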

CI is not set up

We should set up Travis CI to run tests on PRs and merges to master.

Incorporate a background subtraction step

E.g., cluster empty images; then, for each new image, identify the closest cluster and apply standard background subtraction techniques. This could be used as a preprocessing step for a classifier or to highlight areas for manual inspection in a GUI.
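
A minimal sketch of the background-subtraction piece using OpenCV, with the clustering step omitted (the function and its arguments are hypothetical):

import cv2

def foreground_mask(empty_frames, new_frame):
    # Fit a background model on "empty" frames from the closest cluster, then
    # flag foreground pixels in the new frame as candidate animal regions.
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    for frame in empty_frames:
        subtractor.apply(frame)
    return subtractor.apply(new_frame)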

Provide option to split by time for testing and validation

Splitting without regard to time can lead to data leakage and give an upwardly biased estimate of how the model will perform in the future. We should use timestamps to avoid splitting within a sequence of images that are close together in time, which might show the same animals in only slightly different positions and poses.
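
A sketch of such a split with pandas (the CSV path and column name here are hypothetical):

import pandas as pd

labels = pd.read_csv("autofocus/data/lpz_2016_2017/labels.csv", parse_dates=["datetime"])
labels = labels.sort_values("datetime")
cutoff = labels["datetime"].quantile(0.8)  # earliest 80% of the timeline
train = labels[labels["datetime"] <= cutoff]
valid = labels[labels["datetime"] > cutoff]
# Splitting at a single timestamp keeps temporally adjacent images (and hence
# near-duplicate frames of the same animals) on the same side of the split.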

Add description in README

It's hard to tell what this project actually does when landing on the front page. Ideally, the first paragraph should explain what the package does.

Try pretraining on Zamba dataset

We are generally using models pretrained on ImageNet. We could use models trained on https://github.com/drivendataorg/zamba (another camera traps project) either instead of those ImageNet-trained models or as an intermediate step after the ImageNet training and before training on our own images. It's plausible that this approach could help because Zamba data is more similar to ours in some way than ImageNet images. One challenge is that the Zamba data takes the form of video clips rather than single images.

Do more augmentations

Given that the animals often occupy only a small part of the image, we may get better performance with heavy augmentation involving high zoom. This kind of approach is often used with medical images, where, e.g., a tumor might occupy a small part of the image.
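
With the fastai v1 API this project is built on, that might look like the following (illustrative values, not tuned):

from fastai.vision import get_transforms

# Emphasize zoom-heavy augmentation so small, distant animals sometimes fill
# more of the crop during training.
tfms = get_transforms(max_zoom=2.0, p_affine=0.9)
# Pass tfms to the data pipeline used in the training notebook.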

Try converting convolutional kernels of pretrained network to grayscale and applying to night images

Camera trap images taken at night use infrared light to create a grayscale image. CNNs pretrained on ImageNet may not handle these images well. One possible approach is to convert the convolutional kernels in one of these networks to grayscale and then finetune on the night images (using a separate network for daytime images).

If straight conversion to grayscale doesn't work, we might try style transfer.

I am not aware of research on applying ImageNet-pretrained models to grayscale or more specifically IR images, so I am not sure that this approach is best.
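
A sketch of the straight grayscale conversion in PyTorch, using a torchvision ResNet as the pretrained network (illustrative, not code from this repo):

import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)
# Collapse the RGB filters of the first conv layer to one channel using
# standard luminance weights, so the network accepts grayscale/IR input.
rgb_weights = torch.tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)
gray_kernels = (model.conv1.weight.data * rgb_weights).sum(dim=1, keepdim=True)
model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.conv1.weight.data = gray_kernels
# Finetune on the night images from here.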

Provide results from a metadata-only model as a baseline

A computer vision model should be able to outperform a model based only on, e.g., time of day and location. If it does not, then it may just be learning to identify those features rather than learning to recognize the animals themselves.
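
A sketch of such a baseline with scikit-learn (the CSV path and column names are hypothetical):

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

labels = pd.read_csv("autofocus/data/lpz_2016_2017/labels.csv", parse_dates=["datetime"])
# Features: hour of day and a one-hot encoding of the camera location.
X = pd.get_dummies(labels[["camera_id"]].assign(hour=labels["datetime"].dt.hour),
                   columns=["camera_id"])
y = labels["species"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(baseline.score(X_test, y_test))  # the vision model should beat this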
