The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024)

License: MIT License


Event Penguins (CVPR 2024)

This is the official repository for Low-power, Continuous Remote Behavioral Localization with Event Cameras, accepted at CVPR 2024, by Friedhelm Hamann, Suman Ghosh, Ignacio Juarez Martinez, Tom Hart, Alex Kacelnik, and Guillermo Gallego.


Citation

If you use this work in your research, please consider citing:

@InProceedings{Hamann24cvpr,
    author    = {Hamann, Friedhelm and Ghosh, Suman and Martinez, Ignacio Juarez and Hart, Tom and Kacelnik, Alex and Gallego, Guillermo},
    title     = {Low-power Continuous Remote Behavioral Localization with Event Cameras},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {18612-18621}
}

Quickstart

Setup

You can use Miniconda to set up an environment:

conda create --name eventpenguins python=3.8
conda activate eventpenguins

Install PyTorch by choosing a command that matches your CUDA version. You can find the compatible commands on the PyTorch official website (tested with PyTorch 2.2.2), e.g.:

conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia

Install other required packages:

pip install -r requirements.txt
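Before running the scripts, a quick sanity check that the key dependencies resolved can look like this. The package names here are assumptions: torch is implied by the setup steps above, and the others would come from requirements.txt.

```python
import importlib.util

# List which of the expected packages are NOT importable in this environment.
# The package names below are assumptions: "torch" is implied by the setup
# steps above; the others would come from requirements.txt.
def missing(packages):
    return [p for p in packages if importlib.util.find_spec(p) is None]

print(missing(["torch", "h5py", "yaml"]))  # [] once setup succeeded
```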

Preprocessing the data

Create a folder for the data:

cd <project-root>
mkdir data

Download the data, save it in <project-root>/data, and create the pre-processed dataset with the following command:

python scripts/preprocess.py --data_root data/EventPenguins --output_dir data --recording_info_path config/annotations/recording_info.csv

This crops the events according to the pre-annotated nests and stores the recordings according to the split specified in the paper.
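The cropping step can be sketched roughly as follows. This is a hypothetical illustration, not the repository's actual preprocessing code; the [x, y, t, p] event layout is taken from the dataset description later in this README, and the function name is made up.

```python
import numpy as np

# Hypothetical sketch of cropping an event stream to one nest's region of
# interest: keep events inside the ROI's bounding box and shift them to
# ROI-local coordinates. Events are rows of [x, y, t, p].
def crop_events_to_roi(events, x0, y0, width, height):
    x, y = events[:, 0], events[:, 1]
    inside = (x >= x0) & (x < x0 + width) & (y >= y0) & (y < y0 + height)
    cropped = events[inside].copy()
    cropped[:, 0] -= x0  # shift to ROI-local x
    cropped[:, 1] -= y0  # shift to ROI-local y
    return cropped
```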

Inference

Create a folder models (mkdir models), download the pre-trained model weights from here, and save them in the models folder. Then run inference with the following command:

python scripts/inference.py --config config/exp/inference.yaml --verbose

Details

Original Data

The EventPenguins dataset contains 24 ten-minute recordings with 16 annotated nests. An overview of the data can be found in config/annotations/recording_info.csv. Each recording has a roi_group_id, which links to the locations of the 16 pre-annotated regions of interest in config/annotations/rois (a new set of ROIs was annotated whenever the camera was moved). The dataset is structured as follows:

|── EventPenguins
│   ├── <yy-mm-dd>_<hh-mm-ss>        # (these folders are referred to as "recordings")
│   │   ├── frames/
│   │   │   ├── 000000000000.png
│   │   │   ├── 000000000001.png
│   │   │   └── ...
│   │   ├── events.h5
│   │   ├── frame_timestamps.txt     # [us]
│   │   └── metadata.yaml
│   ├── ...

Please note that we do not use the grayscale frames in our method but provide them for completeness.
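A hypothetical helper for walking the raw layout above, e.g. loading a recording's frame timestamps and converting them from microseconds to seconds. The "*_*" glob pattern and the helper name are assumptions based on the naming scheme shown.

```python
from pathlib import Path

# Load a recording's frame timestamps (stored in microseconds, per the
# layout above) and convert them to seconds.
def frame_times_s(recording_dir):
    lines = (recording_dir / "frame_timestamps.txt").read_text().split()
    return [int(t) / 1e6 for t in lines]

# Iterate over recordings; the "*_*" pattern assumes the
# <yy-mm-dd>_<hh-mm-ss> folder naming shown above.
for rec in sorted(Path("data/EventPenguins").glob("*_*")):
    print(rec.name, "first frame at", frame_times_s(rec)[0], "s")
```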

Pre-processed Data

Structure

The processed data is stored in a single HDF5 file named preprocessed.h5. The file structure is organized as follows:

  • Each ten-minute recording is stored in a group labeled by its timestamp (e.g., 22-01-12_17-26-00).
  • Each group (timestamp) contains multiple subgroups, each corresponding to a specific ROI (nest) identified by an ID (e.g., N01).
  • Each ROI subgroup contains:
    • An events dataset, where each event is represented as a row [x, y, t, p] indicating the event's x-position, y-position, timestamp (us), and polarity, respectively.
    • Attributes height and width indicating the dimensions of the ROI.

Attributes

Each subgroup (ROI) has the following attributes:

  • height: The height of the ROI in pixels.
  • width: The width of the ROI in pixels.

Each main group (recording timestamp) has the following attribute:

  • split: Indicates the data split (e.g., train, test, validate) that the recording belongs to.
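Reading this layout with h5py might look like the sketch below. The file name and field order follow this README; the tiny in-memory file is only a stand-in built to mirror the documented structure, and the helper name is made up.

```python
import numpy as np
import h5py

# Return {recording: {roi: num_events}} for every ROI subgroup, following
# the documented preprocessed.h5 layout.
def list_rois(h5file):
    out = {}
    for rec_name, rec in h5file.items():
        out[rec_name] = {roi_name: roi["events"].shape[0]
                         for roi_name, roi in rec.items()}
    return out

# Build a miniature in-memory stand-in mirroring the layout above.
f = h5py.File("demo.h5", "w", driver="core", backing_store=False)
rec = f.create_group("22-01-12_17-26-00")
rec.attrs["split"] = "train"
roi = rec.create_group("N01")
roi.attrs["height"], roi.attrs["width"] = 64, 64
roi.create_dataset("events", data=np.array([[3, 4, 100, 1]]))  # [x, y, t, p]

print(list_rois(f))  # {'22-01-12_17-26-00': {'N01': 1}}
```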

Annotations

The annotations are in config/annotations/annotations.json. The structure is very similar to ActivityNet, with an additional layer to consider different nests.

{
  "version": "VERSION 0.0",
  "database": {
    "<yy-mm-dd>_<hh-mm-ss>": {
      "annotations": {
        "<roi_id>": [
          {
            "label": <label>,
            "segment": [
              <t_start>,
              <t_end>
            ]
          },
          ...
        ]
      }
    },
    ...
  }
}

  • <yy-mm-dd>_<hh-mm-ss> is the identifier for a ten-minute recording
  • roi_id is an integer encoding the nest
  • t_start and t_end are the start and end of an action in seconds
  • label is one of ["ed", "adult_flap", "chick_flap"]

"adult_flap" and "chick_flap" are other types of wing flapping easily confused with the ecstatic display (ed). We provide the labels for completeness but they are not considered in our method.

Acknowledgements

The evaluation for activity detection is largely inspired by ActivityNet. We thank the authors for their excellent work.

In the Press

Additional Resources
