
FreqMAE: Frequency-Aware Masked Autoencoder for Multi-Modal IoT Sensing Applications

Authors: Denizhan Kara, Tomoyoshi Kimura, Shengzhong Liu, Jinyang Li, Dongxin Liu, Tianshi Wang, Ruijie Wang, Yizhuo Chen, Yigong Hu, Tarek Abdelzaher

Paper: [pdf]

Overview

This paper presents FreqMAE, a novel self-supervised learning framework that integrates masked autoencoding with physics-informed insights for IoT sensing applications. It captures feature patterns from multi-modal IoT sensing signals and enriches the latent feature space, reducing reliance on data labeling and improving AI task accuracy. Unlike methods that depend on data augmentations, FreqMAE eliminates the need for handcrafted transformations by drawing on insights from the frequency domain. We showcase three primary contributions:

  1. Temporal-Shifting Transformer (TS-T) Encoder: Enables temporal interactions while distinguishing different frequency regions.
  2. Factorized Multimodal Fusion: Leverages cross-modal correlations while preserving modality-specific features.
  3. Hierarchically Weighted Loss Function: Prioritizes reconstruction of crucial frequency components and high-SNR samples (a rough sketch follows this list).
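
As a rough illustration of the third contribution, the snippet below sketches a frequency-weighted masked-reconstruction loss in PyTorch. The weighting scheme (a fixed per-bin weight vector) is a simplified stand-in for the paper's hierarchical weighting, not the repository's exact implementation:

    import torch

    def weighted_masked_recon_loss(pred, target, mask, freq_weights):
        # pred, target: (batch, freq_bins, time_steps) spectrograms
        # mask: same shape, 1.0 where a patch was masked and must be reconstructed
        # freq_weights: (freq_bins,) importance weight per frequency bin (assumed given)
        err = (pred - target) ** 2                   # element-wise squared error
        err = err * freq_weights[None, :, None]      # emphasize important frequency bands
        # average only over masked positions, as in standard MAE training
        return (err * mask).sum() / mask.sum().clamp(min=1)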

Comprehensive evaluations on sensing applications confirm FreqMAE's effectiveness in reducing labeling requirements and enhancing domain shift resilience.

Installation and Requirements

  1. Dependencies: Create a conda environment with Python 3.10 for isolation.

    conda create -n freqmae_env python=3.10; conda activate freqmae_env
  2. Installation: Clone the repository and install the necessary packages.

    git clone [repo] freqmae_dir
    cd freqmae_dir
    pip install -r requirements.txt

Dataset

We evaluate FreqMAE on IoT sensing applications, including Moving Object Detection (MOD), a self-collected dataset that uses acoustic and seismic signals for vehicle classification. Data preprocessing involves spectrogram generation and masking, detailed further in the supplementary materials.
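
As a simplified sketch of that preprocessing step, the snippet below computes an STFT magnitude spectrogram and applies random MAE-style patch masking. The STFT settings, patch size, and mask ratio are assumptions for illustration, not the repository's actual configuration:

    import numpy as np
    from scipy.signal import stft

    def make_masked_spectrogram(signal, fs, nperseg=256, mask_ratio=0.75, patch=4, seed=0):
        # Compute the magnitude spectrogram: (freq_bins, time_steps)
        _, _, Z = stft(signal, fs=fs, nperseg=nperseg)
        spec = np.abs(Z)
        f_bins, t_bins = spec.shape
        # Randomly mask non-overlapping patch x patch blocks, MAE-style
        rng = np.random.default_rng(seed)
        n_pf, n_pt = f_bins // patch, t_bins // patch
        chosen = rng.choice(n_pf * n_pt, size=int(mask_ratio * n_pf * n_pt), replace=False)
        mask = np.zeros_like(spec)
        for k in chosen:
            i, j = divmod(k, n_pt)
            mask[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = 1.0
        return spec * (1 - mask), mask  # masked input and the mask itself

    # e.g., one second of the 8000 Hz acoustic channel (synthetic data here)
    masked_spec, mask = make_masked_spectrogram(np.random.randn(8000), fs=8000)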

Moving Object Detection (MOD)

MOD is a self-collected dataset that uses acoustic (8000 Hz) and seismic (100 Hz) signals to classify types of moving vehicles. The pre-training dataset includes data from 10 classes, and the downstream tasks include vehicle classification, distance classification, and speed classification.

Data Preprocessing

  1. Update the configuration in src/data_preprocess/MOD/preprocessing_configs.py.
  2. Navigate to the data preprocessing directory:
    cd src/data_preprocess/MOD/
  3. Execute the sample extraction and data partitioning scripts as detailed below.

Sample Extraction and Data Partition

  • Pretrain Data Extraction: Generate samples for unsupervised pretraining.
    python extract_pretrain_samples.py
  • Supervised Data Extraction: Generate labeled samples for supervised fine-tuning.
    python extract_samples.py
  • Data Partitioning: Partition data into training, validation, and test sets (an illustrative sketch follows this list).
    python partition_data.py
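
As a generic illustration of the partitioning step, the sketch below splits sample indices into train/validation/test subsets; the actual partition_data.py may split differently (e.g., by run or by vehicle), and the fractions here are assumptions:

    import numpy as np

    def partition_indices(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
        # Shuffle sample indices, then carve off test and validation slices
        idx = np.random.default_rng(seed).permutation(n_samples)
        n_test, n_val = int(test_frac * n_samples), int(val_frac * n_samples)
        return idx[n_test + n_val:], idx[n_test:n_test + n_val], idx[:n_test]

    train_idx, val_idx, test_idx = partition_indices(10000)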

Usage

Training and Fine-Tuning

  • Supervised Training: Use the following command for training with a specific model and dataset.
    python train.py -model=[MODEL] -dataset=[DATASET]
  • FreqMAE Pre-Training: Pre-train the model using FreqMAE.
    python train.py -model=[MODEL] -dataset=[DATASET] -learn_framework=FreqMAE
  • Fine-Tuning: After pre-training, fine-tune for a specific task (a concrete example follows this list).
    python train.py -model=[MODEL] -dataset=[DATASET] -learn_framework=FreqMAE -task=[TASK] -stage=finetune -model_weight=[PATH TO MODEL WEIGHT]
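
For example, a full pre-train-then-fine-tune sequence might look like the commands below, where the model name DeepSense, the dataset name MOD, the task name vehicle_classification, and the weight path are illustrative placeholders; check the repository's argument parser for the values it actually accepts.

    python train.py -model=DeepSense -dataset=MOD -learn_framework=FreqMAE
    python train.py -model=DeepSense -dataset=MOD -learn_framework=FreqMAE -task=vehicle_classification -stage=finetune -model_weight=/path/to/pretrained_weight.pt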

Model Configurations

See src/data/*.yaml for model configurations specific to each dataset.

License

This project is released under the MIT License. See LICENSE for details.

Citation

Please cite our paper if you use this code or dataset in your research:

@inproceedings{freqmae2024,
  title={FreqMAE: Frequency-Aware Masked Autoencoder for Multi-Modal IoT Sensing Applications},
  author={Kara, Denizhan and Kimura, Tomoyoshi and Liu, Shengzhong and Li, Jinyang and Liu, Dongxin and Wang, Tianshi and Wang, Ruijie and Chen, Yizhuo and Hu, Yigong and Abdelzaher, Tarek},
  booktitle={Proceedings of the ACM Web Conference 2024},
  year={2024}
}

Contact

For any inquiries regarding the paper or the code, please reach out to the authors.
