Authors: Denizhan Kara, Tomoyoshi Kimura, Shengzhong Liu, Jinyang Li, Dongxin Liu, Tianshi Wang, Ruijie Wang, Yizhuo Chen, Yigong Hu, Tarek Abdelzaher
Link: [pdf]
This paper presents FreqMAE, a novel self-supervised learning framework that integrates masked autoencoding with physics-informed insights for IoT sensing applications. It captures feature patterns in multi-modal IoT sensing signals and enriches the latent feature space, reducing reliance on data labeling and improving accuracy on downstream AI tasks. Unlike methods that depend on data augmentations, FreqMAE eliminates the need for handcrafted transformations by drawing on insights from the frequency domain. We showcase three primary contributions:
- Temporal-Shifting Transformer (TS-T) Encoder: Enables temporal interactions while distinguishing different frequency regions.
- Factorized Multimodal Fusion: Leverages cross-modal correlations while preserving modality-specific features.
- Hierarchically Weighted Loss Function: Prioritizes reconstruction of crucial frequency components and high SNR samples.
Comprehensive evaluations on sensing applications confirm FreqMAE's effectiveness in reducing labeling requirements and enhancing domain shift resilience.
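For intuition, the sketch below shows plain uniform random masking of time-frequency spectrogram patches, the generic masked-autoencoding step that FreqMAE builds on. It is illustrative only: it does not implement the paper's frequency-aware masking scheme or the TS-T encoder, and every name in it (e.g., `mask_spectrogram_patches`) is ours, not the repository's.

```python
import torch

def mask_spectrogram_patches(spec, patch_size=(16, 16), mask_ratio=0.75):
    """Illustrative uniform random patch masking for a spectrogram MAE.

    Note: FreqMAE's actual masking is frequency-aware; this sketch only
    shows the generic mechanism. `spec` is (batch, channels, freq, time),
    with freq/time divisible by patch_size. Returns the masked spectrogram
    and a boolean patch-level mask (True = masked), which is what the
    reconstruction loss would be restricted to.
    """
    b, c, f, t = spec.shape
    pf, pt = patch_size
    nf, nt = f // pf, t // pt
    n_patches = nf * nt
    n_masked = int(mask_ratio * n_patches)

    # Random permutation per batch element; first n_masked indices are masked.
    scores = torch.rand(b, n_patches, device=spec.device)
    mask_flat = torch.zeros(b, n_patches, dtype=torch.bool, device=spec.device)
    mask_flat.scatter_(1, scores.argsort(dim=1)[:, :n_masked], True)

    # Expand the patch mask back to pixel resolution and zero masked regions.
    mask = mask_flat.view(b, 1, nf, nt)
    mask = mask.repeat_interleave(pf, dim=2).repeat_interleave(pt, dim=3)
    return spec.masked_fill(mask, 0.0), mask_flat
```

Restricting the loss to the returned masked patches is what forces the encoder to infer missing time-frequency content from context rather than copy the input.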
Dependencies: Create a conda environment with Python 3.10 for isolation.

```bash
conda create -n freqmae_env python=3.10
conda activate freqmae_env
```
Installation: Clone the repository and install the necessary packages.

```bash
git clone [repo] freqmae_dir
cd freqmae_dir
pip install -r requirements.txt
```
We evaluate FreqMAE on two IoT sensing applications, including Moving Object Detection (MOD). MOD is a self-collected dataset that uses acoustic (8000 Hz) and seismic (100 Hz) signals to classify types of moving vehicles; its pre-training set covers 10 classes, and its downstream tasks include vehicle classification, distance classification, and speed classification. Data preprocessing involves spectrogram generation and masking, detailed further in the supplementary materials.
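As a rough illustration of the spectrogram-generation step, here is a minimal sketch using `torch.stft`. The FFT window and hop sizes are our assumptions and not the repository's exact preprocessing parameters; only the sampling rates come from the MOD description above.

```python
import torch

# Sampling rates from the MOD description; the window/hop sizes below are
# illustrative assumptions, not the repository's actual preprocessing settings.
SAMPLE_RATES = {"audio": 8000, "seismic": 100}

def to_spectrogram(waveform, n_fft=256, hop_length=128):
    """Compute a magnitude spectrogram from a 1-D waveform tensor."""
    window = torch.hann_window(n_fft)
    stft = torch.stft(waveform, n_fft=n_fft, hop_length=hop_length,
                      window=window, return_complex=True)
    return stft.abs()  # shape: (n_fft // 2 + 1, num_frames)

# Example: two seconds of each modality (random stand-in waveforms).
audio = torch.randn(2 * SAMPLE_RATES["audio"])
seismic = torch.randn(2 * SAMPLE_RATES["seismic"])
audio_spec = to_spectrogram(audio)
seismic_spec = to_spectrogram(seismic, n_fft=64, hop_length=16)
```

The much lower seismic sampling rate is why a smaller FFT window is used for that modality in this sketch; otherwise a two-second seismic clip would span only a handful of frames.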
- Update the configuration in `src/data_preprocess/MOD/preprocessing_configs.py` (a hypothetical sketch of typical fields follows this list).
- Navigate to the data preprocessing directory: `cd src/data_preprocess/MOD/`
- Execute the sample extraction and data partitioning scripts as detailed below.
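The exact fields in `preprocessing_configs.py` are repository-specific; the following hypothetical sketch only illustrates the kind of values typically edited before extraction. All names and paths here are placeholders.

```python
# Hypothetical sketch of preprocessing settings; the actual field names in
# src/data_preprocess/MOD/preprocessing_configs.py may differ.
RAW_DATA_DIR = "/path/to/raw/MOD"       # where the recorded signals live
OUTPUT_DIR = "/path/to/processed/MOD"   # where extracted samples are written

SEGMENT_SECONDS = 2.0                   # length of each extracted sample
MODALITY_SAMPLE_RATES = {"audio": 8000, "seismic": 100}
```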
- Pretrain Data Extraction: Generate samples for unsupervised pre-training:
  `python extract_pretrain_samples.py`
- Supervised Data Extraction: Generate labeled samples for supervised fine-tuning:
  `python extract_samples.py`
- Data Partitioning: Partition the data into training, validation, and test sets:
  `python partition_data.py`
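For orientation, a minimal, hypothetical version of the partitioning step might look like the following. The actual `partition_data.py` may split by session or sensor node rather than by individual sample to avoid leakage, and its ratios may differ.

```python
import random

def partition(sample_paths, val_frac=0.1, test_frac=0.1, seed=0):
    """Hypothetical index-level train/val/test split (illustrative only)."""
    paths = list(sample_paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for reproducibility
    n_test = int(test_frac * len(paths))
    n_val = int(val_frac * len(paths))
    return {
        "test": paths[:n_test],
        "val": paths[n_test:n_test + n_val],
        "train": paths[n_test + n_val:],
    }
```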
- Supervised Training: Train a given model on a given dataset from scratch:
  `python train.py -model=[MODEL] -dataset=[DATASET]`
- FreqMAE Pre-Training: Pre-train the model using FreqMAE:
  `python train.py -model=[MODEL] -dataset=[DATASET] -learn_framework=FreqMAE`
- Fine-Tuning: After pre-training, fine-tune for a specific task (a conceptual sketch of the weight-loading step follows this list):
  `python train.py -model=[MODEL] -dataset=[DATASET] -learn_framework=FreqMAE -task=[TASK] -stage=finetune -model_weight=[PATH TO MODEL WEIGHT]`
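Conceptually, fine-tuning restores the pre-trained encoder weights and trains a fresh task head on labeled data. The PyTorch sketch below uses stand-in names (`FineTuneModel`, a dummy encoder), not the repository's actual classes; in practice the checkpoint path is the `-model_weight` argument above.

```python
import torch
import torch.nn as nn

class FineTuneModel(nn.Module):
    """Hypothetical wrapper: pre-trained encoder plus a fresh classifier head."""
    def __init__(self, encoder, num_classes, feat_dim):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        return self.head(self.encoder(x))

# Stand-in encoder; the real one would be the TS-T encoder from pre-training.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(129 * 126, 256), nn.ReLU())

# In practice: state = torch.load("[PATH TO MODEL WEIGHT]", map_location="cpu")
state = encoder.state_dict()    # stand-in for a saved checkpoint
encoder.load_state_dict(state)  # restore pre-trained encoder weights
model = FineTuneModel(encoder, num_classes=10, feat_dim=256)
logits = model(torch.randn(4, 1, 129, 126))  # (4, 10)
```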
See `src/data/*.yaml` for model configurations specific to each dataset.
This project is released under the MIT License. See `LICENSE` for details.
Please cite our paper if you use this code or dataset in your research:
```bibtex
@inproceedings{freqmae2024,
  title={FreqMAE: Frequency-Aware Masked Autoencoder for Multi-Modal IoT Sensing Applications},
  author={Kara, Denizhan and Kimura, Tomoyoshi and Liu, Shengzhong and Li, Jinyang and Liu, Dongxin and Wang, Tianshi and Wang, Ruijie and Chen, Yizhuo and Hu, Yigong and Abdelzaher, Tarek},
  booktitle={Proceedings of the ACM Web Conference 2024},
  year={2024}
}
```
For any inquiries regarding the paper or the code, please reach out to us:
- Denizhan Kara: [email protected]
- Tomoyoshi Kimura: [email protected]