This is a PyTorch codebase for the following two papers:
- Caption Alignment for Low Resource Audio-Visual Data
- Multi-Modal and Multi-Lingual Temporal Sentence Localization in Videos

The pipeline consists of the following scripts, listed from preprocessing through training:
- src/scripts/preprocess/shortening_atma.py
- src/scripts/preprocess/video_list_generator.py
- src/scripts/preprocess/Annotation_list_generator.py
- src/scripts/preprocess/generate_audio_files.py
- src/scripts/feature_generation/audio_feature_generation/vgg/gen_audio_set_feats.py
- src/scripts/feature_generation/video_feature_generation/c3d/feature_extractor_vid.py
- src/scripts/preprocess/make_audio_h5py.py
- src/scripts/preprocess/generate_train_test_path.py
- src/scripts/preprocess/parse_annotation_file.py
- src/scripts/preprocess/generate_time_duration_file.py
- src/scripts/preprocess/generate_no_label.py
- src/scripts/preprocess/generate_splitlist.py
- src/scripts/preprocess/generate_h5py.py
- src/scripts/preprocess/generate_h5pyWithCaption.py
- src/scripts/train/main.py
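
The script list above can be driven from a single runner. The following is a minimal sketch, not part of the repository: it assumes the listed order is the intended execution order, that each script runs without command-line arguments, and that it is started from the repository root. The names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script order copied from the list above; assumed to be the execution order.
PIPELINE = [
    "src/scripts/preprocess/shortening_atma.py",
    "src/scripts/preprocess/video_list_generator.py",
    "src/scripts/preprocess/Annotation_list_generator.py",
    "src/scripts/preprocess/generate_audio_files.py",
    "src/scripts/feature_generation/audio_feature_generation/vgg/gen_audio_set_feats.py",
    "src/scripts/feature_generation/video_feature_generation/c3d/feature_extractor_vid.py",
    "src/scripts/preprocess/make_audio_h5py.py",
    "src/scripts/preprocess/generate_train_test_path.py",
    "src/scripts/preprocess/parse_annotation_file.py",
    "src/scripts/preprocess/generate_time_duration_file.py",
    "src/scripts/preprocess/generate_no_label.py",
    "src/scripts/preprocess/generate_splitlist.py",
    "src/scripts/preprocess/generate_h5py.py",
    "src/scripts/preprocess/generate_h5pyWithCaption.py",
    "src/scripts/train/main.py",
]


def build_commands(scripts, python=sys.executable):
    """Return one argv list per script: [python, script]."""
    return [[python, script] for script in scripts]


def run_pipeline(scripts, repo_root=".", dry_run=True):
    """Run the scripts one after another and return the ones processed.

    With dry_run=True the commands are only printed; nothing is executed.
    """
    root = Path(repo_root)
    processed = []
    for cmd in build_commands(scripts):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero,
            # so later stages never run on incomplete inputs.
            subprocess.run(cmd, cwd=root, check=True)
        processed.append(cmd[1])
    return processed


if __name__ == "__main__":
    # Dry run by default: prints the commands without executing them.
    run_pipeline(PIPELINE, dry_run=True)
```

Calling `run_pipeline(PIPELINE, dry_run=False)` would execute every stage in turn. If a script does require arguments, the assumption above does not hold and the command list needs adjusting.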
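
To switch to different audio features, complete the following steps and rerun the listed scripts:
- Generate appropriate audio features from src/scripts/feature_generation/
- Change the AUDIO_FEATURES_TO_USE variable in src/scripts/constants.py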
- src/scripts/preprocess/make_audio_h5py.py
- src/scripts/preprocess/generate_no_label.py
- src/scripts/preprocess/generate_h5py.py
- src/scripts/preprocess/generate_h5pyWithCaption.py
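
Once the new features exist and `AUDIO_FEATURES_TO_USE` has been edited by hand, the four scripts listed above could be rerun with a small helper. This is a hedged sketch only: it assumes the scripts take no arguments and are run from the repository root in the listed order, and `REBUILD_SCRIPTS`, `missing_scripts`, and `rebuild_audio_inputs` are hypothetical names, not part of the codebase.

```python
import subprocess
import sys
from pathlib import Path

# The four scripts listed above, assumed to be rerun in this order after
# AUDIO_FEATURES_TO_USE has been changed in src/scripts/constants.py.
REBUILD_SCRIPTS = [
    "src/scripts/preprocess/make_audio_h5py.py",
    "src/scripts/preprocess/generate_no_label.py",
    "src/scripts/preprocess/generate_h5py.py",
    "src/scripts/preprocess/generate_h5pyWithCaption.py",
]


def missing_scripts(scripts, repo_root="."):
    """Return the scripts that do not exist under repo_root."""
    root = Path(repo_root)
    return [s for s in scripts if not (root / s).is_file()]


def rebuild_audio_inputs(repo_root="."):
    """Rerun the dependent preprocessing scripts, stopping at the first error."""
    missing = missing_scripts(REBUILD_SCRIPTS, repo_root)
    if missing:
        # Fail before running anything so the HDF5 files are not half rebuilt.
        raise FileNotFoundError(f"scripts not found: {missing}")
    for script in REBUILD_SCRIPTS:
        subprocess.run([sys.executable, script], cwd=repo_root, check=True)
```

Checking for missing files first is a design choice of this sketch: it avoids regenerating some files with the new features while others still hold the old ones.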

All parameters related to preprocessing and feature generation can be configured in:
- src/scripts/constants.py
- src/scripts/paths.py

All parameters related to training can be configured in:
- src/scripts/train/config.py