he-shuwei / diff-bgm

This project forked from sizhelee/diff-bgm

official code for CVPR'24 paper Diff-BGM

Diff-BGM: A Diffusion Model for Video Background Music Generation

Official implementation for CVPR 2024 paper: Diff-BGM: A Diffusion Model for Video Background Music Generation

By Sizhe Li, Yiming Qin, Minghang Zheng, Xin Jin, Yang Liu.

1. Installation

pip install -r requirements.txt
pip install -e diffbgm
pip install -e diffbgm/mir_eval

2. Training

Preparations

  1. The extracted features of the dataset POP909 can be accessed here. Please put them under /data/ after extraction.

  2. The extracted features of the dataset BGM909 can be accessed here. Please put them under /data/bgm909/ after extraction. We use VideoCLIP to extract the video features, BLIP to generate the video captions, Bert-base-uncased as the language encoder, and TransNetV2 to detect shot boundaries.
    We also provide the original captions here.

  3. The pre-trained models needed for training can be accessed here. Please put them under /pretrained/ after extraction.
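
Before starting training, the directory layout described above can be checked with a short script. This is a minimal sketch, not part of the repository: it assumes that /data/, /data/bgm909/ and /pretrained/ are relative to the repository root, and the helper name `missing_dirs` is hypothetical.

```python
from pathlib import Path

# Directories named in the preparation steps.
# Assumption: they are relative to the repository root.
EXPECTED_DIRS = ("data", "data/bgm909", "pretrained")


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run it from the repository root; an empty result means all three directories exist. It does not verify the contents of the extracted archives.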

Commands

python diffbgm/main.py --model ldm_chd8bar --output_dir [output_dir]

3. Inference

Please use the following command to generate music for videos in BGM909.

python diffbgm/inference_sdf.py --model_dir=[model_dir] --uncond_scale=5.
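
The `--uncond_scale` flag reads like a classifier-free guidance scale, although this README does not say so. Assuming that interpretation, a guided noise estimate is commonly formed by extrapolating from the unconditional prediction toward the conditional one. The sketch below shows that standard formula on plain Python lists; the function name `guided_noise` is hypothetical, and this is not the repository's implementation.

```python
def guided_noise(eps_cond, eps_uncond, uncond_scale=5.0):
    """Classifier-free guidance: eps_uncond + s * (eps_cond - eps_uncond).

    eps_cond and eps_uncond are equal-length sequences of floats standing in
    for the conditional and unconditional noise predictions (assumed inputs).
    """
    return [u + uncond_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]
```

Under this formula a scale of 1 returns the conditional prediction unchanged, a scale of 0 returns the unconditional one, and larger values push the estimate further toward the conditioning signal.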

4. Test

To reproduce the metrics in our original paper, please refer to /diffbgm/test.ipynb.

Backbone                 PCHE    GPS     SI      P@20    Weights
Diff-BGM (original)      2.840   0.601   0.521   44.10   weights
Diff-BGM (only visual)   2.835   0.514   0.396   43.20   weights
Diff-BGM (w/o SAC-Att)   2.721   0.789   0.523   38.47   weights

We provide our generation results here.

5. Make a Demo by yourself!

After generating a piece of music, you can use the following commands to generate a video.

sudo apt-get install ffmpeg fluidsynth
fluidsynth -i <SoundFont file> <midi file> -F <wav file>
ffmpeg -i <wav file> -b:a <bit rate> <mp3 file>
ffmpeg -i <video file> -i <mp3 file> -c:a aac -map 0:v:0 -map 1:a:0 <output file>
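
The three conversion steps above can also be chained from Python. This is a minimal sketch that mirrors the commands as written: the function names, the intermediate file names `music.wav` and `music.mp3`, and the default bit rate of `192k` are assumptions chosen for illustration, not values from the repository.

```python
import subprocess


def build_demo_commands(soundfont, midi, video, output, bit_rate="192k",
                        wav="music.wav", mp3="music.mp3"):
    """Return the three argument lists for MIDI -> WAV -> MP3 -> muxed video."""
    return [
        # Render the MIDI file to WAV with the given SoundFont.
        ["fluidsynth", "-i", soundfont, midi, "-F", wav],
        # Encode the WAV as MP3 at the requested audio bit rate.
        ["ffmpeg", "-i", wav, "-b:a", bit_rate, mp3],
        # Take the video stream from the first input and the audio from the second.
        ["ffmpeg", "-i", video, "-i", mp3, "-c:a", "aac",
         "-map", "0:v:0", "-map", "1:a:0", output],
    ]


def make_demo(soundfont, midi, video, output, **kwargs):
    """Run the commands in order; check=True stops at the first failure."""
    for cmd in build_demo_commands(soundfont, midi, video, output, **kwargs):
        subprocess.run(cmd, check=True)
```

Calling `make_demo` requires ffmpeg and fluidsynth to be installed; `build_demo_commands` only assembles the argument lists, so it can be inspected without running anything.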

See our demo!
