baiiiiiiiiii / codec-superb

This project forked from voidful/codec-superb

Audio Codec Speech processing Universal PERformance Benchmark

Home Page: https://codecsuperb.com

Codec-SUPERB: Sound Codec Speech Processing Universal Performance Benchmark

Overview

Codec-SUPERB is a comprehensive benchmark for evaluating audio codec models across a variety of speech tasks. Our goal is to facilitate community collaboration and accelerate progress in speech processing, with a focus on preserving and enhancing the quality of speech information.

Introduction

Codec-SUPERB sets a new benchmark in evaluating sound codec models, providing a rigorous and transparent framework for assessing performance across a range of speech processing tasks. Our goal is to foster innovation and set new standards in audio quality and processing efficiency.

Key Features

Out-of-the-Box Codec Interface

Codec-SUPERB offers an intuitive, out-of-the-box codec interface that allows for easy integration and testing of various codec models, facilitating quick iterations and experiments.

Multi-Perspective Leaderboard

Codec-SUPERB's unique blend of multi-perspective evaluation and an online leaderboard drives innovation in sound codec research by providing a comprehensive assessment and fostering competitive transparency among developers.

Standardized Environment

We ensure a standardized testing environment to guarantee fair and consistent comparison across all models. This uniformity brings reliability to benchmark results, making them universally interpretable.

Unified Datasets

We provide a collection of unified datasets, curated to test a wide range of speech processing scenarios. This ensures that models are evaluated under diverse conditions, reflecting real-world applications.

Installation

git clone https://github.com/voidful/Codec-SUPERB.git
cd Codec-SUPERB
pip install -r requirements.txt
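
After installation, a quick check that the package can be located may save a confusing failure later. The sketch below assumes only that the repository exposes a top-level module named SoundCodec, as the import in the usage example suggests, and that it is run from the repository root. The helper name is_importable is hypothetical and not part of Codec-SUPERB.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# 'SoundCodec' and 'torchaudio' are the modules imported in the usage example.
for name in ('SoundCodec', 'torchaudio'):
    status = 'found' if is_importable(name) else 'missing'
    print(f'{name}: {status}')
```

A module reported as missing usually means the requirements were installed into a different environment or the script was started outside the repository directory.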

Usage

Out-of-the-Box Codec Interface

from SoundCodec import codec
import torchaudio

# list the names of all available codecs
print(codec.list_codec())
# load a codec by name, using EnCodec as an example
encodec_24k_6bps = codec.load_codec('encodec_24k_6bps')

# load audio; replace 'sample audio' with the path to your audio file
waveform, sample_rate = torchaudio.load('sample audio')
# keep the last channel as a 1-D NumPy array (no resampling is done here)
resampled_waveform = waveform.numpy()[-1]
data_item = {'audio': {'array': resampled_waveform,
                       'sampling_rate': sample_rate}}

# encode the audio into codec units
sound_unit = encodec_24k_6bps.extract_unit(data_item).unit

# synthesize a waveform from the units without saving it to disk
decoded_waveform = encodec_24k_6bps.synth(sound_unit, local_save=False)['audio']['array']
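
The usage example passes audio to the codec as a dictionary holding an 'array' and a 'sampling_rate'. The following is a minimal sketch, not part of Codec-SUPERB, of how such an item might be built from a multi-channel array and how a decoded waveform could be compared with the original. The helper names make_data_item and snr_db, the mean downmix, and the choice of signal-to-noise ratio are all assumptions made for illustration; the benchmark's own metrics may differ. A scaled copy of the input stands in for real codec output so the block runs without any codec installed.

```python
import numpy as np


def make_data_item(waveform: np.ndarray, sampling_rate: int) -> dict:
    """Build the {'audio': {...}} dictionary used in the usage example.

    Hypothetical helper: accepts a (channels, samples) or (samples,) array
    and downmixes multi-channel audio to mono by averaging the channels.
    """
    waveform = np.asarray(waveform, dtype=np.float32)
    if waveform.ndim == 2:
        waveform = waveform.mean(axis=0)
    return {'audio': {'array': waveform, 'sampling_rate': int(sampling_rate)}}


def snr_db(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Signal-to-noise ratio in dB over the overlapping samples.

    Illustrative metric only. Codecs may pad or trim their output, so both
    signals are truncated to the shorter length before comparison.
    """
    n = min(len(reference), len(estimate))
    ref = np.asarray(reference[:n], dtype=np.float64)
    est = np.asarray(estimate[:n], dtype=np.float64)
    noise_power = np.sum((ref - est) ** 2)
    signal_power = np.sum(ref ** 2)
    if noise_power == 0.0:
        return float('inf')
    return float(10.0 * np.log10(signal_power / noise_power))


# Stand-in data: one second of a 440 Hz tone in two identical channels.
sr = 24000
t = np.arange(sr) / sr
stereo = np.stack([np.sin(2 * np.pi * 440 * t)] * 2)
item = make_data_item(stereo, sr)

# A fake "decoded" signal (slightly attenuated) stands in for codec output.
decoded = 0.9 * item['audio']['array']
print(f"SNR: {snr_db(item['audio']['array'], decoded):.2f} dB")
```

With a real codec, decoded would instead be the array returned by synth, and the sampling rates of the two signals would need to match before any sample-wise comparison.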

Citation

If you use this code or these results in your paper, please cite our work as:

@misc{wu2024codecsuperb,
      title={Codec-SUPERB: An In-Depth Analysis of Sound Codec Models}, 
      author={Haibin Wu and Ho-Lam Chung and Yi-Cheng Lin and Yuan-Kuei Wu and Xuanjun Chen and Yu-Chi Pai and Hsiu-Hsuan Wang and Kai-Wei Chang and Alexander H. Liu and Hung-yi Lee},
      year={2024},
      eprint={2402.13071},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
@article{wu2024towards,
  title={Towards audio language modeling-an overview},
  author={Wu, Haibin and Chen, Xuanjun and Lin, Yi-Cheng and Chang, Kai-wei and Chung, Ho-Lam and Liu, Alexander H and Lee, Hung-yi},
  journal={arXiv preprint arXiv:2402.13236},
  year={2024}
}

Contribution

Contributions are highly encouraged, whether it's through adding new codec models, expanding the dataset collection, or enhancing the benchmarking framework. Please see CONTRIBUTING.md for more details.

License

This project is licensed under the MIT License - see the LICENSE file for details.
