GithubHelp home page GithubHelp logo

Hi there 👋

Rongjie Huang (黄融杰) did my Graduate study at College of Computer Science and Software, Zhejiang University, supervised by Prof. Zhou Zhao. I also obtained Bachelor’s degree at Zhejiang University. During my graduate study, I was lucky to collaborate with the CMU Speech Team led by Prof. Shinji Watanabe, and Audio Research Team at Zhejiang University. I was grateful to intern or collaborate at TikTok, Shanghai AI Lab (OpenGV Lab), Tencent Seattle Lab, Alibaba Damo Academic, with Yi Ren, Jinglin Liu, Chunlei Zhang and Dong Yu.

My research interest includes Multi-Modal Generative AI, Multi-Modal Language Processing, and AI4Science. I have published first-author papers at the top international AI conferences such as NeurIPS/ICLR/ICML/ACL/IJCAI.

I am actively looking for academic collaboration, feel free to drop me an email.

📎 Homepages

💻 Selected Research Papers

Generative AI for Speech, Sing, and Audio: Spoken Large Language Model, Text-to-Audio Synthesis, Text-to-Speech Synthesis, Singing Voice Synthesis

Audio-Visual Language Processing: Audio-Visual Speech-to-Speech Translation, Self-Supervised Learning

My full paper list is shown at my personal homepage.

Spoken Large Language Model

Text-to-Speech Synthesis

Text-to-Audio Synthesis

Audio-Visual Language Processing

Singing Voice Synthesis

rongjiehuang's Projects

2019_algorithm_intern_information icon 2019_algorithm_intern_information

2020年的算法实习岗位/校招公司信息表,部分包括内推码,和常见深度学习算法岗面试题及答案,暑期计算机视觉实习面经和总结

955.wlb icon 955.wlb

955 不加班的公司名单 - 工作 955,work–life balance (工作与生活的平衡)

996.icu icon 996.icu

Repo for counting stars and contributing. Press F to pay respect to glorious developers.

awesome-courses icon awesome-courses

:books: List of awesome university courses for learning Computer Science!

fairseq icon fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

fastdiff icon fastdiff

PyTorch Implementation of FastDiff (IJCAI'22)

generspeech icon generspeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

leetcodeanimation icon leetcodeanimation

Demonstrate all the questions on LeetCode in the form of animation.(用动画的形式呈现解LeetCode题目的思路)

multiband-wavernn icon multiband-wavernn

An unofficial implement of autoregressive vocoder Multiband-WaveRNN. Audio samples in https://rongjiehuang.github.io/Multiband-WaveRNN/

paddlespeech icon paddlespeech

An Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.

prodiff icon prodiff

PyTorch Implementation of ProDiff (ACM-MM'22) with a Extremely-Fast diffusion speech synthesis pipeline

pumpkin-book icon pumpkin-book

《机器学习》(西瓜书)公式推导解析,在线阅读地址:https://datawhalechina.github.io/pumpkin-book

singgan icon singgan

Project page for SingGAN (ACM-MM' 2022): Generative Adversarial Network For High-Fidelity Singing Voice Generation

transpeech icon transpeech

PyTorch Implementation of TranSpeech (ICLR'23): Textless NAR Speech-to-Speech Translation with Bilateral Perturbation

waterco icon waterco

基于Mask R-CNN的水下垃圾检测

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.