The distant_conversational_asr_and_analysis from geronimo03

[ICASSP 2021 Tutorial] Distant conversational speech recognition and analysis: Recent advances, and trends towards end-to-end optimization

This repository contains the materials used for the ICASSP 2021 Tutorial T9, "Distant conversational speech recognition and analysis: Recent advances, and trends towards end-to-end optimization", presented by Keisuke Kinoshita, Yusuke Fujita, Naoyuki Kanda, and Shinji Watanabe.

Slides

PDF slides (latest)

Abstract

Recognizing unsegmented conversational speech recorded with distant microphone(s) is a challenging but essential task that must be solved to unlock a myriad of new speech applications, such as a communication agent that can understand, respond to, and facilitate our conversations. This task comprises a number of subtasks that have been studied rather independently for a decade, such as multichannel/single-channel source separation, speaker diarization with source number counting, and conversational speech recognition. This tutorial first revisits, with demonstrations, current state-of-the-art systems for this task, which were developed for challenges such as the CHiME-5 and CHiME-6 challenges and for commercial products. These systems typically consist of a combination of well-established, independently optimized modules. Although they are carefully designed to consolidate these independent modules, there is still considerable room for improvement. In the latter part of the tutorial, we introduce a recent research trend that aims to establish an optimal joint neural system that solves those subtasks together, through end-to-end optimization based on a common integrated objective. By showing the potential of such jointly optimized systems, which are now starting to outperform previous top-performing systems in many tasks, we discuss the future directions and challenges for this task from both industry and academic perspectives.

Tutorial Presenters

Keisuke Kinoshita (NTT Corporation, Japan)
Yusuke Fujita (Line Corporation, Japan)
Naoyuki Kanda (Microsoft, USA)
Shinji Watanabe (Carnegie Mellon University, USA)

Changelog

1.0.0 / 2021-06-07

First public release
