AmericasNLP 2022

Important Dates

  • Submission deadline for ASR and Speech-to-text translation tasks: October 14, 2022
  • Submission deadline for machine translation task: October 25, 2022
  • Results announcement: October 29, 2022

Submission

The official submission leaderboards can be found at the following links:

Languages

| Code | Language | Translation Pair |
|------|----------|------------------|
| bzd | Bribri | Spanish |
| gn | Guaraní | Spanish |
| gvc | Kotiria | Portuguese |
| tav | Wa'ikhana | Portuguese |
| quy | Quechua | Spanish |

Data

Test files for the ASR task are available here.

Downloading

The data for the competition can be found here. Alternatively, the provided download script fetches the data for all languages automatically. It takes a single argument, the destination folder for the downloaded data:

./download_data.sh destination_folder

Data format

Each language folder contains two subfolders, each corresponding to a different training split. In each subfolder, there are multiple audio files, and a single tsv file containing all transcriptions and translations. Audio files are split such that each file contains a single sentence or utterance. The tsv file is structured as follows:

| Header | Content |
|--------|---------|
| wav | The corresponding audio filename. |
| source_processed | A processed version of the audio transcription. |
| source_raw | The original raw transcript. Please use this column for training and evaluation, and ignore the source_processed column. |
| target_raw | The translation of the transcription into either Spanish or Portuguese. |
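As an illustration of the layout described above, here is a minimal Python sketch for reading one split folder. It is not part of the official tooling. The function name `load_split` and the output keys are hypothetical, and the sketch assumes that the tsv file has a header row with the four column names listed above, that it is UTF-8 encoded, and that the audio files sit in the same subfolder as the tsv file.

```python
import csv
from pathlib import Path


def load_split(split_dir):
    """Return one dict per utterance found in a single split folder."""
    split_dir = Path(split_dir)

    # Assumption: each split folder holds exactly one .tsv file.
    tsv_files = sorted(split_dir.glob("*.tsv"))
    if len(tsv_files) != 1:
        raise ValueError(
            f"expected one tsv file in {split_dir}, found {len(tsv_files)}"
        )

    examples = []
    with tsv_files[0].open(encoding="utf-8", newline="") as handle:
        # QUOTE_NONE treats quotation marks in transcripts as literal text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            examples.append(
                {
                    # Assumption: the wav column is a filename relative to the split folder.
                    "audio_path": split_dir / row["wav"],
                    # source_raw is used; source_processed is ignored, as requested above.
                    "transcript": row["source_raw"],
                    "translation": row["target_raw"],
                }
            )
    return examples
```

Calling `load_split` on one split subfolder inside the download destination returns a list that can be fed to an ASR or speech translation pipeline. The actual folder and file names depend on the downloaded data and are not assumed here.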

Baselines

ASR Baseline

The baseline model for the ASR task is implemented in ESPnet. The scripts to run the model can be found in the following directory of the ESPnet repository.
