Two Stacks Are Better Than One: A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives

Overview

This repository contains the code and data for the paper "Two Stacks Are Better Than One: A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives". The paper investigates the impact of different pretraining objectives on multilingual language models and compares their performance on various downstream tasks.

Abstract

Pretrained language models (PLMs) have shown impressive performance and garnered significant attention in the NLP community. This paper compares multilingual pretraining objectives under controlled conditions, focusing on two main observations: (1) the architecture dictates the optimal pretraining objective, and (2) multilingual translation can be a highly effective pretraining objective.

Features

  • Controlled Evaluation: Ensures comparability by using consistent training data and model architectures.
  • Multilingual Focus: Evaluates performance across six languages.
  • Pretraining Models: Compares a BART architecture trained with a machine translation objective (2-MT), a BART architecture trained with a denoising objective (2-LM), masked language modeling (MLM), causal language modeling (CLM), and translation language modeling (TLM); a toy sketch of these objectives follows this list.
  • Downstream Tasks: Includes sentiment analysis (SA), named entity recognition (NER), and part-of-speech (POS) tagging.
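
As a purely illustrative sketch (not code from this repository), the Python snippet below shows how these objectives frame one hypothetical sentence pair as (input, target) training examples; the token strings and the <mask> placeholder are invented for illustration.

# Toy illustration (not from this repository) of how the compared
# objectives frame the same hypothetical sentence pair as training data.
src = "the cat sat"      # hypothetical source-language sentence
tgt = "die Katze sass"   # hypothetical target-language sentence

# CLM: a decoder-only model predicts each token from its left context,
# so the target is the input shifted by one position.
clm_example = (src, src)

# MLM: an encoder reconstructs masked tokens from bidirectional context.
mlm_example = ("the <mask> sat", "cat")

# TLM: MLM over the concatenation of a parallel sentence pair, letting
# the model attend across languages when filling the masks.
tlm_example = ("the <mask> sat </s> die Katze <mask>", "cat sass")

# Denoising (BART-style, as in 2-LM): the encoder reads a corrupted
# sentence and the decoder reconstructs the original.
lm_example = ("the <mask> sat", src)

# MT (as in 2-MT): the encoder reads the source sentence and the
# decoder generates the full translation.
mt_example = (src, tgt)

for name, ex in [("CLM", clm_example), ("MLM", mlm_example),
                 ("TLM", tlm_example), ("2-LM", lm_example),
                 ("2-MT", mt_example)]:
    print(f"{name}: input={ex[0]!r} -> target={ex[1]!r}")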

Usage

This repo has Git LFS enabled, as we also host fairseq model weights (./model_weights) here. You can skip downloading those LFS files by cloning with:
GIT_LFS_SKIP_SMUDGE=1 git clone git@github.com:Helsinki-NLP/lm-vs-mt.git
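
If you skipped the weights at clone time, you can fetch them later with Git LFS; the path pattern below is an assumption based on the ./model_weights directory named above:
git lfs pull --include="model_weights/*"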

Citation

@misc{li2024stacksbetteronecomparison,
      title={Two Stacks Are Better Than One: A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives}, 
      author={Zihao Li and Shaoxiong Ji and Timothee Mickus and Vincent Segonne and Jörg Tiedemann},
      year={2024},
      eprint={2407.15489},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.15489}, 
}

