
SLED

The official repository for Efficient Long-Text Understanding with Short-Text Models (Ivgi et al., 2022).

SLED models use pretrained, short-range encoder-decoder models and apply them to long-text inputs by splitting the input into multiple overlapping chunks, encoding each chunk independently, and performing fusion-in-decoder.
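The chunking scheme can be illustrated with a short sketch (an illustration only, not the library's internal implementation; the chunk and overlap sizes are placeholders):

def split_into_chunks(token_ids, chunk_size=256, overlap=128):
    # Slide a window of chunk_size tokens over the input, stepping by
    # chunk_size - overlap so that consecutive chunks share `overlap` tokens.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(token_ids) - overlap, 1), step):
        chunks.append(token_ids[start:start + chunk_size])
    return chunks

# e.g. a 600-token input yields four overlapping windows covering the whole sequence
chunks = split_into_chunks(list(range(600)))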

Data

The data for this paper is hosted on the dataset hub here. It is based on the SCROLLS dataset (paper), the SQuAD 1.1 dataset (paper) and the HotpotQA dataset (paper). It doesn't contain any unpublished data, but includes the configuration needed for the paper.

Usage example:

from datasets import load_dataset
qasper = load_dataset("tau/sled", "qasper")  # load the Qasper configuration of the tau/sled dataset
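The returned object behaves like any other datasets dataset, so its splits and fields can be inspected directly (the split name below is an assumption; printing the dataset lists the actual ones):

print(qasper)              # shows the available splits and their columns
print(qasper["train"][0])  # first example of the train split, if one exists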

Installation

Make sure to install PyTorch according to your machine's spec. See installation options here.

Installing SLED is easy with pip.

pip install py-sled

Some backbone models require additional dependencies. If you wish to work with T5, for example, you can install them with

pip install py-sled[t5]

If you wish to run the examples, install the required dependencies with

pip install py-sled[examples]

If you wish to continue developing this repository, install the full development requirements with

pip install py-sled[dev]

Usage

SLED integrates seamlessly with HuggingFace's Transformers AutoClasses.

A minimal usage example:

import sled  # ** required so that SLED is properly registered by the AutoClasses **
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
model = AutoModel.from_pretrained('tau/bart-base-sled')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state

Important: You need to import sled before using the AutoClass (e.g. AutoModel.from_pretrained('tau/bart-base-sled')) for it to work.

A minimal working example can be found here.
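For conditional generation, the same pattern applies with the seq2seq AutoClass (a sketch only; the input text and generation settings are placeholders, not recommendations from the paper):

import sled  # registers SLED with the AutoClasses
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
model = AutoModelForSeq2SeqLM.from_pretrained('tau/bart-base-sled')
inputs = tokenizer("Some long document to summarize ...", return_tensors="pt")  # placeholder input
generated_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))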

To work with SCROLLS-like data, as used in the paper, see here.

Custom datasets

For SLED to prepend the prefix input (e.g. a question) to every chunk, it requires an additional input tensor, prefix_length. If you are using a custom dataset, refer to run.py for the correct way to preprocess the data.
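A minimal sketch of such preprocessing (the field names and shapes here are assumptions for illustration; run.py remains the authoritative reference):

import torch
import sled  # registers SLED with the AutoClasses
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
question = "What dataset was used?"    # the prefix to prepend to every chunk
document = "A very long document ..."  # placeholder for the long input
# Tokenize the full input and pass the prefix length as an extra tensor
prefix_ids = tokenizer(question, add_special_tokens=False).input_ids
inputs = tokenizer(question + " " + document, return_tensors="pt")
inputs["prefix_length"] = torch.tensor([len(prefix_ids)])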

Note: Currently, HF's Seq2SeqTrainer doesn't pass the prefix_length tensor in the prediction loop, so you should use the CustomSeq2SeqTrainer or something similar until it is fixed.

Backbone models

There are multiple model cards available on the HuggingFace Hub, including tau/bart-base-sled used above.

If you wish to use a custom model that is available as a model card (public or private) on the hub, or to use different parameters for SLED, you can create a json config file like the one below and change underlying_config to your custom model card.

{
  "model_type": "tau/sled",
  "underlying_config": "facebook/bart-base",
  "context_size": 256,
  "window_fraction": 0.5,
  "prepend_prefix": true,
  "encode_prefix": true,
  "sliding_method": "dynamic"
}

You can then load it as shown below:

import sled
from transformers import AutoModelForSeq2SeqLM
custom_sled_model = AutoModelForSeq2SeqLM.from_pretrained(<your custom json config>)

Citation

If you use this repository, please cite as below:

@inproceedings{Ivgi2022EfficientLU,
  title={Efficient Long-Text Understanding with Short-Text Models},
  author={Maor Ivgi and Uri Shaham and Jonathan Berant},
  year={2022}
}

Disclaimer

This repository is still under active development and may contain some unintended behavior. Please open an issue if any unexpected behavior occurs, and we will promptly try to fix it.
