
This project was forked from kwchurch/gft.


General Fine-Tuning: A little language for Deep Nets (ACL-2022 Tutorial)

License: Apache License 2.0



gft (general fine-tuning): A Little Language for Deepnets

1-line programs for fine-tuning, inference and more

Quick Links

  1. Papers
    1. ACL-2022 Tutorial
    2. JNLE
  2. Videos 📽️
    1. 📽️ 10 minute TEASER
    2. 🆕📽️ First half of ACL-2022 Tutorial (1 hour 16 minutes) UNABRIDGED
  3. Installation
  4. Documentation
  5. ACL-2022 Tutorial

Four Functions and Four Arguments

gft contains 4 main functions:

  1. gft_fit: fit a pretrained model to data (aka fine-tuning)
  2. gft_predict: apply a model to inputs (aka inference)
  3. gft_eval: score a model on a split of a dataset
  4. gft_summary: find good stuff (popular models and datasets) and explain what's in them

These gft functions share 4 main arguments (most arguments supported by the underlying hubs are also accepted):

  1. data: standard datasets hosted on hubs such as HuggingFace or PaddleNLP, or custom datasets on the local filesystem
  2. model: standard models hosted on hubs such as HuggingFace or PaddleNLP, or custom models on the local filesystem
  3. equation: a string such as "classify: label ~ text", where classify is a task, and label and text refer to columns in a dataset (see the sketch after this list)
  4. task: classify, classify_tokens, classify_spans, classify_audio, classify_images, regress, text-generation, translation, ASR, fill-mask
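
For example, an equation can name more than one input column on its right-hand side. The sketch below is illustrative rather than documented usage: it assumes gft accepts the GLUE STS-B dataset as H:glue,stsb (columns label, sentence1, sentence2), and it uses only flags that appear elsewhere in this README.

# hypothetical: fit a regression over two input columns (STS-B)
gft_fit --eqn 'regress: label ~ sentence1 + sentence2' \
	--model H:bert-base-cased \
	--data H:glue,stsb \
	--output_dir $outdir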

A Few Simple Examples

Here are some simple examples:

emodel=H:bhadresh-savani/roberta-base-emotion

# Summarize a dataset and/or model
gft_summary --data H:emotion
gft_summary --model $emodel
gft_summary --data H:emotion --model $emodel

# find some popular datasets and models that contain "emotion"
gft_summary --data H:__contains__emotion --topn 5
gft_summary --model H:__contains__emotion --topn 5

# make predictions on inputs from stdin
echo 'I love you.' | gft_predict --task classify

# The default model (for the classification task) performs sentiment analysis
# The model, $emodel, outputs emotion classes (as opposed to POSITIVE/NEGATIVE)
echo 'I love you.' | gft_predict --task classify --model $emodel

# some other tasks (beyond classification)
echo 'I love New York.' | gft_predict --task H:token-classification
echo 'I <mask> you.' | gft_predict --task H:fill-mask

# make predictions on inputs from a split of a standard dataset
gft_predict --eqn 'classify: label ~ text' --model $emodel --data H:emotion --split test

# return a single score (as opposed to a prediction for each input)
gft_eval --eqn 'classify: label ~ text' --model $emodel --data H:emotion --split test

# Input a pre-trained model (bert) and output a post-trained model
gft_fit --eqn 'classify: label ~ text' \
	--model H:bert-base-cased \
	--data H:emotion \
	--output_dir $outdir
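
Because these are ordinary shell commands, they compose with ordinary shell constructs. A small sketch, assuming H:emotion follows the usual HuggingFace split names (train, validation, test):

# score the same model on every split of the dataset
for split in train validation test; do
    gft_eval --eqn 'classify: label ~ text' --model $emodel --data H:emotion --split $split
done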

Pre-Training, Fine-Tuning and Inference

The table below shows a 3-step recipe, which has become standard in the literature on deep nets.

Step   gft Support    Description     Time              Hardware
1                     Pre-Training    Days/Weeks        Large GPU Cluster
2      gft_fit        Fine-Tuning     Hours/Days        1+ GPUs
3      gft_predict    Inference       Seconds/Minutes   0+ GPUs

This repo provides support for step 2 (gft_fit) and step 3 (gft_predict). Most gft_fit and gft_predict programs are short (1 line), much shorter than examples such as these, which typically run to a few hundred lines of Python. With gft, users should not need to read or modify any Python code for steps 2 and 3 in the table above.
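
Putting steps 2 and 3 together, fine-tune-then-predict is two lines of shell. One caveat: the sketch assumes gft_predict accepts the checkpoint directory written by gft_fit as its --model argument; consult the documentation for the exact syntax for local models.

# step 2 (fine-tune), then step 3 (inference with the fine-tuned checkpoint)
gft_fit --eqn 'classify: label ~ text' --model H:bert-base-cased --data H:emotion --output_dir $outdir
echo 'I love you.' | gft_predict --task classify --model $outdir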

Step 1, pre-training, is beyond the scope of this work. We recommend starting with models from the HuggingFace and PaddleHub/PaddleNLP hubs, as illustrated in the examples above.

Citations, Documentation, etc.

To cite gft, please use the ACL-2022 tutorial paper and the JNLE article below.

@inproceedings{church-etal-2022-gentle,
    title = "A Gentle Introduction to Deep Nets and Opportunities for the Future",
    author = "Church, Kenneth  and
      Kordoni, Valia  and
      Marcus, Gary  and
      Davis, Ernest  and
      Ma, Yanjun  and
      Chen, Zeyu",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-tutorials.1",
    pages = "1--6",
    abstract = "The first half of this tutorial will make deep nets more accessible to a broader audience, following {``}Deep Nets for Poets{''} and {``}A Gentle Introduction to Fine-Tuning.{''} We will also introduce GFT (general fine tuning), a little language for fine tuning deep nets with short (one line) programs that are as easy to code as regression in statistics packages such as R using glm (general linear models). Based on the success of these methods on a number of benchmarks, one might come away with the impression that deep nets are all we need. However, we believe the glass is half-full: while there is much that can be done with deep nets, there is always more to do. The second half of this tutorial will discuss some of these opportunities.",
}

@article{church-etal-2022-gft,
    title = "Emerging trends: General fine-tuning (gft)",
    author = "Church, Kenneth and Cai, Xingyu and Ying, Yibiao and Chen, Zeyu and Xun, Guangxu and Bian, Yuchen",
    journal = "Natural Language Engineering",
    publisher = "Cambridge University Press",
    doi = "10.1017/S1351324922000237",
    year = "2022",
    pages = "1--17",
}

