ckobus / farm

This project forked from deepset-ai/farm

Fast & easy transfer learning for NLP. Harvesting language models for the industry.

Home Page: https://farm.deepset.ai

License: Apache License 2.0

Python 92.47% Dockerfile 0.09% Jupyter Notebook 7.44%

farm's Introduction

(Framework for Adapting Representation Models)

What is it?

FARM makes cutting-edge transfer learning for NLP simple. Building upon transformers, FARM is a home for all species of pretrained language models (e.g. BERT) that can be adapted to different domain languages or downstream tasks. With FARM you can easily create SOTA NLP models for tasks like document classification, NER or question answering. The standardized interfaces for language models and prediction heads allow flexible extension by researchers and easy application for practitioners. Additional experiment tracking and visualizations support you as you adapt a SOTA model to your own NLP problem and build a fast proof of concept.

Core features

  • Easy adaptation of language models (e.g. BERT) to your own use case
  • Fast integration of custom datasets via Processor class
  • Modular design of language model and prediction heads
  • Switch between heads or just combine them for multitask learning
  • Smooth upgrading to new language models
  • Powerful experiment tracking & execution
  • Simple deployment and visualization to showcase your model

Task                        BERT  RoBERTa  XLNet  ALBERT
Text classification         x     x        x      x
NER                         x     x        x      x
Question Answering          x     x        x      x
Language Model Fine-tuning  x
Text Regression             x     x        x      x
Multilabel Text classif.    x     x        x      x

Demo

NEW: Check out https://demos.deepset.ai to play around with some models

Resources

Installation

Recommended (because of active development):

git clone https://github.com/deepset-ai/FARM.git
cd FARM
pip install -r requirements.txt
pip install --editable .

If problems occur, please run git pull to get the latest changes. Thanks to the --editable flag, updated source files take effect immediately without reinstalling.

From PyPI:

pip install farm

Basic Usage

1. Train a downstream model

FARM offers two modes for model training:

Option 1: Run experiment(s) from config

Use cases: Training your first model, hyperparameter optimization, evaluating a language model on multiple downstream tasks.
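
To illustrate how a single config can describe several experiments (for example, for hyperparameter optimization), here is a minimal sketch in plain Python. It does not use FARM's actual config schema or API: the config keys and the helper expand_experiments are hypothetical, and the assumption is that any list-valued entry is treated as a set of alternatives that gets expanded into one run per combination.

```python
from itertools import product


def expand_experiments(config):
    """Expand list-valued entries of a config dict into one dict per combination.

    Hypothetical helper for illustration only; not part of FARM.
    """
    # Separate fixed settings from settings that list several alternatives.
    fixed = {k: v for k, v in config.items() if not isinstance(v, list)}
    varied = {k: v for k, v in config.items() if isinstance(v, list)}

    if not varied:
        return [dict(fixed)]

    keys = list(varied)
    experiments = []
    # Cartesian product over all alternatives: one experiment per combination.
    for values in product(*(varied[k] for k in keys)):
        experiment = dict(fixed)
        experiment.update(zip(keys, values))
        experiments.append(experiment)
    return experiments


# Hypothetical config: two learning rates x two batch sizes = four runs.
config = {
    "task": "text_classification",
    "language_model": "bert-base-cased",
    "learning_rate": [2e-5, 5e-5],
    "batch_size": [16, 32],
}

experiments = expand_experiments(config)
for number, experiment in enumerate(experiments, start=1):
    print(number, experiment)
```

Each resulting dict is a complete, flat description of one run, which is what makes a config-driven mode convenient for sweeping hyperparameters or comparing tasks.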

Option 2: Stick together your own building blocks

Use cases: Custom datasets, language models, prediction heads ...

Metrics and parameters of your model training get automatically logged via MLflow. We provide a public MLflow server for testing and learning purposes. Check it out to see your own experiment results! Just be aware: We will start deleting all experiments on a regular schedule to ensure decent server performance for everybody!

2. Run Inference (API + UI)

FARM Inference UI

One Docker container exposes a REST API (localhost:5000) and another one runs a simple demo UI (localhost:3000). You can use each of them individually and mount your own models. Check out the docs for details.
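
As a rough sketch of what calling such a REST API from Python could look like, the snippet below builds a JSON POST request with the standard library. Only the host and port (localhost:5000) come from the text above. The endpoint path "/inference", the payload shape and the helper names are assumptions made for illustration, so consult the docs for the real routes before using it.

```python
import json
from urllib.request import Request, urlopen

API_URL = "http://localhost:5000"  # host and port as stated in the text


def build_inference_request(texts, endpoint="/inference"):
    """Build a JSON POST request.

    The endpoint path and the payload layout are hypothetical placeholders,
    not FARM's documented API.
    """
    payload = json.dumps({"input": [{"text": t} for t in texts]}).encode("utf-8")
    return Request(
        API_URL + endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response (needs a running container)."""
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_inference_request(["FARM makes transfer learning simple."])
print(request.get_method(), request.full_url)
```

The request is only constructed here; calling send(request) requires the API container to be running.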

Core concepts

Model

AdaptiveModel = Language Model + Prediction Head(s)

With this modular approach you can easily add prediction heads (multitask learning) and reuse them with different types of language models. (Learn more)
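
The composition idea can be sketched in a few lines of plain Python. This is a toy illustration, not FARM's implementation: all class names below are hypothetical, the "language model" just computes two hand-made features, and the heads are simple rules instead of trained layers. What it shows is the structure: the input is encoded once and every head works on the same representation.

```python
class ToyLanguageModel:
    """Stand-in for a pretrained language model: maps text to a feature vector."""

    def encode(self, text):
        tokens = text.lower().split()
        # Two toy features: token count and average token length.
        avg_len = sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0
        return [float(len(tokens)), avg_len]


class LengthClassificationHead:
    """Toy classification head: labels a text as 'short' or 'long'."""

    def predict(self, features):
        return "long" if features[0] > 5 else "short"


class AvgTokenLengthRegressionHead:
    """Toy regression head: returns the average token length as a score."""

    def predict(self, features):
        return round(features[1], 2)


class ToyAdaptiveModel:
    """One shared language model plus any number of prediction heads."""

    def __init__(self, language_model, prediction_heads):
        self.language_model = language_model
        self.prediction_heads = prediction_heads

    def predict(self, text):
        # The text is encoded once; every head reuses the same representation.
        features = self.language_model.encode(text)
        return {
            name: head.predict(features)
            for name, head in self.prediction_heads.items()
        }


model = ToyAdaptiveModel(
    language_model=ToyLanguageModel(),
    prediction_heads={
        "classification": LengthClassificationHead(),
        "regression": AvgTokenLengthRegressionHead(),
    },
)
result = model.predict("FARM makes transfer learning simple")
print(result)
```

Swapping the language model or adding a head only touches the constructor arguments, which is the practical benefit of keeping the two parts behind separate interfaces.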

Data Processing

Custom Datasets can be loaded by customizing the Processor. It converts "raw data" into PyTorch Datasets. Much of the heavy lifting is then handled behind the scenes to make it fast & simple to debug. (Learn more)
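
To make the "raw data" to dataset step more concrete, here is a minimal sketch of what such a conversion involves. It is not FARM's Processor class: ToyProcessor, build_vocab and all parameter names are hypothetical, whitespace splitting stands in for a real tokenizer, and plain Python lists stand in for PyTorch Datasets so the example has no dependencies.

```python
def build_vocab(texts):
    """Assign an integer id to every token; 0 is padding, 1 is unknown."""
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


class ToyProcessor:
    """Hypothetical processor: turns raw dicts into fixed-length feature dicts."""

    def __init__(self, vocab, label_list, max_seq_len):
        self.vocab = vocab
        self.label_list = label_list
        self.max_seq_len = max_seq_len

    def to_sample(self, raw):
        # Tokenize, truncate to max_seq_len, and map tokens to ids.
        tokens = raw["text"].lower().split()[: self.max_seq_len]
        input_ids = [self.vocab.get(token, 1) for token in tokens]
        # Pad to a fixed length so samples can be batched later.
        padding = [0] * (self.max_seq_len - len(input_ids))
        return {
            "input_ids": input_ids + padding,
            "attention_mask": [1] * len(input_ids) + padding,
            "label_id": self.label_list.index(raw["label"]),
        }

    def convert(self, dicts):
        return [self.to_sample(d) for d in dicts]


raw_data = [
    {"text": "great product", "label": "positive"},
    {"text": "terrible support experience overall", "label": "negative"},
]
vocab = build_vocab(d["text"] for d in raw_data)
processor = ToyProcessor(vocab, label_list=["negative", "positive"], max_seq_len=3)
samples = processor.convert(raw_data)
for sample in samples:
    print(sample)
```

Supporting a new dataset in this sketch would mean changing only how the raw dicts are read and which labels exist; the padding and id mapping stay the same.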

Upcoming features

  • AWS SageMaker support (incl. Spot instances)
  • Training from Scratch
  • Support for more Question Answering styles and datasets
  • Additional visualizations and statistics to explore and debug your model
  • Enabling large scale deployment for production
  • Simpler benchmark models (fasttext, word2vec ...)

Acknowledgements

  • FARM is built upon parts of the great transformers repository from Huggingface. It utilizes their implementations of the BERT model and Tokenizer.
  • The original BERT model and paper were published by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.

Citation

As of now there is no published paper on FARM. If you want to use or cite our framework, please include the link to this repository. If you are working with the German BERT model, you can link to our blog post describing its training details and performance.
