🤗 Dockerized BERT-Multi-Label-Classifier Inferer 🤗

🤗 BERT-Multi-Label-Classifier / Dockerized Inferer 🤗

This repository fine-tunes a BERT-based multi-label or multi-class classifier using the HuggingFace library. It also includes a Flask API wrapper for inference.

Installation

To get the repository, clone it with the following command:

git clone https://github.com/JulesBelveze/BERT-multi-label-classifier.git

The repository uses Poetry as its package manager (see the Poetry documentation for details). To install the required packages, run the following commands:

python3 -m venv .venv/bert-mlc
source .venv/bert-mlc/bin/activate
poetry install

This repo uses neptune.ai to manage experiments. We invite you to look at their documentation if needed.

Organisation of files

  • models/: folder containing custom models
  • utils/: folder containing utility functions
  • main.py: main file to run
  • train.py: file containing the training procedure
  • eval.py: file containing the evaluation procedure
  • app.py: file containing the Flask app
  • inferer.py: file containing the model inferer
  • poetry.lock: Poetry file
  • pyproject.toml: Poetry file
  • requirements_inference.txt: required packages for inference
  • Dockerfile: file to run the API as a docker image

Datasets

Models

We provide customised versions of four models: BERT, RoBERTa, XLM-RoBERTa and DistilBERT.

1. Multi-label-classifier

The model adapts HuggingFace's BertForSequenceClassification to handle multi-label classification. The key change is the loss function, because each label has to be scored independently rather than as one of several mutually exclusive classes.
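To make the loss change concrete, here is a minimal, dependency-free sketch of the loss most commonly used for multi-label classification: a sigmoid per label followed by binary cross-entropy, averaged over the labels. This README does not say which loss the custom model actually uses, so treat this as an assumption. The function names `multilabel_bce_with_logits` and `predict_labels` and the 0.5 threshold are hypothetical and chosen only for illustration. PyTorch's `torch.nn.BCEWithLogitsLoss` computes the same quantity on tensors.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def multilabel_bce_with_logits(logits: Sequence[float], targets: Sequence[float]) -> float:
    """Mean binary cross-entropy over labels, each label scored independently.

    `logits` holds one raw score per label for a single example and
    `targets` holds the matching 0/1 indicators.
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for logit, target in zip(logits, targets):
        # Stable form of -[t * log(sigmoid(x)) + (1 - t) * log(1 - sigmoid(x))]
        total += max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))
    return total / len(logits)


def predict_labels(logits: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn raw scores into 0/1 decisions; several labels can be active at once."""
    return [int(sigmoid(x) >= threshold) for x in logits]


if __name__ == "__main__":
    example_logits = [2.0, -1.0, 0.5]   # hypothetical scores for three labels
    example_targets = [1.0, 0.0, 1.0]
    print(multilabel_bce_with_logits(example_logits, example_targets))
    print(predict_labels(example_logits))
```

Unlike a softmax, the per-label sigmoid does not force the probabilities to sum to one, which is what lets a single text receive several labels at the same time.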

2. Multi-class-classifier

The model is essentially an MLP on top of a BERT model. As with the multi-label model, the custom model extends HuggingFace's BertForSequenceClassification, here to integrate class weights into the loss function.
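As an illustration of what integrating class weights into the loss can look like, the sketch below computes a class-weighted cross-entropy in plain Python. It assumes the weighted-mean normalisation that `torch.nn.CrossEntropyLoss` applies when given a `weight` argument with the default mean reduction. Whether this repository normalises the same way is not stated here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def log_softmax(logits: Sequence[float]) -> list:
    """Log-probabilities for one example, computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def weighted_cross_entropy(
    batch_logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    class_weights: Sequence[float],
) -> float:
    """Cross-entropy where each example is weighted by the weight of its true class.

    The result is the weighted sum of per-example losses divided by the
    sum of the weights that were applied (a weighted mean).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target in zip(batch_logits, targets):
        w = class_weights[target]
        weighted_sum += -w * log_softmax(logits)[target]
        weight_total += w
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical two-class batch in which class 1 is rare, so it gets a larger weight.
    logits = [[2.0, 0.0], [2.0, 0.0]]
    labels = [0, 1]
    print(weighted_cross_entropy(logits, labels, class_weights=[1.0, 1.0]))
    print(weighted_cross_entropy(logits, labels, class_weights=[1.0, 3.0]))
```

With the larger weight on class 1, the misclassified rare example dominates the loss, which pushes training to pay more attention to under-represented classes.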

Inference

The inferer only supports single-input inference. It handles all the processing steps required to feed the text into the classification model. It can be used as follows:

model_infer = ModelInferer(config=config, checkpoint_path=checkpoint_path, quantize=True)
model_infer.predict("I hate you from more than you can imagine")

We also provide a Flask API that wraps the inferer, along with a Dockerfile to containerize the app for production use.

bert-sequence-classifier's Issues

TODO

  • test models loading checkpoint
  • add inferer
  • dockerize the inferer
  • add documentation
  • add hf custom dataset
  • look at continuing pretraining
