GithubHelp home page GithubHelp logo

vml_examples's Introduction

Verbalized Machine Learning -- Examples

Arxiv License: MIT


This repository provides toy examples demonstrating the concept of Verbalized Machine Learning (VML) introduced by the paper:

Verbalized Machine Learning: Revisiting Machine Learning with Language Models
Tim Z. Xiao, Robert Bamler, Bernhard Schölkopf, Weiyang Liu
Paper: https://arxiv.org/abs/2406.04344

VML introduces a new framework of machine learning. Unlike conventional machine learning models that are typically optimized over a continuous parameter space, VML constrains the parameter space to be human-interpretable natural language. Such a constraint leads to a new perspective of function approximation, where an LLM with a text prompt can be viewed as a function parameterized by the text prompt.

Many classical machine learning problems can be solved under this new framework using an LLM-parameterized learner and optimizer. The major advantages of VML include:

  1. Easy encoding of inductive bias: prior knowledge about the problem and hypothesis class can be encoded in natural language and fed into the LLM-parameterized learner.
  2. Automatic model class selection: the optimizer can automatically select a concrete model class based on data and verbalized prior knowledge, and it can update the model class during training.
  3. Interpretable learner updates: the LLM-parameterized optimizer can provide explanations for why each learner update is performed.

TODO

  • Tutorial: Colab hands-on with linear regression
  • Exp: Regression examples
    • Linear
    • Polynormial
    • Sine
  • Exp: Classification examples
    • 2D plane
    • Medical Image (PneumoniaMNIST)

Environment

Python 3.10

Other dependencies are in requirements.txt

Step 1 - Setup LLMs Endpoint

VML uses pretrained LLMs as excution engines. Hence, we need to have access to an LLM endpoint. This can be done through either the OpenAI endpoint (if you have an account), or open-source models such as Llama.

(Of cource, you can also manually copy/paste the entire prompt into ChatGPT website to have a quick tryout without setting up the endpoints.)

(a) OpenAI Endpoint

To use LLMs service provided by OpenAI, you can copy your OpenAI API key to the variable OPENAI_API_KEY.

(b; alternatively) Local Endpoint: Start the vLLM API server in a separate terminal

vLLM provides an easy and fast inference engine for many open-source LLMs including Llama. After you install vLLM, you can start a Llama API server using the following command. vLLM uses the same API interface as OpenAI.

python -m vllm.entrypoints.openai.api_server \
--model <HUGGINGFACE_MODEL_DIR> \
--dtype auto \
--api-key token-abc123 \
--tensor-parallel-size <NUMBER_OF_GPU>

Step 2: VML Quickstart

Train Regression Example Command

python regression.py \
--model "llama" \
--task "linear_regression" \
--batch_size 10 \
--eval_batch_size 100 \
--epochs 5 

Citation: Bibtex for VML

Following is the Bibtex for the VML paper:

@article{xiao2024verbalized,
  title = {Verbalized Machine Learning: Revisiting Machine Learning with Language Models},
  author = {Xiao, Tim Z. and Bamler, Robert and Schölkopf, Bernhard and Liu, Weiyang},
  journal = {arXiv preprint arXiv:2406.04344},
  year = {2024},
}

Contributing to this repo

We welcome the community to submit pull request for any new example of VML into this repo! We hope this repo provides interesting examples of VML and inspires new ideas for future LLMs research!

vml_examples's People

Contributors

timxzz avatar

Stargazers

Johannes Zenn avatar Gurumurthi V Ramanan avatar carlo avatar Jinke He avatar  avatar Yuliang Xiu avatar Weiyang Liu avatar

Watchers

 avatar Kostas Georgiou avatar

Forkers

mardom

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.