
Mixing Language Models with Self-Verification and Meta-Verification

License: Apache License 2.0

Jupyter Notebook 90.89% Python 9.11%
few-shot-learning large-language-models model-selection prompting question-answering


AutoMix: Automatically Mixing Language Models



What is AutoMix?

The idea behind AutoMix is simple:

  1. Send a query to a small language model (SLM), and get a noisy label on the correctness of its answer using few-shot self-verification performed by the same model (SLM).

  2. Use a meta-verifier to double-check the verifier's output, and route the query to a larger language model (LLM) if needed.
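The two steps above can be sketched as a small routing function. This is only an illustration under stated assumptions, not the AutoMix implementation: `automix_route` and all four callables (`slm`, `llm`, `verifier_confidence`, `should_escalate`) are hypothetical names, and the models are passed in as plain Python functions so the sketch stays self-contained.

```python
from typing import Callable, Tuple


def automix_route(
    query: str,
    context: str,
    slm: Callable[[str, str], str],
    llm: Callable[[str, str], str],
    verifier_confidence: Callable[[str, str, str], float],
    should_escalate: Callable[[float], bool],
) -> Tuple[str, str]:
    """Return (answer, model_used) for one query.

    All callables are hypothetical stand-ins:
      slm / llm            -> (context, query) -> answer
      verifier_confidence  -> (context, query, answer) -> noisy score in [0, 1]
      should_escalate      -> meta-verifier decision on that score
    """
    # Step 1: the small model answers, then verifies its own answer.
    draft = slm(context, query)
    confidence = verifier_confidence(context, query, draft)

    # Step 2: the meta-verifier decides whether the noisy score is
    # trustworthy enough, or whether the larger model should be called.
    if should_escalate(confidence):
        return llm(context, query), "llm"
    return draft, "slm"
```

With stub models, a call such as `automix_route("q", "ctx", slm, llm, verifier_confidence, lambda c: c < 0.5)` keeps the small model's answer whenever the verifier score is at least 0.5 and calls the larger model otherwise.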

Self-Verification and Meta-verification

At the center of AutoMix is the idea of context-grounded self-verification:

  • However, such verification can often be noisy, so we introduce an additional layer of meta-verification using POMDPs or thresholding.
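A thresholding meta-verifier could look roughly like the sketch below. It assumes the verifier is sampled several times and that each sample yields a boolean "answer is correct" vote; the sampling scheme, the function names, and the default threshold of 0.5 are illustrative assumptions rather than details taken from this README. The POMDP variant is not shown.

```python
from typing import Sequence


def vote_confidence(votes: Sequence[bool]) -> float:
    """Fraction of verifier samples that judged the answer correct."""
    if not votes:
        raise ValueError("at least one verifier vote is required")
    return sum(votes) / len(votes)


def route_to_llm(votes: Sequence[bool], threshold: float = 0.5) -> bool:
    """Thresholding meta-verifier (assumed form).

    Returns True when the aggregated confidence falls below the
    threshold, meaning the query should go to the larger model.
    """
    return vote_confidence(votes) < threshold
```

Averaging several noisy votes before thresholding is one simple way to dampen the noise of a single verification call; the threshold itself would normally be tuned on held-out data.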

Notebooks

Running inference

Few-shot self-verification

  • Step2 Self Verify - Verification prompts and code to run verification on the outputs produced in Step 1.

Meta-verification

  • Step3 Meta Verify - Run meta-verification using different AutoMix methods on the outputs produced in Step 2.

  • You can run `pip install automix-llm` to use the meta-verifier system-wide.

Replicating the results

  • To replicate the results in the paper, please run python scripts paper_results.py

Data and Outputs

  • We experiment with 5 datasets: CNLI, CoQA, NarrativeQA, QASPER, and Quality.

  • Note: The datasets are sourced from scrolls. Please cite scrolls and the appropriate sources if you use these datasets. We are making them available in a single jsonl file for ease of use and reproducibility. For details on how CoQA was prepared, please see Preparing COQA.

  • Inputs: All input data for the AutoMix project is provided in automix_inputs.jsonl. You can access and download it directly from Google Drive.

  • Outputs from LLAMA2: The outputs generated using the LLAMA2 model are stored in automix_llama2_outputs.jsonl, available alongside the input file in the linked Google Drive.

id: A unique identifier for each question and answer pair.
pid: An additional identifier potentially mapping to specific instances or model variants.
base_ctx: The context.
question: Input question or query.
output: Ground truth.
dataset: The dataset the example comes from (one of the five listed above).
llama13b_pred_ans: The answer generated by the llama13b model.
llama70b_pred_ans: The answer generated by the llama70b model.
llama13b_ver: Verification outputs of the llama13b model’s answers.
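Because the files are jsonl (one JSON object per line), they can be read with the standard library alone. The sketch below is a minimal reader; `iter_records` is a hypothetical helper, and the example value `"coqa"` for the `dataset` field is an assumption about how dataset names are spelled in the file.

```python
import json
from pathlib import Path
from typing import Dict, Iterator, Optional, Union


def iter_records(
    path: Union[str, Path], dataset: Optional[str] = None
) -> Iterator[Dict]:
    """Yield one dict per non-empty line of a jsonl file.

    If `dataset` is given, only records whose "dataset" field equals
    that value are yielded.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            if dataset is None or record.get("dataset") == dataset:
                yield record
```

For example, `for rec in iter_records("automix_llama2_outputs.jsonl", dataset="coqa")` would let you compare `rec["llama13b_pred_ans"]` and `rec["llama70b_pred_ans"]` against `rec["output"]`, assuming the file has been downloaded to the working directory.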

Stats

--------------------------------
| Dataset      | Split | Count |
|--------------|-------|-------|
| cnli         | train | 7191  |
|              | val   | 1037  |
| coqa         | train | 3941  |
|              | val   | 3908  |
| narrative_qa | train | 9946  |
|              | val   | 5826  |
| qasper       | train | 2556  |
|              | val   | 1715  |
| quality      | train | 2515  |
|              | val   | 2085  |
--------------------------------
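Counts like those in the table can be recomputed from already-loaded records with a few lines of Python. The sketch assumes each record carries both a `dataset` and a `split` field; `split` is not in the field list above, so treat it as an assumption, and `count_by_dataset_and_split` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Iterable


def count_by_dataset_and_split(records: Iterable[Dict]) -> Counter:
    """Count records per (dataset, split) pair.

    Assumes every record has "dataset" and "split" keys; a missing key
    raises KeyError rather than being silently ignored.
    """
    return Counter((rec["dataset"], rec["split"]) for rec in records)
```

Iterating over `sorted(counts.items())` then lists the pairs in the same dataset-then-split order as the table.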

Citation

@misc{madaan2023automix,
      title={AutoMix: Automatically Mixing Language Models}, 
      author={Aman Madaan and Pranjal Aggarwal and Ankit Anand and Srividya Pranavi Potharaju and Swaroop Mishra and Pei Zhou and Aditya Gupta and Dheeraj Rajagopal and Karthik Kappaganthu and Yiming Yang and Shyam Upadhyay and Mausam and Manaal Faruqui},
      year={2023},
      eprint={2310.12963},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
