
cgnorthcutt / reliablity_framework_for_rag


Demo showing how the Trustworthy Language Model (TLM) adds reliability to LLM outputs and improves RAG, agent, and data enrichment workflows. TLM can be used to improve fine-tuning of LLMs, accuracy of LLM outputs, and smart routing for RAG and agents.

Home Page: https://help.cleanlab.ai/tutorials/tlm/

License: GNU Affero General Public License v3.0

Jupyter Notebook 99.99% Python 0.01%
chatgpt data-cleaning data-curation data-observability data-quality llms observability rag

reliablity_framework_for_rag's Introduction

Demo of TLM: The Reliability Solution for RAG, LLMs, and Data Enrichment

The main file to look at in this repo is tlm_demo_new.ipynb.

News! I added a new data enrichment and LLM reliability demo. Details:

  • Demo showing how the Trustworthy Language Model adds reliability scores to LLM outputs, solving 4 use cases for 4 verticals.
  • Expect typos and imperfections. For better results and more details, visit https://help.cleanlab.ai

Hacked this together in a couple hours. Shows how Cleanlab TLM can be used to improve fine-tuning of LLMs, accuracy of LLM outputs, and smart routing for RAG and agents.


Dataset used for this example: here.

Base OpenAI LLM versus Cleanlab TLM performance on the public test set

Note: these results were run with the fastest version of the TLM (quality_preset="low") for speed reasons (it's a hackathon demo). For improved results, use quality_preset="best".

  • Base Acc (OpenAI GPT-3.5): ~65%
  • TLM Acc: 65.5%
  • TLM Acc (TLM Confidence > 0.3): 66.2%
  • TLM Acc (TLM Confidence > 0.5): 69.9%
  • TLM Acc (TLM Confidence > 0.8): 74.0%
  • Base (OpenAI GPT-3.5) Acc (TLM Confidence < 0.5): 55.1%
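
As a rough illustration of how thresholded numbers like the ones above can be computed, here is a minimal sketch. It is not the notebook's code: the function name `accuracy_above` and the toy data are assumptions made up for this example, and the only inputs it needs are a per-answer correctness flag and a per-answer confidence score.

```python
from typing import Sequence


def accuracy_above(correct: Sequence[bool],
                   confidence: Sequence[float],
                   threshold: float) -> float:
    """Accuracy over only the answers whose confidence exceeds `threshold`."""
    kept = [c for c, s in zip(correct, confidence) if s > threshold]
    if not kept:
        raise ValueError("no answers above the threshold")
    return sum(kept) / len(kept)


# Toy, made-up data (NOT the results reported in this README).
correct = [True, False, True, True, False, True]
confidence = [0.9, 0.2, 0.85, 0.6, 0.4, 0.3]

print(accuracy_above(correct, confidence, 0.0))  # accuracy over all six answers
print(accuracy_above(correct, confidence, 0.5))  # accuracy over the three confident answers
```

Raising the threshold trades coverage for accuracy: fewer answers are kept, but the kept ones are more likely to be right if the confidence score is informative.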

If an expert reviews and corrects the 100 samples with the lowest TLM confidence scores:

  • the resulting accuracy will be: 79%
  • compared to the original base acc: 65%
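
The expert-review estimate can be sketched the same way. This is a hypothetical helper, not the notebook's implementation: `accuracy_after_review` and the toy data are made up for illustration, and it assumes an expert's correction always yields the right answer.

```python
def accuracy_after_review(correct, confidence, k):
    """Accuracy if the k lowest-confidence answers are reviewed by an expert."""
    # Indices ordered from least to most confident.
    order = sorted(range(len(correct)), key=lambda i: confidence[i])
    fixed = list(correct)
    for i in order[:k]:
        fixed[i] = True  # assumption: a reviewed answer ends up correct
    return sum(fixed) / len(fixed)


# Toy, made-up data (NOT the results reported in this README).
correct = [True, False, True, True, False, True]
confidence = [0.9, 0.2, 0.85, 0.6, 0.4, 0.3]

print(accuracy_after_review(correct, confidence, 0))  # no review: base accuracy
print(accuracy_after_review(correct, confidence, 2))  # review the 2 least confident
```

The gain depends on how many of the reviewed answers were actually wrong, which is why sorting by confidence beats reviewing a random sample when low confidence tracks errors.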

The TLM (Trustworthy Language Model) is available in Cleanlab Studio.

There's also a (reduced functionality) demo version available here running on free servers: https://cleanlab.ai/tlm

