freedomfromfiat / thruthinkcohereweaviatechat

This project forked from csabaconsulting/thruthinkcohereweaviatechat

Cohere and Weaviate powered ThruThink support chat on Streamlit

Home Page: https://thruthinksupport.streamlit.app/

License: MIT License

Python 100.00%

thruthinkcohereweaviatechat's Introduction

ThruThink® Support Chat Agent utilizing RAG Fusion, powered by Cohere & Weaviate

The main business goal is to develop a support chat agent for the investment projection web application ThruThink®.

  • ThruThink® is an online business budgeting app for creating professional budgets and forecasts
  • A product of literally decades of experience and careful thought, and thousands of calculations
  • Thru-hiking, or through-hiking, is the act of hiking an established long-distance trail end-to-end continuously
  • There are no dedicated personnel for support chat agent roles; the app had a “classic” chat agent integration in the past
  • An LLM and RAG (Retrieval Augmented Generation) powered chat agent could be invaluable, provided that
    1. It stays relatively grounded
    2. It won’t hallucinate wildly

Desired abilities:

  • Main goal: answer ThruThink® software specific questions such as: "In ThruThink can I make adjustments on the Cash Flow Control page?"
  • Nice to have: answer more generic questions such as: "How much inventory should I have?"

The bulk of the knowledge base consists of close to 190 help topics, divided into a few dozen categories. That's more than nothing; however, users can ask such a wide variety of questions that chunking these documents alone may not provide good ground for a vector match in the embedding space. To increase the performance of the chat agent I employ several techniques.

Achievements:

  1. I enriched the knowledge base with synthetic data generation. I coined the term QBRAG (QnA Boosted RAG) for this because I reuse the QnA data I had already generated and curated for potential fine-tuning. The same dataset can enrich the vector-indexed knowledge as well.
  2. The highlight of my submission is RAG Fusion (see article).
  3. I use Weaviate for vector storage, embedding, matching and retrieval. I use Cohere for multiple language models in different stages of the chain: a fine-tuned chat model and also co.chat with the advanced web connector feature.
  4. I perform metadata-assisted retrieval, since I ingest and index the help documents' titles and categories.
  5. I also use LangChain to construct some stages of the chain.
  6. The front-end is powered by and hosted on Streamlit. I highly customized the view, which also features linked references.
  7. After the fusion and re-ranking I show the user two results: one from a more traditional, document-grounded RAG co.chat call, and one from a web-connector powered co.chat call (which is also augmented with the documents for guidance), so the user can get the best of both worlds.
  8. Since I have to control several stages of the chain for the fusion, I was not able to use such high-level LangChain constructs as ConversationalRetrievalChain or RetrievalQA. co.chat's ability to handle the conversation for me (via conversation_id) therefore made my job much easier: I did not have to build history / memory functionality and other building blocks myself.
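
The QBRAG enrichment and the metadata-assisted retrieval described above can be pictured with a small data-preparation sketch. This is only an illustration under assumptions: the record shape, the field names (`text`, `title`, `category`, `source`), the word-based chunking and the placeholder data are all hypothetical and are not taken from the project's Weaviate schema.

```python
from dataclasses import dataclass


@dataclass
class IndexRecord:
    """One entry to embed and store in the vector index (hypothetical shape)."""
    text: str      # the text that gets embedded
    title: str     # help topic title, kept as metadata
    category: str  # help topic category, kept as metadata
    source: str    # "help" for a document chunk, "qna" for a synthetic QnA pair


def chunk_text(text, max_words=50):
    """Split text into chunks of at most max_words words (naive chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_records(help_topics, qna_pairs):
    """Combine help topic chunks and QnA pairs into one list of index records."""
    records = []
    for topic in help_topics:
        for chunk in chunk_text(topic["body"]):
            records.append(IndexRecord(chunk, topic["title"], topic["category"], "help"))
    for pair in qna_pairs:
        # A QnA pair is indexed as a single unit so a user question can match it directly.
        text = f"Q: {pair['question']}\nA: {pair['answer']}"
        records.append(IndexRecord(text, pair["title"], pair["category"], "qna"))
    return records


if __name__ == "__main__":
    # Placeholder data, not real help content.
    topics = [{"title": "Example topic", "category": "Example category",
               "body": "placeholder " * 120}]
    qna = [{"question": "Example question?", "answer": "Example answer.",
            "title": "Example topic", "category": "Example category"}]
    for record in build_records(topics, qna):
        print(record.source, record.title, len(record.text.split()))
```

Keeping the title and category on every record is what would let a retrieval step filter or boost by metadata, and tagging the source makes it possible to tell original help chunks from synthetic QnA entries later.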

RAG Fusion:

  1. A user's question in its particular form might not match the QnA questions well even though the knowledge base covers it, so the application first generates variations of the user's query with the help of a fine-tuned Cohere model. The hope is that some of these variations match some QnA or help data chunks more closely.
  2. The document retrieval then happens for all of the query variations.
  3. Reciprocal rank fusion then produces a fused list of documents across all the variations.
  4. The application takes the top k of those documents and performs the final two RAG calls, which supply the displayed data. Both RAG calls use the cutting-edge co.chat: one call is document based, and the other is web connector based (but still augmented with the documents for a better result).
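
The fusion step (3) can be sketched in a few lines of Python. This is a minimal sketch of the standard reciprocal rank fusion formula, not the application's actual code: the function name, the document ids and the constant `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60, top_k=None):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Returns (doc_id, score) pairs, best first.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused if top_k is None else fused[:top_k]


if __name__ == "__main__":
    # One ranked result list per query variation (hypothetical document ids).
    per_variation_results = [
        ["a", "b", "c"],
        ["b", "a", "d"],
        ["b", "c", "a"],
    ]
    for doc_id, score in reciprocal_rank_fusion(per_variation_results, top_k=3):
        print(doc_id, round(score, 5))
```

A document that ranks high for several query variations accumulates more score than one that ranks first for a single variation only, which is what makes the fused list more robust than any individual retrieval.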

Note that the application and all the help documents are in English. Therefore we used embed-english-v2.0 with cosine similarity. Since we only need to perform in the English domain, we can expect possibly slightly better performance than with a multilingual model. We need to pay attention to the similarity metric (the multilingual embedding uses dot product). Also refer to https://github.com/CsabaConsulting/Cohere/blob/main/WeaviateInit.ipynb.
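
To illustrate why the similarity metric matters, here is a small self-contained comparison of the two metrics on toy vectors. The vectors are made up for illustration and are not real embeddings; the point is only that the dot product is sensitive to vector length while cosine similarity is not.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product of the two vectors after scaling each to unit length."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


if __name__ == "__main__":
    u = [3.0, 4.0]
    v = [6.0, 8.0]  # same direction as u, twice the length
    print("dot:", dot(u, v))
    print("cosine:", cosine_similarity(u, v))
```

The two vectors point in the same direction, so their cosine similarity is the maximum possible, while the dot product keeps growing as either vector gets longer. A vector index configured with the wrong metric for a given embedding model can therefore rank results differently than the model intends.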

Other achievements:

Kindly look at the development and experimentation IPython notebooks and scripts in the https://github.com/CsabaConsulting/Cohere repository. These were used to establish the Weaviate schema for ingestion / indexing / retrieval, to test retrieval, and to build up the parts for the RAG Fusion.

Future plans:

  • Decrease runtime by running the document retrievals for the query variations in parallel. This is a Streamlit-specific tech challenge with asyncio / await.
  • Decrease runtime by running the final two co.chat RAG calls in parallel. This is also a Streamlit-specific tech challenge with asyncio / await.
  • Make the citation linking nicer and other UI enhancements.
  • Measure how much (if at all) the RAG Fusion improves answer quality. Measure the trade-off, factoring in the extra latency and the potential increase in token usage, which also means a cost increase.
  • Integrate the agent into ThruThink, which uses an ASP.NET MVC / C# / Azure technology stack and is not open source. In that final deployment I'll be able to open up the referred help topics using the document metadata I get back as part of the query results.
  • Add a filter against harmful content, for example using Google PaLM 2's safety attributes.
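
The parallel retrieval idea from the first two plans could look roughly like the sketch below. It is an untested-in-Streamlit illustration under assumptions: `retrieve_documents` is a hypothetical stand-in for the blocking retrieval call, the query strings are placeholders, and `asyncio.to_thread` requires Python 3.9 or newer. How the event loop is started inside a Streamlit script is exactly the open challenge, and this sketch does not address it.

```python
import asyncio
import time


def retrieve_documents(query):
    """Hypothetical stand-in for a blocking retrieval call; sleeps to mimic latency."""
    time.sleep(0.05)
    return [f"doc for: {query}"]


async def retrieve_all(queries):
    """Run the blocking retrievals concurrently in worker threads.

    asyncio.gather returns the results in the same order as the queries.
    """
    tasks = [asyncio.to_thread(retrieve_documents, query) for query in queries]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    variations = ["variation 1", "variation 2", "variation 3"]
    started = time.perf_counter()
    results = asyncio.run(retrieve_all(variations))
    elapsed = time.perf_counter() - started
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the calls overlap, the total wait should be close to the slowest single retrieval rather than the sum of all of them. The same pattern would apply to the final two co.chat calls.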

thruthinkcohereweaviatechat's People

Contributors

mrcsabatoth
