
abhisheksingh-7 / cotrend


Extending Decoders with an Integrated Encoder, as Part of Llama-3 Hackathon

Home Page: https://devpost.com/software/cotrend

Makefile 1.97% Python 95.33% Shell 2.70%
contrastive-learning encoder llama-3 retrieval-augmented-generation

cotrend's Introduction

CoTrEnD Logo

Contrastively Trained Encodings from Decoder

Extending Decoders with an Integrated Encoder

This repo holds the code for training encoders that embed the final hidden state of large decoder models. To our knowledge, CoTrEnD is the first architecture to use a contrastive loss to train an encoder from a decoder. It was developed during the 24-hour Meta Llama-3 hackathon in May 2024 by Abhishek Singh, Arthur Böök, and Wian Stipp.

Motivation

The motivation behind CoTrEnD is to capitalize on the rich hidden states generated within large decoders. Rather than separating the embedder from the decoder, as one typically would in a RAG approach, CoTrEnD integrates the encoder on top of the decoder. This allows the encoder to leverage the semantic information already captured in the decoder's hidden states.

Architecture

The CoTrEnD architecture is a simple extension of a decoder-only model: an encoder is placed on top of the decoder and trained to embed the decoder's final hidden state. Training uses a contrastive loss, which encourages the encoder to produce similar embeddings for similar inputs and dissimilar embeddings for dissimilar inputs.
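The README does not say which contrastive objective the project uses, so the following is only a minimal sketch of one common choice, InfoNCE with in-batch negatives, written in plain Python so that it runs without dependencies. The names `cosine`, `info_nce_loss`, and `temperature` are hypothetical and are not taken from the repository; the embeddings are toy values standing in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(queries, positives, temperature=0.1):
    """Mean InfoNCE loss with in-batch negatives (assumed objective).

    queries[i] and positives[i] are embeddings of a matching pair; every
    positives[j] with j != i acts as a negative for queries[i].
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [cosine(query, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the matching positive.
        total += log_denominator - logits[i]
    return total / len(queries)


# Toy 2-D embeddings: matching pairs aligned vs. deliberately mismatched.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case `aligned` comes out much smaller than `shuffled`, which is the behaviour the loss is meant to reward: matching pairs are pulled together and non-matching pairs pushed apart. A real training loop would compute the same quantity on batched tensors so that gradients can flow back into the encoder.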


User Interface

The CoTrEnD project includes a user interface, built with Streamlit, that allows users to interact with the model. It has two modes of operation.

RAG Mode

The user can ask anything in the question field, and the CoTrEnD model will run an embedding search over the vectorstore and use the retrieved documents to augment the generated answer.

RAG-example

Document Lookup Mode

The user can enter a medical entity in the entity field, and the CoTrEnD model will return the most similar document from the vectorstore.
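Both modes rest on the same operation: embed the input, then rank the stored documents by similarity to that embedding. The sketch below illustrates the idea with an in-memory list and cosine similarity. It assumes nothing about the project's actual vectorstore or prompt format; `retrieve`, `build_prompt`, the toy documents, and the 2-D embeddings are all hypothetical placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_embedding, vectorstore, k=2):
    """Return the texts of the k entries most similar to the query."""
    ranked = sorted(
        vectorstore,
        key=lambda item: cosine(query_embedding, item["embedding"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]


def build_prompt(question, context_docs):
    """Prepend retrieved documents to the question (hypothetical format)."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Toy vectorstore; the embeddings are made up for illustration only.
vectorstore = [
    {"text": "Aspirin is used to reduce fever and pain.", "embedding": [0.9, 0.1]},
    {"text": "Insulin regulates blood glucose.", "embedding": [0.1, 0.9]},
    {"text": "Ibuprofen is an anti-inflammatory drug.", "embedding": [0.8, 0.3]},
]
query_embedding = [1.0, 0.0]  # stands in for the encoder's output

# RAG mode: retrieve several documents and augment the prompt with them.
top_docs = retrieve(query_embedding, vectorstore, k=2)
prompt = build_prompt("What reduces fever?", top_docs)

# Document lookup mode: return only the single most similar document.
best_doc = retrieve(query_embedding, vectorstore, k=1)[0]
```

The only difference between the two modes in this sketch is what happens after ranking: RAG mode passes the top results to the generator as context, while lookup mode returns the best match directly.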

lookup-example

Team

Abhishek Singh

Arthur Böök

Wian Stipp
