craftjarvis / rat

Implementation of "RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation".


rat's Introduction

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation

Abstract

We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models' reasoning and generation abilities in long-horizon generation tasks, while substantially mitigating hallucination. In particular, the proposed method — retrieval-augmented thoughts (RAT) — revises each thought step one by one with retrieved information relevant to the task query as well as the current and past thought steps, after the initial zero-shot CoT is generated.
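The revision loop described in the abstract can be sketched as follows. This is a minimal illustration, not the repository's actual implementation; `llm`, `retrieve`, and the prompt wording are hypothetical placeholders.

```python
# Hypothetical sketch of the RAT loop: draft a zero-shot CoT, then revise
# each thought step using information retrieved for the query plus the
# thoughts produced so far. `llm` and `retrieve` are placeholder callables.

def rat(query, llm, retrieve):
    # 1. Draft an initial zero-shot chain of thought, split into steps.
    draft = llm(f"Answer step by step:\n{query}")
    thoughts = draft.split("\n\n")

    revised = []
    for thought in thoughts:
        # 2. Retrieve documents relevant to the query and the thoughts so far.
        context = retrieve(query + "\n" + "\n".join(revised + [thought]))
        # 3. Revise the current step in light of the retrieved evidence.
        revised.append(llm(
            f"Task: {query}\n"
            f"Previous steps:\n" + "\n".join(revised) + "\n"
            f"Retrieved context:\n{context}\n"
            f"Revise this step so it is consistent with the evidence:\n{thought}"
        ))
    return "\n\n".join(revised)
```

The key design point is that retrieval is conditioned on the accumulated (already revised) thoughts, not just the original query, which is what makes the reasoning context-aware.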

Prerequisites

  • Python packages can be installed with pip install -r requirements.txt

  • An OpenAI API key (OPENAI_API_KEY) is required for the language model. You can get one from OpenAI.

export OPENAI_API_KEY="sk-******"
  • You can also use a Hugging Face API key for the language model. You can get one from Hugging Face.

  • You also need to set GOOGLE_API_KEY for the Google Search API. You can get it from Google Cloud Platform.

export GOOGLE_API_KEY="********"
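Before launching the notebook or the app, it can help to verify that the required keys are actually set. This small check is not part of the repository; the function name is an illustrative assumption.

```python
# Hypothetical helper: report which required API keys are missing from the
# environment, so a run can fail fast instead of erroring mid-generation.
import os

def check_keys(keys=("OPENAI_API_KEY", "GOOGLE_API_KEY")):
    """Return the list of environment variables in `keys` that are unset or empty."""
    return [k for k in keys if not os.environ.get(k)]

# Example usage:
# missing = check_keys()
# if missing:
#     raise SystemExit(f"Missing environment variables: {missing}")
```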

Getting Started

You can run the Jupyter notebook creative.ipynb to generate open-ended creative text with RAT.

You can also use the following command to start a Gradio interface for interacting with RAT:

python app/gradio_app.py

Experimental Results

Applying RAT to various base models substantially improves their performance on a range of long-horizon generation tasks: on average, it relatively increases rating scores by 13.63% on code generation, 16.96% on mathematical reasoning, 19.2% on creative writing, and 42.78% on embodied task planning.

Check out our paper!

Our paper is available on arXiv. Please cite it if you find RAT useful for your research:

@article{wang2024rat,
    author    = {Wang, Zihao and Liu, Anji and Lin, Haowei and Li, Jiaqi and Ma, Xiaojian and Liang, Yitao},
    title     = {RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation},
    journal   = {arXiv preprint arXiv:2403.05313},
    year      = {2024},
}

rat's People

Contributors

zhwang4ai

rat's Issues

Can you provide a code generation case?

I have a question regarding the task of code generation. Specifically, for a task that ultimately requires generating code, are the "thoughts" generated in RAT textual descriptions or actual code? According to the prompt provided in the paper, the results show that the thoughts are a series of textual descriptions. How are these thoughts ultimately transformed into code?

what is the retrieval source for GSM8K?

Hello, your work has left a deep impression on me.

It's mentioned in your paper that the codeparrot/github-jupyter dataset is employed as your primary search vector library for the code generation and mathematical reasoning tasks. But isn't this dataset only suited to the code generation task? So, may I ask what the retrieval knowledge source for GSM8K is? Or is it indeed possible to use codeparrot/github-jupyter for retrieval in the GSM8K task?

Looking forward to your reply.

a long time to execute

I tried the code you provide, but it takes a long time to execute (I canceled it after 20 minutes). I don't know the reason; maybe I am using different versions of the libraries.
