
llm-agent-survey's Introduction

A Survey on LLM-based Autonomous Agents

Growth Trend

Autonomous agents are designed to achieve specific objectives through self-guided instructions. With the emergence and growth of large language models (LLMs), there is a growing trend in utilizing LLMs as fundamental controllers for these autonomous agents. While previous studies in this field have achieved remarkable successes, they remain independent proposals with little effort devoted to a systematic analysis. To bridge this gap, we conduct a comprehensive survey study, focusing on the construction, application, and evaluation of LLM-based autonomous agents. In particular, we first explore the essential components of an AI agent, including a profile module, a memory module, a planning module, and an action module. We further investigate the application of LLM-based autonomous agents in the domains of natural sciences, social sciences, and engineering. Subsequently, we delve into a discussion of the evaluation strategies employed in this field, encompassing both subjective and objective methods. Our survey aims to serve as a resource for researchers and practitioners, providing insights, related references, and continuous updates on this exciting and rapidly evolving field.
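To make the four modules concrete, here is a minimal, hypothetical Python sketch of how a profile, memory, planning, and action module might fit together in one agent loop. The names (`call_llm`, `Agent.run`, the naive tool routing) are illustrative assumptions for this README, not the design of any surveyed system.

```python
# Minimal sketch of the four-module architecture (profile, memory,
# planning, action) described above. `call_llm` is a stand-in for any
# chat-completion backend; everything here is illustrative only.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


def call_llm(prompt: str) -> str:
    """Placeholder LLM call; swap in a real chat-completion API."""
    return f"(LLM response to: {prompt[:60]}...)"


@dataclass
class Agent:
    profile: str                                          # profile module: role/persona
    memory: List[str] = field(default_factory=list)       # memory module: past records
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)  # action module

    def remember(self, record: str) -> None:              # memory write
        self.memory.append(record)

    def recall(self, k: int = 5) -> str:                  # memory read
        return "\n".join(self.memory[-k:])

    def plan(self, task: str) -> str:                     # planning module
        prompt = (f"{self.profile}\nRecent memory:\n{self.recall()}\n"
                  f"Task: {task}\nPropose the single next step.")
        return call_llm(prompt)

    def act(self, step: str) -> str:                      # action module
        for name, tool in self.tools.items():             # naive keyword tool routing
            if name in step:
                return tool(step)
        return call_llm(f"Carry out: {step}")             # fall back to a text-only action

    def run(self, task: str, max_steps: int = 3) -> None:
        for _ in range(max_steps):
            step = self.plan(task)
            self.remember(f"{step} -> {self.act(step)}")


if __name__ == "__main__":
    agent = Agent(profile="You are a careful research assistant.",
                  tools={"search": lambda q: f"(search results for: {q})"})
    agent.run("Summarize recent work on LLM-based agents.")
    print(agent.recall())
```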

πŸ“ This is the first released and published survey paper in the field of LLM-based autonomous agents.

Paper link: A Survey on Large Language Model based Autonomous Agents

Update Records

  • πŸ”₯ [25/3/2024] Our survey paper has been accepted by Frontiers of Computer Science, which is the first published survey paper in the field of LLM-based agents.

  • πŸ”₯ [9/28/2023] We have compiled and summarized papers related to LLM-based Agents that have been accepted by Neurips 2023 in the repository LLM-Agent-Paper-Digest. This repository will continue to be updated with accepted agent-related papers in the future.

  • πŸ”₯ [9/8/2023] The second version of our survey has been released on arXiv.

    Updated contents
    • πŸ“š Additional References

      • We have added 31 new works until 9/1/2023 to make the survey more comprehensive and up-to-date.
    • πŸ“Š New Figures

      • Figure 3: This is a new figure illustrating the differences and similarities between various planning approaches. This helps in gaining a clearer understanding of the comparisons between different planning methods. single-path and multi-path reasoning
      • Figure 4: This is a new figure that describes the evolutionary path of model capability acquisition from the "Machine Learning era" to the "Large Language Model era" and then to the "Agent era." Specifically, a new concept, "mechanism engineering," has been introduced, which, along with "parameter learning" and "prompt engineering," forms part of this evolutionary path. Capabilities Acquisition
    • πŸ” Optimized Classification System

      • We have slightly modified the classification system in our survey to make it more logical and organized.
  • πŸ”₯ [8/23/2023] The first version of our survey has been released on arXiv.

Table of Contents

πŸ€– Construction of LLM-based Autonomous Agent

Architecture Design

| Model | Profile | Memory (Operation) | Memory (Structure) | Planning | Action | CA (Capability Acquisition) | Paper | Code |
|---|---|---|---|---|---|---|---|---|
| WebGPT | - | - | - | - | w/ tools | w/ fine-tuning | Paper | - |
| SayCan | - | - | - | w/o feedback | w/o tools | w/o fine-tuning | Paper | Code |
| MRKL | - | - | - | w/o feedback | w/ tools | - | Paper | - |
| Inner Monologue | - | - | - | w/ feedback | w/o tools | w/o fine-tuning | Paper | Code |
| Social Simulacra | GPT-Generated | - | - | - | w/o tools | - | Paper | - |
| ReAct | - | - | - | w/ feedback | w/ tools | w/ fine-tuning | Paper | Code |
| LLM Planner | - | - | - | w/ feedback | w/o tools | Environment feedback | Paper | Code |
| MALLM | - | Read/Write | Hybrid | - | w/o tools | - | Paper | - |
| aiflows | - | Read/Write/Reflection | Hybrid | w/ feedback | w/ tools | - | Paper | Code |
| DEPS | - | - | - | w/ feedback | w/o tools | w/o fine-tuning | Paper | Code |
| Toolformer | - | - | - | w/o feedback | w/ tools | w/ fine-tuning | Paper | Code |
| Reflexion | - | Read/Write/Reflection | Hybrid | w/ feedback | w/o tools | w/o fine-tuning | Paper | Code |
| CAMEL | Handcrafting & GPT-Generated | - | - | w/ feedback | w/o tools | - | Paper | Code |
| API-Bank | - | - | - | w/ feedback | w/ tools | w/o fine-tuning | Paper | - |
| Chameleon | - | - | - | w/o feedback | w/ tools | - | Paper | Code |
| ViperGPT | - | - | - | - | w/ tools | - | Paper | Code |
| HuggingGPT | - | - | Unified | w/o feedback | w/ tools | - | Paper | Code |
| Generative Agents | Handcrafting | Read/Write/Reflection | Hybrid | w/ feedback | w/o tools | - | Paper | Code |
| LLM+P | - | - | - | w/o feedback | w/o tools | - | Paper | - |
| ChemCrow | - | - | - | w/ feedback | w/ tools | - | Paper | Code |
| OpenAGI | - | - | - | w/ feedback | w/ tools | w/ fine-tuning | Paper | Code |
| AutoGPT | - | Read/Write | Hybrid | w/ feedback | w/ tools | w/o fine-tuning | - | Code |
| SCM | - | Read/Write | Hybrid | - | w/o tools | - | Paper | Code |
| Socially Alignment | - | Read/Write | Hybrid | - | w/o tools | Example | Paper | Code |
| GITM | - | Read/Write/Reflection | Hybrid | w/ feedback | w/o tools | w/ fine-tuning | Paper | Code |
| Voyager | - | Read/Write/Reflection | Hybrid | w/ feedback | w/o tools | w/o fine-tuning | Paper | Code |
| Introspective Tips | - | - | - | w/ feedback | w/o tools | w/o fine-tuning | Paper | - |
| RET-LLM | - | Read/Write | Hybrid | - | w/o tools | w/ fine-tuning | Paper | - |
| ChatDB | - | Read/Write | Hybrid | w/ feedback | w/ tools | - | Paper | - |
| S3 | Dataset alignment | Read/Write/Reflection | Hybrid | - | w/o tools | w/ fine-tuning | Paper | - |
| ChatDev | Handcrafting | Read/Write/Reflection | Hybrid | w/ feedback | w/o tools | w/o fine-tuning | Paper | Code |
| ToolLLM | - | - | - | w/ feedback | w/ tools | w/ fine-tuning | Paper | Code |
| MemoryBank | - | Read/Write/Reflection | Hybrid | - | w/o tools | - | Paper | Code |
| MetaGPT | Handcrafting | Read/Write/Reflection | Hybrid | w/ feedback | w/ tools | - | Paper | Code |
| L2MAC | Handcrafting | Read/Write/Reflection | Hybrid | w/ feedback | w/ tools | - | Paper | Code |
| LEO | - | - | - | w/ feedback | w/o tools | w/ fine-tuning | Paper | Code |
| JARVIS-1 | - | Read/Write/Reflection | Hybrid | w/ feedback | w/ tools | w/o fine-tuning | Paper | Code |
| CLOVA | - | Read/Write/Reflection | Hybrid | w/ feedback | w/ tools | w/ fine-tuning | Paper | Code |
| LearnAct | - | - | - | w/ feedback | w/ tools | w/ fine-tuning | Paper | Code |
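To make the column values above easier to parse programmatically, the following hypothetical Python schema encodes the table's classification dimensions, with the Voyager row as an example; it is an illustrative data model for this README, not code from the survey or from any listed system.

```python
# Hypothetical schema for the classification dimensions used in the table
# above (Profile, Memory Operation/Structure, Planning, Action, CA).
# The example instance encodes the Voyager row as classified there.

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MemoryOperation(Enum):
    READ_WRITE = "Read/Write"
    READ_WRITE_REFLECTION = "Read/Write/Reflection"


class MemoryStructure(Enum):
    UNIFIED = "Unified"
    HYBRID = "Hybrid"


@dataclass
class AgentTaxonomy:
    model: str
    profile: Optional[str]                  # e.g. "Handcrafting", "GPT-Generated"
    memory_operation: Optional[MemoryOperation]
    memory_structure: Optional[MemoryStructure]
    planning_feedback: Optional[bool]       # True = w/ feedback, False = w/o feedback
    uses_tools: bool                        # action module: w/ tools vs. w/o tools
    capability_acquisition: Optional[str]   # e.g. "w/ fine-tuning", "Environment feedback"


voyager = AgentTaxonomy(
    model="Voyager",
    profile=None,
    memory_operation=MemoryOperation.READ_WRITE_REFLECTION,
    memory_structure=MemoryStructure.HYBRID,
    planning_feedback=True,
    uses_tools=False,
    capability_acquisition="w/o fine-tuning",
)
```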

πŸ“ Applications of LLM-based Autonomous Agent

| Title | Social Science | Natural Science | Engineering | Paper | Code |
|---|---|---|---|---|---|
| Drori et al. | - | Science Education | - | Paper | - |
| SayCan | - | - | Robotics & Embodied AI | Paper | Code |
| Inner Monologue | - | - | Robotics & Embodied AI | Paper | Code |
| Language-Planners | - | - | Robotics & Embodied AI | Paper | Code |
| Social Simulacra | Social Simulation | - | - | Paper | - |
| TE | Psychology | - | - | Paper | Code |
| Out of One, Many | Political Science and Economy | - | - | Paper | - |
| LIBRO | - | - | CS&SE | Paper | - |
| Blind Judgement | Jurisprudence | - | - | Paper | - |
| Horton | Political Science and Economy | - | - | Paper | - |
| DECKARD | - | - | Robotics & Embodied AI | Paper | Code |
| Planner-Actor-Reporter | - | - | Robotics & Embodied AI | Paper | - |
| DEPS | - | - | Robotics & Embodied AI | Paper | - |
| RCI | - | - | CS&SE | Paper | Code |
| Generative Agents | Social Simulation | - | - | Paper | Code |
| SCG | - | - | CS&SE | Paper | - |
| IGLU | - | - | Civil Engineering | Paper | - |
| IELLM | - | - | Industrial Automation | Paper | - |
| ChemCrow | - | Document and Data Management; Science Education | - | Paper | - |
| Boiko et al. | - | Document and Data Management; Science Education | - | Paper | - |
| GPT4IA | - | - | Industrial Automation | Paper | Code |
| Self-collaboration | - | - | CS&SE | Paper | - |
| E2WM | - | - | Robotics & Embodied AI | Paper | Code |
| Akata et al. | Psychology | - | - | Paper | - |
| Ziems et al. | Psychology; Political Science and Economy; Research Assistant | - | - | Paper | - |
| AgentVerse | Social Simulation | - | - | Paper | Code |
| SmolModels | - | - | CS&SE | - | Code |
| TidyBot | - | - | Robotics & Embodied AI | Paper | Code |
| PET | - | - | Robotics & Embodied AI | Paper | - |
| Voyager | - | - | Robotics & Embodied AI | Paper | Code |
| GITM | - | - | Robotics & Embodied AI | Paper | Code |
| NLSOM | - | Science Education | - | Paper | - |
| LLM4RL | - | - | Robotics & Embodied AI | Paper | - |
| GPT Engineer | - | - | CS&SE | - | Code |
| Grossman et al. | - | Experiment Assistant; Science Education | - | Paper | - |
| SQL-PALM | - | - | CS&SE | Paper | - |
| REMEMBER | - | - | Robotics & Embodied AI | Paper | - |
| DemoGPT | - | - | CS&SE | - | Code |
| Chatlaw | Jurisprudence | - | - | Paper | Code |
| RestGPT | - | - | CS&SE | Paper | Code |
| Dialogue Shaping | - | - | Robotics & Embodied AI | Paper | - |
| TaPA | - | - | Robotics & Embodied AI | Paper | - |
| Ma et al. | Psychology | - | - | Paper | - |
| Math Agents | - | Science Education | - | Paper | - |
| SocialAI School | Social Simulation | - | - | Paper | - |
| Unified Agent | - | - | Robotics & Embodied AI | Paper | - |
| Williams et al. | Social Simulation | - | - | Paper | - |
| Li et al. | Social Simulation | - | - | Paper | - |
| S3 | Social Simulation | - | - | Paper | - |
| RoCo | - | - | Robotics & Embodied AI | Paper | Code |
| SayPlan | - | - | Robotics & Embodied AI | Paper | Code |
| aiflows | - | - | CS&SE | Paper | Code |
| ToolLLM | - | - | CS&SE | Paper | Code |
| ChatDev | - | - | CS&SE | Paper | - |
| Chao et al. | Social Simulation | - | - | Paper | - |
| AgentSims | Social Simulation | - | - | Paper | Code |
| ChatMOF | - | Document and Data Management; Science Education | - | Paper | - |
| MetaGPT | - | - | CS&SE | Paper | Code |
| L2MAC | - | - | CS&SE | Paper | Code |
| CodeHelp | - | Science Education | CS&SE | Paper | - |
| AutoGen | - | Science Education | - | Paper | - |
| RAH | - | - | CS&SE | Paper | - |
| DB-GPT | - | - | CS&SE | Paper | Code |
| RecMind | - | - | CS&SE | Paper | - |
| ChatEDA | - | - | CS&SE | Paper | - |
| InteRecAgent | - | - | CS&SE | Paper | - |
| PentestGPT | - | - | CS&SE | Paper | - |
| CodeHelp | - | - | CS&SE | Paper | - |
| ProAgent | - | - | Robotics & Embodied AI | Paper | - |
| MindAgent | - | - | Robotics & Embodied AI | Paper | - |
| LEO | - | - | Robotics & Embodied AI | Paper | - |
| JARVIS-1 | - | - | Robotics & Embodied AI | Paper | - |
| CLOVA | - | - | CS&SE | Paper | - |
| AgentTrust | Social Simulation | - | - | Paper | Code |

πŸ“Š Evaluation on LLM-based Autonomous Agent

| Model | Subjective | Objective | Benchmark | Paper | Code |
|---|---|---|---|---|---|
| WebShop | - | Environment Simulation; Multi-task Evaluation | βœ“ | Paper | Code |
| Social Simulacra | Human Annotation | Social Evaluation | - | Paper | - |
| TE | - | Social Evaluation | - | Paper | Code |
| LIBRO | - | Software Testing | - | Paper | - |
| ReAct | - | Environment Simulation | βœ“ | Paper | Code |
| Out of One, Many | Turing Test | Social Evaluation; Multi-task Evaluation | - | Paper | - |
| DEPS | - | Environment Simulation | βœ“ | Paper | - |
| Jalil et al. | - | Software Testing | - | Paper | Code |
| Reflexion | - | Environment Simulation; Multi-task Evaluation | - | Paper | Code |
| IGLU | - | Environment Simulation | βœ“ | Paper | - |
| Generative Agents | Human Annotation; Turing Test | - | - | Paper | Code |
| ToolBench | Human Annotation | Multi-task Evaluation | βœ“ | Paper | Code |
| GITM | - | Environment Simulation | βœ“ | Paper | Code |
| Two-Failures | - | Multi-task Evaluation | - | Paper | - |
| Voyager | - | Environment Simulation | βœ“ | Paper | Code |
| SocKET | - | Social Evaluation; Multi-task Evaluation | βœ“ | Paper | - |
| Mobile-Env | - | Environment Simulation; Multi-task Evaluation | βœ“ | Paper | Code |
| Clembench | - | Environment Simulation; Multi-task Evaluation | βœ“ | Paper | Code |
| Mind2Web | - | Environment Simulation; Multi-task Evaluation | βœ“ | Paper | Code |
| Dialop | - | Social Evaluation | βœ“ | Paper | Code |
| Feldt et al. | - | Software Testing | - | Paper | - |
| CO-LLM | Human Annotation | Environment Simulation | - | Paper | Code |
| Tachikuma | Human Annotation | Environment Simulation | βœ“ | Paper | - |
| WebArena | - | Environment Simulation | βœ“ | Paper | Code |
| RocoBench | - | Environment Simulation; Social Evaluation; Multi-task Evaluation | βœ“ | Paper | Code |
| AgentSims | - | Social Evaluation | - | Paper | Code |
| AgentBench | - | Multi-task Evaluation | βœ“ | Paper | Code |
| BOLAA | - | Environment Simulation; Multi-task Evaluation; Software Testing | βœ“ | Paper | Code |
| Gentopia | - | Isolated Reasoning; Multi-task Evaluation | βœ“ | Paper | Code |
| EmotionBench | Human Annotation | - | βœ“ | Paper | Code |
| PTB | - | Software Testing | βœ“ | Paper | - |
| MintBench | - | Multi-task Evaluation | βœ“ | Paper | Code |
| MindAgent | - | Environment Simulation; Multi-task Evaluation | βœ“ | Paper | - |
| JARVIS-1 | - | Environment Simulation | - | Paper | - |
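For the objective strategies listed above (environment simulation, multi-task evaluation), a minimal, hypothetical harness might look like the sketch below: it rolls an agent out in several task environments and reports per-task success rates. The `Environment` protocol and `evaluate` signature are assumptions for illustration, not the API of any benchmark in the table.

```python
# Hypothetical objective-evaluation harness in the spirit of the
# "Environment Simulation" / "Multi-task Evaluation" entries above.
# The Environment protocol is an illustrative assumption.

from typing import Callable, Dict, Protocol, Tuple


class Environment(Protocol):
    def reset(self) -> str: ...
    def step(self, action: str) -> Tuple[str, bool, bool]: ...  # obs, done, success


def evaluate(agent: Callable[[str], str],
             tasks: Dict[str, Callable[[], Environment]],
             episodes: int = 5,
             max_steps: int = 30) -> Dict[str, float]:
    """Roll the agent out in each task environment and report success rates."""
    results: Dict[str, float] = {}
    for name, make_env in tasks.items():
        wins = 0
        for _ in range(episodes):
            env = make_env()
            obs = env.reset()
            for _ in range(max_steps):
                obs, done, success = env.step(agent(obs))
                if done:
                    wins += int(success)
                    break
        results[name] = wins / episodes
    return results
```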

🌐 More Comprehensive Summarization

We maintain an interactive table with a more comprehensive collection of papers related to LLM-based agents. It includes details such as tags, authors, and publication dates, allowing you to sort, filter, and find the papers of interest to you. Complete Table

πŸ‘¨β€πŸ‘¨β€πŸ‘§β€πŸ‘¦ Maintainers

πŸ“š Citation

If you find this survey useful, please cite our paper:

@misc{wang2023survey,
      title={A Survey on Large Language Model based Autonomous Agents}, 
      author={Lei Wang and Chen Ma and Xueyang Feng and Zeyu Zhang and Hao Yang and Jingsen Zhang and Zhiyuan Chen and Jiakai Tang and Xu Chen and Yankai Lin and Wayne Xin Zhao and Zhewei Wei and Ji-Rong Wen},
      year={2023},
      eprint={2308.11432},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

πŸ’ͺ How to Contribute

If you have a paper or are aware of relevant research that should be incorporated, please contribute via pull requests, issues, email, or other suitable methods.

🫑 Acknowledgement

We thank the following people for their valuable suggestions and contributions to this survey:

πŸ“§ Contact Us

If you have any questions or suggestions, please contact us via:


llm-agent-survey's Issues

Multi-agent or single-agent

Could you add a field indicating whether each work is a multi-agent or a single-agent system?

I'd like to share recent work "Empowering Large Language Model Agents through Action Learning"

Hello,

Thanks for your comprehensive and inspiring paper list! I'd like to share our recent work titled "Empowering Large Language Model Agents through Action Learning," which may be of interest to the paper list readers. The paper may be added to the Planning Section.

Paper: https://arxiv.org/abs/2402.15809
Code: https://github.com/zhao-ht/LearnAct
This work proposes the LearnAct framework, which employs an iterative learning approach to dynamically create and refine learnable actions (skills). By evaluating and amending actions in response to errors observed during unsuccessful training episodes, LearnAct systematically increases the efficiency and adaptability of actions undertaken by Large Language Model (LLM) agents.
Experiments conducted in the Robotic Planning and ALFWorld environments demonstrate that LearnAct can significantly enhance agent performance on the given tasks.

I hope this contributes to the great paper list!

Equation 1

Thank you for providing this comprehensive and outstanding survey.

Is it possible that "argmax" should be used instead of "argmin" in Equation 1 on page 6?

One reference on LLM Agents playing Trust Games

Congratulations on your recent solid survey paper and impressive paper list!

We have a related paper on LLM Agents playing Trust Games.

Can Large Language Model Agents Simulate Human Trust Behaviors?

  • arxiv : https://arxiv.org/abs/2402.04559
  • code : https://github.com/camel-ai/agent-trust
  • project website : https://www.camel-ai.org/research/agent-trust
  • We discover the trust behaviors of LLM agents under the framework of Trust Games, and the high behavioral alignment between LLM agents and humans regarding the trust behaviors, particularly for GPT-4, indicating the feasibility to simulate human trust behaviors with LLM agents.
  • abstract: Large Language Model (LLM) agents have been increasingly adopted as simulation tools to model humans in applications such as social science. However, one fundamental question remains: can LLM agents really simulate human behaviors? In this paper, we focus on one of the most critical behaviors in human interactions, trust, and aim to investigate whether or not LLM agents can simulate human trust behaviors. We first find that LLM agents generally exhibit trust behaviors, referred to as agent trust, under the framework of Trust Games, which are widely recognized in behavioral economics. Then, we discover that LLM agents can have high behavioral alignment with humans regarding trust behaviors, particularly for GPT-4, indicating the feasibility to simulate human trust behaviors with LLM agents. In addition, we probe into the biases in agent trust and the differences in agent trust towards agents and humans. We also explore the intrinsic properties of agent trust under conditions including advanced reasoning strategies and external manipulations. We further offer important implications of our discoveries for various scenarios where trust is paramount. Our study provides new insights into the behaviors of LLM agents and the fundamental analogy between LLMs and humans.

Introducing a new paper on role-playing LLM agents (ACL 2024 Findings)

Hi, what a fantastic resource for developing intelligent LLM agents!

I wanted to highlight a recent paper presented at ACL 2024 Findings: TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models.
This study focuses on assessing hallucinations in role-playing LLM agents when they simulate characters at specific moments in time.

We would greatly appreciate it if you could consider adding our paper to your survey.
Thanks!

Update Chart

Hi, I was wondering if you have an updated list of LLM papers? You have a really nice chart that goes until August 2023 but it would be great to have an updated version or at least a list of all LLM papers by date. Do you have this?

API-Bank URLs are dead

The URLs for the "API-Bank" paper are broken.

Currently they point to https://github.com/Paitesanshi/LLM-Agent-Survey/blob/main/url

citation version

I have a quick question regarding your citations.
I've noticed that a significant proportion of your citations (over 90%) point to the arXiv versions of papers, even though many of them have already been published. I'm curious why that is.

[165] reference

First, thank you for the contribution you have made; I am still reading the survey. By the way, no offense, but I noticed that some commas are missing in reference [165]. πŸ˜‚

Introducing our NeurIPS 2023 paper

Hi!

This list is an invaluable resource in the area of building intelligent agents with LLMs.

I wanted to take a moment to bring your attention to a recent NeurIPS-23 paper from our lab: Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning. Instead of getting plans from LLMs directly, it allows the agent to use external planners to reliably search for plans (somewhat in a similar vein to tool-augmented LLMs).

We would be grateful if you would consider including our paper in your survey. We believe it would greatly benefit readers interested in this burgeoning area of LLM-driven intelligent agents.

Best regards

Missing Related Work

Dear Authors,

Thank you for your efforts in proposing this survey paper.

We are the authors of β€œChameleon,” a framework designed to seamlessly integrate LLM agents with various external tools (https://arxiv.org/abs/2304.09842, https://github.com/lupantech/chameleon-llm). Since its release in April 2023, our work has attracted significant attention from the AI research community.

We would be honored if you could include our work in the GitHub repo (and, if possible, a discussion in the next revision of your paper). We believe that our framework complements the discussions in your work and could offer additional insights to readers.

We thank you for considering our request and look forward to your positive response.
