doytsujin / chrome-gpt

This project forked from richardyc/chrome-gpt

An AutoGPT agent that controls Chrome on your desktop

License: GNU General Public License v3.0

Python 99.04% Makefile 0.96%

chrome-gpt's Introduction

🤖 Chrome-GPT: An experimental AutoGPT agent that interacts with Chrome

lint test Twitter

Chrome-GPT is an AutoGPT experiment that uses LangChain and Selenium to let an AutoGPT agent take control of an entire Chrome session. Because it can scroll, click, and enter text on web pages interactively, the agent can navigate and manipulate web content.

๐Ÿ–ฅ๏ธ Demo

Input Prompt: Find me a bar that can host a 20 person event near Chelsea, Manhattan evening of Apr 30th. Fill out contact us form if they have one with info: Name Richard, email [email protected].

DEMO.mov

Demo made by Richard He

🔮 Features

  • 🌎 Google search
  • 🧠 Long-term and short-term memory management
  • 🔨 Chrome actions: describe a webpage, scroll to element, click on buttons/links, input forms, switch tabs
  • 🤖 Supports multiple agent types: Zero-shot, BabyAGI and Auto-GPT
  • 🔥 (IN PROGRESS) Chrome plugin support

🧱 Known Limitations

  • Web crawling is limited: buttons and input fields sometimes fail to appear in the prompt.
  • Response time is slow: each action takes 1-10 seconds to run.
  • LangChain agents are sometimes unable to parse GPT outputs (see the LangChain discussion: langchain-ai/langchain#4065).

Requirements

  • Chrome
  • Python >3.8
  • Poetry

๐Ÿ› ๏ธ Setup

  1. Set up your OpenAI API Keys and add OPENAI_API_KEY env variable
  2. Install Python requirements via poetry poetry install
  3. Open a poetry shell poetry shell
  4. Run chromegpt via python -m chromegpt
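
Before running the agent, it can help to confirm that step 1 actually took effect in the current shell. The following is a minimal sketch, not part of Chrome-GPT itself: the helper name `missing_settings` is hypothetical, and it assumes that OPENAI_API_KEY is the only variable needed, since that is the only one the setup steps name.

```python
import os


def missing_settings(env=None):
    """Return the names of required environment variables that are unset or blank.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    # Assumption: OPENAI_API_KEY is the only variable the setup steps mention.
    required = ["OPENAI_API_KEY"]
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_settings()` with no arguments inspects the real environment; an empty list means the key is visible to Python and the remaining setup steps can proceed.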

🧠 Usage

  • GPT-3.5 Usage (Default): python -m chromegpt -v -t "{your request}"
  • GPT-4 Usage (Recommended, needs GPT-4 access): python -m chromegpt -v -a auto-gpt -m gpt-4 -t "{your request}"
  • For help: python -m chromegpt --help

Usage: python -m chromegpt [OPTIONS]

  Run ChromeGPT: An AutoGPT agent that interacts with Chrome

Options:
  -t, --task TEXT                 The task to execute  [required]
  -a, --agent [auto-gpt|baby-agi|zero-shot]
                                  The agent type to use
  -m, --model TEXT                The model to use
  --headless                      Run in headless mode
  -v, --verbose                   Run in verbose mode
  --human-in-loop                 Run in human-in-loop mode, only available
                                  when using auto-gpt agent
  --help                          Show this message and exit.
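
To drive the CLI from another script, the options above can be assembled into an argument list. This is a rough sketch under stated assumptions: `build_command` and `AGENT_TYPES` are hypothetical names, only flags shown in the help text are used, and because the help text does not say which agent is the default, the sketch conservatively requires `agent="auto-gpt"` to be passed explicitly whenever `--human-in-loop` is requested.

```python
import sys
from typing import List, Optional

# Agent choices exactly as listed in the --help output.
AGENT_TYPES = ("auto-gpt", "baby-agi", "zero-shot")


def build_command(
    task: str,
    agent: Optional[str] = None,
    model: Optional[str] = None,
    headless: bool = False,
    verbose: bool = False,
    human_in_loop: bool = False,
) -> List[str]:
    """Build the argv list for `python -m chromegpt` from the documented options."""
    if not task:
        raise ValueError("--task is required")
    if agent is not None and agent not in AGENT_TYPES:
        raise ValueError(f"agent must be one of {AGENT_TYPES}")
    # Assumption: the default agent is not documented, so require it explicitly.
    if human_in_loop and agent != "auto-gpt":
        raise ValueError("--human-in-loop is only available with the auto-gpt agent")

    cmd = [sys.executable, "-m", "chromegpt", "--task", task]
    if agent is not None:
        cmd += ["--agent", agent]
    if model is not None:
        cmd += ["--model", model]
    if headless:
        cmd.append("--headless")
    if verbose:
        cmd.append("--verbose")
    if human_in_loop:
        cmd.append("--human-in-loop")
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from inside the Poetry environment. Building a list rather than a single shell string avoids quoting problems when the task text contains spaces or quotes.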

chrome-gpt's People

Contributors

richardyc, xayaraj
