richardyc / chrome-gpt

An AutoGPT agent that controls Chrome on your desktop

License: GNU General Public License v3.0

Makefile 0.92% Python 98.08% Dockerfile 1.00%
ai autogpt chatgpt langchain gpt-3-5-turbo gpt-4

chrome-gpt's Introduction

🤖 Chrome-GPT: An experimental AutoGPT agent that interacts with Chrome

โš ๏ธThis is an experimental AutoGPT agent that might take incorrect actions and could lead to serious consequences. Please use it at your own discretionโš ๏ธ

Chrome-GPT is an AutoGPT experiment that uses Langchain and Selenium to let an AutoGPT agent take control of an entire Chrome session. Because it can interactively scroll, click, and input text on web pages, the agent can navigate and manipulate web content.

๐Ÿ–ฅ๏ธ Demo

Input Prompt: Find me a bar that can host a 20 person event near Chelsea, Manhattan evening of Apr 30th. Fill out contact us form if they have one with info: Name Richard, email [email protected].

DEMO.mov

Demo made by Richard He

🔮 Features

  • 🌎 Google search
  • 🧠 Long-term and short-term memory management
  • 🔨 Chrome actions: describe a webpage, scroll to element, click on buttons/links, input forms, switch tabs
  • 🤖 Supports multiple agent types: Zero-shot, BabyAGI and Auto-GPT
  • 🔥 (IN PROGRESS) Chrome plugin support

🧱 Known Limitations

  • Web crawling features are limited: buttons and input fields sometimes fail to appear in the prompt.
  • Response time is slow: each action takes between 1 and 10 seconds to run.
  • At times, langchain agents are unable to parse GPT outputs (refer to langchain discussion: langchain-ai/langchain#4065). If you run into this, try specifying a different agent, e.g. python -m chromegpt -a auto-gpt -v -t "{your request}"

Requirements

  • Chrome
  • Python >3.8
  • Install Poetry

๐Ÿ› ๏ธ Setup

  1. Set up your OpenAI API Keys and add OPENAI_API_KEY env variable
  2. Install Python requirements via poetry poetry install
  3. Open a poetry shell poetry shell
  4. Run chromegpt via python -m chromegpt
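
Step 1 is easy to miss, so a small pre-flight check can help. The sketch below is not part of Chrome-GPT; the helper name `missing_env_vars` is hypothetical, and the only thing taken from the setup steps is the `OPENAI_API_KEY` variable name.

```python
import os


def missing_env_vars(required=("OPENAI_API_KEY",), env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the real process environment; passing a dict makes
    the helper easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Set these variables before running chromegpt:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Running it inside the Poetry shell before `python -m chromegpt` shows whether the key is visible to that shell.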

You can also start in your own codespace using the "Open in GitHub Codespaces" button on the repository page.

🧠 Usage

  • GPT-3.5 Usage (Default): python -m chromegpt -v -t "{your request}"
  • GPT-4 Usage (Recommended, needs GPT-4 access): python -m chromegpt -v -a auto-gpt -m gpt-4 -t "{your request}"
  • For help: python -m chromegpt --help
Usage: python -m chromegpt [OPTIONS]

  Run ChromeGPT: An AutoGPT agent that interacts with Chrome

Options:
  -t, --task TEXT                 The task to execute  [required]
  -a, --agent [auto-gpt|baby-agi|zero-shot]
                                  The agent type to use
  -m, --model TEXT                The model to use
  --headless                      Run in headless mode
  -v, --verbose                   Run in verbose mode
  --human-in-loop                 Run in human-in-loop mode, only available
                                  when using auto-gpt agent
  --help                          Show this message and exit.
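
If you launch Chrome-GPT from another script, it can be convenient to assemble the arguments programmatically. The following is a minimal sketch that only uses the options listed in the help text above; the function name `build_chromegpt_argv` is hypothetical and not part of the project.

```python
AGENTS = ("auto-gpt", "baby-agi", "zero-shot")  # choices shown by --help


def build_chromegpt_argv(task, agent=None, model=None, verbose=False, headless=False):
    """Build an argument list for `python -m chromegpt`.

    The task is kept as ONE list element, so spaces inside it never
    split it into extra arguments.
    """
    if agent is not None and agent not in AGENTS:
        raise ValueError(f"agent must be one of {AGENTS}, got {agent!r}")
    argv = ["python", "-m", "chromegpt"]
    if verbose:
        argv.append("-v")
    if headless:
        argv.append("--headless")
    if agent is not None:
        argv += ["-a", agent]
    if model is not None:
        argv += ["-m", model]
    argv += ["-t", task]
    return argv


if __name__ == "__main__":
    # Pass the list to subprocess.run(argv) to actually start the agent.
    print(build_chromegpt_argv("visit google.com", agent="auto-gpt", model="gpt-4", verbose=True))
```

Passing a list rather than a single command string avoids shell quoting problems with multi-word tasks.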

Alternatively, update .env and run:

source .env && docker-compose up

โญ Star History

Star History Chart

chrome-gpt's People

Contributors

arthavruksha, baseinfinity, chengxuan-xia, erlichsefi, richardyc, xayaraj

chrome-gpt's Issues

langchain.schema.OutputParserException: Could not parse LLM output

I understand this is a known issue, but it happens 90% of the time. Is there any way to improve this, or do we have to wait for a GPT-4 API key to see improvements? Sorry, I'm not exactly sure what the issue is or how to improve it.

Besides that, I was EXTREMELY impressed with this. The logic it followed was fascinating, and the way it figured out which elements to select was fantastic. It wouldn't get past the first step 80% of the time, but when it did, it was great to see!

I would love to see this resolved, because it's hard to demo this tool otherwise. (I was planning to today, but it crashes too much, and I wanted to give it a fair shake before livestreaming it.)

Would a GPT-4 API key improve this specific issue?

Please update a detailed demo example.

Please add a detailed demo example 🙏🙏🙏
If possible, it's best to take it step by step:

  1. export OPENAI_API_KEY=xxx && python -m chromegpt -t 'some demo information...'
  2. other stuff..

Unfortunately, I cannot understand what I can do with it from the current demo video. 😂😂😂

Can it use the debugger tools? Can it read and parse the source? Can it help me write e2e Cypress tests?

Title says it all.. those are the things I'm currently in search for. I want to be able to tell it to load a URL, and write me a Cypress script for interacting with elements on the website. It would need to find the relevant items I've asked it to interact with, perform the action but also capture the DOM of these elements so it can then use those to write a script for use with automated testing.

It would be pretty crazy if it can do that! Can it? And if not.. why not?

Many thanks for this exciting work you're doing!

Won't show Chrome GUI

The script renders web pages in headless mode, even though I have not added that flag.

Command: python -m chromegpt -a auto-gpt -v -t "visit google.com"

The code is run in a docker container on Windows 11.

Chrome dev mode is enabled.

Error: Got unexpected extra argument ()

(chrome-gpt-py3.11) PS C:\Users\Administrator\Desktop\git\chrome-gpt> python -m chromegpt -t open youtube
Usage: python -m chromegpt [OPTIONS]
Try 'python -m chromegpt --help' for help.

Error: Got unexpected extra argument (youtube)
(chrome-gpt-py3.11) PS C:\Users\Administrator\Desktop\git\chrome-gpt>

No matter what command I run, it always says it can't run because it has too many arguments.
Any idea how to fix it?
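
The error text suggests the task was split into two words before it reached the option parser, which is what happens when a multi-word task is not quoted. The sketch below illustrates the effect with Python's `shlex`, which follows POSIX shell rules; PowerShell tokenizes with its own rules, so treat this as an illustration of the principle rather than an exact model of that shell.

```python
import shlex

# Without quotes, the shell hands "open" and "youtube" over as two
# separate arguments: -t consumes "open" and "youtube" is left over.
unquoted = shlex.split("python -m chromegpt -t open youtube")

# With quotes, the whole task arrives as a single argument for -t.
quoted = shlex.split('python -m chromegpt -t "open youtube"')

print(unquoted[3:])  # ['-t', 'open', 'youtube']
print(quoted[3:])    # ['-t', 'open youtube']
```

Under that reading, running `python -m chromegpt -t "open youtube"` with the task in quotes should avoid the extra-argument error.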

Regression?: Unable to run inside or outside of Docker

With the latest version of main, I am unable to get Docker running, most likely because I am on an M1 machine and the chromedriver cannot be found for it.

So I tried to run it manually outside of Docker, as I am used to, and was unable to:

  File "/Users/stefanayala/Library/Caches/pypoetry/virtualenvs/chrome-gpt-eEQoRrTE-py3.11/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='selenium-chrome', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x11344e910>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

Looks like it has trouble connecting to the grid

So then I went back to the last known working changes:

  • Rolled back changes to eb7cbce
  • poetry remove selenium
  • poetry add selenium

This installed the newer version of Selenium, which allowed me to run ChromeGPT with the latest version of Chrome (yay!).

Can someone else confirm whether they are able to get this project running outside of Docker with the latest version of main?

ModuleNotFoundError: No module named 'langchain.experimental'

After installing and executing all the setups in Windows 11 PowerShell command line, the following error occurred.

PS C:\Users\Jun\code\Chrome-GPT> python -m chromegpt
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\agent\autogpt\__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\agent\autogpt\autogpt.py", line 6, in <module>
    from langchain.experimental import AutoGPT
ModuleNotFoundError: No module named 'langchain.experimental'

What else needs to be configured? Thank you.
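
Whether `langchain.experimental` exists depends on which langchain release the active interpreter sees, so the first thing to establish is what that interpreter can actually import. The sketch below is a generic check with a hypothetical helper name, `has_module`. The mention of `langchain_experimental` reflects my understanding that later langchain releases moved this code into a separate package, which you should verify against your installed version.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "langchain") is itself missing.
        return False


if __name__ == "__main__":
    for candidate in ("langchain", "langchain.experimental", "langchain_experimental"):
        print(candidate, "->", "found" if has_module(candidate) else "missing")
```

If `langchain` is found but `langchain.experimental` is missing, the installed release probably differs from the one pinned in the project's lock file. Running the check inside `poetry shell` confirms which environment is active.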

NameError: name 'v_args' is not defined. Did you mean: 'vars'?

Hello,

I installed with:

git clone https://github.com/richardyc/Chrome-GPT.git
cd Chrome-GPT
poetry install
poetry shell
export OPENAI_API_KEY=<KEY>

I had to pip install click and langchain.

poetry --version 
Poetry (version 1.4.2)

But then when I run a command I get this:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/agent/autogpt/__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent  # noqa: F401
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/agent/autogpt/autogpt.py", line 6, in <module>
    from langchain.experimental import AutoGPT
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/__init__.py", line 3, in <module>
    from langchain.experimental.generative_agents.generative_agent import GenerativeAgent
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/generative_agents/__init__.py", line 2, in <module>
    from langchain.experimental.generative_agents.generative_agent import GenerativeAgent
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/generative_agents/generative_agent.py", line 9, in <module>
    from langchain.experimental.generative_agents.memory import GenerativeAgentMemory
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/generative_agents/memory.py", line 8, in <module>
    from langchain.retrievers import TimeWeightedVectorStoreRetriever
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/retrievers/__init__.py", line 9, in <module>
    from langchain.retrievers.self_query.base import SelfQueryRetriever
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/retrievers/self_query/base.py", line 8, in <module>
    from langchain.chains.query_constructor.base import load_query_constructor_chain
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/chains/query_constructor/base.py", line 14, in <module>
    from langchain.chains.query_constructor.parser import get_parser
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/chains/query_constructor/parser.py", line 50, in <module>
    @v_args(inline=True)
NameError: name 'v_args' is not defined. Did you mean: 'vars'?

Any idea?

metaclass conflict

Hey Richard, thanks for the code. When I try to run it on my Windows machine, I get the error below. Can you please let me know how I can resolve this?

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\Agent\ChromGPTV2\chromegpt\__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "D:\Agent\ChromGPTV2\chromegpt\main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "D:\Agent\ChromGPTV2\chromegpt\agent\autogpt\__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Agent\ChromGPTV2\chromegpt\agent\autogpt\autogpt.py", line 16, in <module>
    from chromegpt.agent.autogpt.prompt import AutoGPTPrompt
  File "D:\Agent\ChromGPTV2\chromegpt\agent\autogpt\prompt.py", line 16, in <module>
    class AutoGPTPrompt(BaseChatPromptTemplate, BaseModel):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

If anyone else can help me, that would be great. Thanks.
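
For readers unfamiliar with this error, Python raises it when a class inherits from bases whose metaclasses are unrelated. The sketch below reproduces the same TypeError with made-up classes (`MetaA`, `MetaB`, `A`, `B` are all hypothetical). It shows only the general mechanism and makes no claim about which of the two bases in `AutoGPTPrompt` is responsible here.

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


def combine_naively():
    """Try to inherit from both bases; return the error text if it fails."""
    try:
        class Broken(A, B):  # MetaA and MetaB are unrelated -> conflict
            pass
    except TypeError as exc:
        return str(exc)
    return None


# One generic way out: a metaclass that derives from both metaclasses.
class MetaAB(MetaA, MetaB):
    pass


class Combined(A, B, metaclass=MetaAB):
    pass


if __name__ == "__main__":
    print(combine_naively())
    print(type(Combined).__name__)
```

In a dependency-driven case like this one, the more likely remedy is to install the library versions the project was written against rather than to change the class definition.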

A few modules are not found and how to run

The click, langchain, selenium, and validators modules need to be installed before running the code if you do not have them already. Also, the poetry shell command does not work when Poetry is not placed in the home directory; this link may be helpful: https://stackoverflow.com/questions/60768676/what-is-the-default-install-path-for-poetry.

Once these issues are addressed, running "python -m chromegpt" produces an error -> Error: Missing option '--task' / '-t'. Revising the command to "python -m chromegpt --task" produces another error -> Error: Option '--task' requires an argument. To correct this, follow the Usage section -> python -m chromegpt -t "{your request}", where your request is the task described in plain language. Note that "auto-gpt", "baby-agi", and "zero-shot" are agent types for the -a option, not values for -t. One example would be python -m chromegpt -a auto-gpt -t "visit google.com".

Selenium error on Chrome version

When starting chromeGPT, I get the following error:

selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 111
Current browser version is 113.0.5672.63 with binary path C:\Program Files\Google\Chrome\Application\chrome.exe
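
The message itself carries both version numbers, so the mismatch can be read off mechanically. The sketch below assumes the error text has the same wording as the message quoted above; the function name `parse_version_mismatch` is hypothetical.

```python
import re


def parse_version_mismatch(message):
    """Extract (driver_supported_major, browser_major) from the error text.

    Returns None if the message does not match the wording assumed here.
    """
    supported = re.search(r"only supports Chrome version (\d+)", message)
    current = re.search(r"Current browser version is (\d+)", message)
    if not (supported and current):
        return None
    return int(supported.group(1)), int(current.group(1))


if __name__ == "__main__":
    msg = (
        "session not created: This version of ChromeDriver only supports "
        "Chrome version 111\nCurrent browser version is 113.0.5672.63 with "
        "binary path C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
    )
    driver_major, browser_major = parse_version_mismatch(msg)
    if driver_major != browser_major:
        print(f"ChromeDriver targets Chrome {driver_major}, browser is {browser_major}")
```

Here the driver targets Chrome 111 while the browser is 113, so the driver needs updating to match the installed browser. Another issue on this page reports that reinstalling Selenium made the project work with the latest Chrome.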

Windows errors

(chrome-gpt-py3.10) PS C:\Chrome-GPT> python -m chromegpt -v -t "test"
Traceback (most recent call last):
  File "C:\Users\nsk\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\nsk\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Chrome-GPT\chromegpt\__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "C:\Chrome-GPT\chromegpt\main.py", line 3, in <module>
    from chromegpt.agent.zeroshot import BabyAGIAgent, ZeroShotAgent
  File "C:\Chrome-GPT\chromegpt\agent\zeroshot.py", line 49
    verbose=verbose,
IndentationError: unexpected indent
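
An IndentationError is raised at compile time, so it can be reproduced without importing langchain or starting Chrome. The sketch below uses the built-in `compile` on a source string. The helper name `syntax_error_of` is hypothetical, and the sample source is invented for illustration, not copied from zeroshot.py.

```python
def syntax_error_of(source, filename="<snippet>"):
    """Compile `source` and describe the first syntax problem, if any."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"{type(exc).__name__}: {exc.msg} (line {exc.lineno})"
    return None


if __name__ == "__main__":
    broken = "x = 1\n    y = 2\n"  # second line is indented for no reason
    print(syntax_error_of(broken))
    print(syntax_error_of("x = 1\ny = 2\n"))
```

To check the real file, `python -m py_compile chromegpt/agent/zeroshot.py` does the same from the command line. A stray indent like this is often left behind by a local edit or a badly resolved merge.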

raise MaxRetryError

When I try to run Chrome-GPT and give it a task, I receive this error:

raise MaxRetryError(_pool, url, error or ResponseError(cause))

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='selenium-chrome', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1199345d0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
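
The `[Errno 8] nodename nor servname provided, or not known` part means the name `selenium-chrome` could not be resolved at all. That name looks like a Docker Compose service name, though this is an inference from the traceback, not something the README states. A quick resolution check, sketched below with a hypothetical helper `can_resolve`, tells you whether the process is running somewhere that name exists.

```python
import socket


def can_resolve(hostname):
    """Return True if `hostname` resolves from the current machine."""
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    for host in ("localhost", "selenium-chrome"):
        status = "resolves" if can_resolve(host) else "does not resolve"
        print(f"{host}: {status}")
```

If `selenium-chrome` does not resolve, the agent is probably running outside the Compose network while still configured for the remote Selenium endpoint on port 4444.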

ModuleNotFoundError: No module named 'langchain.experimental'

Any idea why I am getting the error below?

I have installed langchain module many times.

python3 -m chromegpt --help
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/git/Chrome-GPT/chromegpt/__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "/root/git/Chrome-GPT/chromegpt/main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "/root/git/Chrome-GPT/chromegpt/agent/autogpt/__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent # noqa: F401
  File "/root/git/Chrome-GPT/chromegpt/agent/autogpt/autogpt.py", line 6, in <module>
    from langchain.experimental import AutoGPT
ModuleNotFoundError: No module named 'langchain.experimental'

Unable to install

Hello,
I tried on Fedora 37 without success 😞

poetry install
Creating virtualenv chrome-gpt-znLkX84X-py3.11 in XXX

  RuntimeError

  The lock file is not compatible with the current version of Poetry.
  Upgrade Poetry to be able to read the lock file or, alternatively, regenerate the lock file with the `poetry lock` command.

  at /usr/lib/python3.11/site-packages/poetry/packages/locker.py:481 in _get_lock_data
      477│                 "Upgrade Poetry to ensure the lock file is read properly or, alternatively, "
      478│                 "regenerate the lock file with the `poetry lock` command."
      479│             )
      480│         elif not lock_version_allowed:
    → 481│             raise RuntimeError(
      482│                 "The lock file is not compatible with the current version of Poetry.\n"
      483│                 "Upgrade Poetry to be able to read the lock file or, alternatively, "
      484│                 "regenerate the lock file with the `poetry lock` command."
      485│             )

I tried to lock, same error.

Poetry version 1.1.14

How extract data

How can I extract data from a website?

I tried python -m chromegpt -v -t "on https://www.specialized.com/fr/fr/rockhopper-elite-27-5/p/199582\?color\=319847-199582 extract bike infos"
to get all the bike information.

The prompt returned:

The scroll function did not reveal the bike information. The find_form function did not return the bike information either. However, the click function successfully navigated to a page with the bike information. I will now manually extract the bike information.

Final Answer: The bike information for the Specialized Rockhopper Elite 27.5 can be found on the following page: https://www.specialized.com/fr/fr/rockhopper-elite-27-5/p/199582?color=319847-199582.

> Finished chain.