richardyc / chrome-gpt

An AutoGPT agent that controls Chrome on your desktop

License: GNU General Public License v3.0

Makefile 0.92% Python 98.08% Dockerfile 1.00%

ai autogpt chatgpt langchain gpt-3-5-turbo gpt-4

chrome-gpt's Introduction

🤖 Chrome-GPT: An experimental AutoGPT agent that interacts with Chrome

⚠️This is an experimental AutoGPT agent that might take incorrect actions and could lead to serious consequences. Please use it at your own discretion⚠️

Chrome-GPT is an AutoGPT experiment that uses Langchain and Selenium to let an AutoGPT agent take control of an entire Chrome session. Because it can interactively scroll, click, and input text on web pages, the agent can navigate and manipulate web content.

🖥️ Demo

Input Prompt: Find me a bar that can host a 20 person event near Chelsea, Manhattan evening of Apr 30th. Fill out contact us form if they have one with info: Name Richard, email [email protected].

DEMO.mov

Demo made by Richard He

🔮 Features

🌎 Google search
🧠 Long-term and short-term memory management
🔨 Chrome actions: describe a webpage, scroll to element, click on buttons/links, input forms, switch tabs
🤖 Supports multiple agent types: Zero-shot, BabyAGI and Auto-GPT
🔥 (IN PROGRESS) Chrome plugin support

🧱 Known Limitations

There are limited web crawling features, with buttons and input fields sometimes failing to appear in prompt.
The response time is slow, with each action taking between 1 and 10 seconds to run.
At times, langchain agents are unable to parse GPT outputs (refer to langchain discussion: langchain-ai/langchain#4065). If you run into this, try specifying a different agent, e.g.: python -m chromegpt -a auto-gpt -v -t "{your request}"
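
The "try a different agent" workaround can be scripted. The following is a minimal sketch and not part of Chrome-GPT: `run_with_fallback` and its `runner` parameter are hypothetical names, the order in which agents are tried is an assumption, and it assumes that a parse failure surfaces as a non-zero exit code. It uses only the `-a`, `-v` and `-t` flags documented in this README.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence

# Agent names come from the CLI help; the order tried here is an assumption.
AGENTS = ("zero-shot", "auto-gpt", "baby-agi")


def run_with_fallback(
    task: str,
    agents: Sequence[str] = AGENTS,
    runner: Callable = subprocess.run,
) -> Optional[str]:
    """Try each agent type in turn; return the first one that exits with code 0."""
    for agent in agents:
        cmd = [sys.executable, "-m", "chromegpt", "-a", agent, "-v", "-t", task]
        result = runner(cmd)
        if result.returncode == 0:
            return agent
    return None  # every agent type failed
```

Passing the task as a single list element avoids shell quoting problems, since no shell is involved.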

Requirements

Chrome
Python >3.8
Install Poetry

🛠️ Setup

Set up your OpenAI API key and add it as the OPENAI_API_KEY environment variable
Install Python requirements via Poetry: poetry install
Open a Poetry shell: poetry shell
Run chromegpt: python -m chromegpt

You can start in your own codespace here:

🧠 Usage

GPT-3.5 Usage (Default): python -m chromegpt -v -t "{your request}"
GPT-4 Usage (Recommended, needs GPT-4 access): python -m chromegpt -v -a auto-gpt -m gpt-4 -t "{your request}"
For help: python -m chromegpt --help

Usage: python -m chromegpt [OPTIONS]

  Run ChromeGPT: An AutoGPT agent that interacts with Chrome

Options:
  -t, --task TEXT                 The task to execute  [required]
  -a, --agent [auto-gpt|baby-agi|zero-shot]
                                  The agent type to use
  -m, --model TEXT                The model to use
  --headless                      Run in headless mode
  -v, --verbose                   Run in verbose mode
  --human-in-loop                 Run in human-in-loop mode, only available
                                  when using auto-gpt agent
  --help                          Show this message and exit.

Or just update .env and run:

source .env && docker-compose up
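
When launching Chrome-GPT from another script, the options listed in the help output can be assembled programmatically. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, not something the project provides, and it only encodes what the help text says (the three agent types, and that `--human-in-loop` requires the auto-gpt agent). The default agent is not stated in the help output, so the helper insists on an explicit `auto-gpt` before allowing `--human-in-loop`.

```python
import shlex
import sys
from typing import List, Optional

AGENT_TYPES = ("auto-gpt", "baby-agi", "zero-shot")


def build_command(
    task: str,
    agent: Optional[str] = None,
    model: Optional[str] = None,
    headless: bool = False,
    verbose: bool = False,
    human_in_loop: bool = False,
) -> List[str]:
    """Assemble an argv list for `python -m chromegpt` from the documented options."""
    if not task:
        raise ValueError("a task is required (-t/--task)")
    if agent is not None and agent not in AGENT_TYPES:
        raise ValueError(f"agent must be one of {AGENT_TYPES}")
    if human_in_loop and agent != "auto-gpt":
        raise ValueError("--human-in-loop is only available with the auto-gpt agent")

    cmd = [sys.executable, "-m", "chromegpt", "-t", task]
    if agent:
        cmd += ["-a", agent]
    if model:
        cmd += ["-m", model]
    if headless:
        cmd.append("--headless")
    if verbose:
        cmd.append("-v")
    if human_in_loop:
        cmd.append("--human-in-loop")
    return cmd


if __name__ == "__main__":
    argv = build_command("open youtube", agent="auto-gpt", model="gpt-4", verbose=True)
    print(shlex.join(argv))  # shell-safe rendering of the command
```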


chrome-gpt's Issues

langchain.schema.OutputParserException: Could not parse LLM output

I get this is a known issue but it happens 90% of the time, is there any way this can get improved or do we have to wait for 4.0 API key to see improvements? Sorry not exactly sure what the issue is and how to improve it.

Besides that, I was EXTREMELY impressed with this....the logic it took was fascinating and figuring out what elements to select was fantastic, but it wouldn't get past the first step 80% of the time but when it did it was great to see!

I would love to see this resolved because it's hard to demo this tool otherwise (I was planning to today but it crashes too much and I wanted to give it a fair shake before livestreaming it)

Would 4.0 API key improve this specific issue?

Please update a detailed demo example.

Please update a detailed demo example🙏🙏🙏
If possible, it's best to take it step by step

export OPENAI_API_KEY=xxx && python -m chromegpt -t 'some demo information...'
other stuff..

Unfortunately, I cannot understand what I can do with it from the current demo video. 😂😂😂

Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter.

I've exported the API key and still am having issues getting it to read:

I also searched the REPO for use case of both OPENAI_API_KEY and openai_api_key and found nothing so I'm a bit confused
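
One way to narrow this down is to check whether the variable is visible to the Python process at all. The error message itself says the key is looked up under the environment variable `OPENAI_API_KEY`; the lookup presumably happens in an underlying library rather than in Chrome-GPT's own code, which would explain why searching the repo finds nothing (this is an assumption, not something confirmed here). `find_openai_key` below is a hypothetical helper for diagnosis only.

```python
import os
from typing import Mapping, Optional


def find_openai_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key if OPENAI_API_KEY is set and non-empty, else None."""
    env = os.environ if env is None else env
    value = env.get("OPENAI_API_KEY", "").strip()
    return value or None


if __name__ == "__main__":
    if find_openai_key() is None:
        print("OPENAI_API_KEY is not visible to this Python process")
    else:
        print("OPENAI_API_KEY is set")
```

Run it from the same shell (and the same Poetry shell) used to start chromegpt; a variable exported in a different terminal session is not inherited.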

Can it use the debugger tools? Can it read and parse the source? Can it help me write e2e Cypress tests?

Title says it all.. those are the things I'm currently in search for. I want to be able to tell it to load a URL, and write me a Cypress script for interacting with elements on the website. It would need to find the relevant items I've asked it to interact with, perform the action but also capture the DOM of these elements so it can then use those to write a script for use with automated testing.

It would be pretty crazy if it can do that! Can it? And if not.. why not?

Many thanks for this exciting work you're doing!

Won't show Chrome GUI

The script renders web pages in headless mode, even though I have not added that flag.

Command: python -m chromegpt -a auto-gpt -v -t "visit google.com"

The code is run in a docker container on Windows 11.

Chrome dev mode is enabled.

Error: Got unexpected extra argument ()

(chrome-gpt-py3.11) PS C:\Users\Administrator\Desktop\git\chrome-gpt> python -m chromegpt -t open youtube
Usage: python -m chromegpt [OPTIONS]
Try 'python -m chromegpt --help' for help.

Error: Got unexpected extra argument (youtube)
(chrome-gpt-py3.11) PS C:\Users\Administrator\Desktop\git\chrome-gpt>

No matter what command I run, it always says it can't run because it has too many arguments.
Any idea how to fix it?
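
The error is consistent with the task not being quoted: `-t` takes a single value, so `open` becomes the task and `youtube` is left over as an extra argument. The sketch below uses Python's `shlex`, which models POSIX shell splitting; PowerShell differs in its details, but the principle of quoting a multi-word task is the same.

```python
import shlex

# Without quotes the task is split into two separate arguments.
unquoted = shlex.split("python -m chromegpt -t open youtube")
# With quotes the whole task reaches -t as one argument.
quoted = shlex.split('python -m chromegpt -t "open youtube"')

print(unquoted[-2:])  # ['open', 'youtube']
print(quoted[-1:])    # ['open youtube']
```

So the command from the report would become python -m chromegpt -t "open youtube".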

I would like to perform special webdriver actions on shutdown

Where in the source could I tweak to perform special Webdriver actions whenever the agent is done? I would like to perform a browser action always before closing the browser, thanks!
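
This does not say where in the Chrome-GPT source such a hook belongs, but the general pattern can be sketched with a context manager. `driver_session` and `before_quit` are hypothetical names; the only assumption about the driver is that it has a `quit()` method, as Selenium WebDriver objects do.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

D = TypeVar("D")


@contextmanager
def driver_session(driver: D, before_quit: Callable[[D], None]) -> Iterator[D]:
    """Yield the driver, then always run before_quit(driver) followed by driver.quit()."""
    try:
        yield driver
    finally:
        try:
            before_quit(driver)  # the custom browser action
        finally:
            driver.quit()  # runs even if the custom action raises
```

The nested `finally` keeps the browser from being left open when the custom action itself fails.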

Regression?: Unable to run inside or outside of Docker

So with latest version of main, I am unable to get Docker running on my setup, most likely because I am on a M1 setup and the chromedriver fails to be found for my setup.

So then I figured I'd try to run it manually outside of Docker like I am used to and was unable to:

  File "/Users/stefanayala/Library/Caches/pypoetry/virtualenvs/chrome-gpt-eEQoRrTE-py3.11/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='selenium-chrome', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x11344e910>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

Looks like it has trouble connecting to the grid

So then I wanted to go back to the last known working changes:

Rolled back changes to eb7cbce
poetry remove selenium
poetry add selenium

This installed the newer version of Selenium which allowed me to run ChromeGPT with latest version of Chrome (yay!).

Maybe someone else can validate that they are able to get this project running outside of Docker with latest version of main?

Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)

It seems like the .env file I've copy-pasted from AutoGpt is not working.

Questions :
1 - Where is the template for the .env file ?
2 - How to pass the openai_api_key while running ChromeGPT ?

Thanks in advance.
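
The README only confirms that `OPENAI_API_KEY` is required; the repository's actual .env template is not shown on this page, so the sample content in the usage note below is an assumption. As a rough sketch, a file of `KEY=VALUE` lines can be inspected like this (`parse_env` is a hypothetical helper, not part of Chrome-GPT):

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks, comments and an `export ` prefix."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        values[key.strip()] = value.strip().strip("'\"")
    return values
```

For example, `"OPENAI_API_KEY" in parse_env(open(".env").read())` tells you whether the copied file defines the variable that the error message asks for.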


ModuleNotFoundError: No module named 'langchain.experimental'

After installing and executing all the setups in Windows 11 PowerShell command line, the following error occurred.

PS C:\Users\Jun\code\Chrome-GPT> python -m chromegpt
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\agent\autogpt\__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jun\code\Chrome-GPT\chromegpt\agent\autogpt\autogpt.py", line 6, in <module>
    from langchain.experimental import AutoGPT
ModuleNotFoundError: No module named 'langchain.experimental'

What else needs to be configured? Thank you.
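
One plausible cause (an assumption, not confirmed in this thread) is that the langchain version installed in the active environment does not contain the `experimental` subpackage that the project imports. A quick check, using only the standard library, of what the current interpreter can actually see:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "langchain") is itself missing.
        return False


if __name__ == "__main__":
    for name in ("langchain", "langchain.experimental"):
        print(name, "found" if has_module(name) else "missing")
```

If `langchain` is found but `langchain.experimental` is missing, the environment is probably not the one created by `poetry install`.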

NameError: name 'v_args' is not defined. Did you mean: 'vars'?

Hello,

I installed with:

git clone https://github.com/richardyc/Chrome-GPT.git
cd Chrome-GPT
poetry install
poetry shell
export OPENAI_API_KEY=<KEY>

I had to pip install click and langchain.

poetry --version 
Poetry (version 1.4.2)

But then when I run a command I get this:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/agent/autogpt/__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent  # noqa: F401
  File "/home/.../Chrome-GPT/Chrome-GPT/chromegpt/agent/autogpt/autogpt.py", line 6, in <module>
    from langchain.experimental import AutoGPT
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/__init__.py", line 3, in <module>
    from langchain.experimental.generative_agents.generative_agent import GenerativeAgent
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/generative_agents/__init__.py", line 2, in <module>
    from langchain.experimental.generative_agents.generative_agent import GenerativeAgent
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/generative_agents/generative_agent.py", line 9, in <module>
    from langchain.experimental.generative_agents.memory import GenerativeAgentMemory
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/experimental/generative_agents/memory.py", line 8, in <module>
    from langchain.retrievers import TimeWeightedVectorStoreRetriever
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/retrievers/__init__.py", line 9, in <module>
    from langchain.retrievers.self_query.base import SelfQueryRetriever
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/retrievers/self_query/base.py", line 8, in <module>
    from langchain.chains.query_constructor.base import load_query_constructor_chain
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/chains/query_constructor/base.py", line 14, in <module>
    from langchain.chains.query_constructor.parser import get_parser
  File "/home/.../.cache/pypoetry/virtualenvs/chrome-gpt-WWC1HBYi-py3.10/lib/python3.10/site-packages/langchain/chains/query_constructor/parser.py", line 50, in <module>
    @v_args(inline=True)
NameError: name 'v_args' is not defined. Did you mean: 'vars'?

Any idea?

'SeleniumWrapper' object has no attribute 'driver'

New here, so maybe this is an easy fix. Thanks for the help!

metaclass conflict

Hey Richard, Thanks for the code, but when I try to run it on my Windows Machine then I got the below error. Can you please let me know how can I resolve this.

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\Agent\ChromGPTV2\chromegpt\__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "D:\Agent\ChromGPTV2\chromegpt\main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "D:\Agent\ChromGPTV2\chromegpt\agent\autogpt\__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Agent\ChromGPTV2\chromegpt\agent\autogpt\autogpt.py", line 16, in <module>
    from chromegpt.agent.autogpt.prompt import AutoGPTPrompt
  File "D:\Agent\ChromGPTV2\chromegpt\agent\autogpt\prompt.py", line 16, in <module>
    class AutoGPTPrompt(BaseChatPromptTemplate, BaseModel):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

If any other folk can help me, then it will be great. Thanks
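
For readers unfamiliar with this error: Python raises it when a class inherits from two bases whose metaclasses are unrelated. The traceback shows the two bases involved (`BaseChatPromptTemplate` and `BaseModel`) but not why their metaclasses differ in this environment; mismatched library versions are one possibility. The sketch below reproduces the error with hypothetical stand-in classes and shows the generic remedy of a combined metaclass.

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


# Inheriting from both bases fails: neither metaclass derives from the other.
try:
    class Broken(A, B):
        pass
except TypeError as exc:
    conflict_message = str(exc)


# A metaclass deriving from both resolves the conflict.
class MetaAB(MetaA, MetaB):
    pass


class Fixed(A, B, metaclass=MetaAB):
    pass
```

In a real project the more common fix is to install library versions whose base classes are compatible, rather than to patch the class definition.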

A few modules are not found and how to run

The click, langchain, selenium, and validators modules need to be installed before running the code if you do not have them already. Also, the poetry shell command does not work because Poetry is not placed in the home directory; this link may be helpful: https://stackoverflow.com/questions/60768676/what-is-the-default-install-path-for-poetry.

After these issues are addressed, running "python -m chromegpt" produces an error: Error: Missing option '--task' / '-t'. Revising the command to "python -m chromegpt --task" produces another: Error: Option '--task' requires an argument. The fix is to follow the Usage section: python -m chromegpt -t "{your request}", where your request is the task description in quotes. Note that "auto-gpt", "baby-agi", and "zero-shot" are agent types and belong to the -a option, not -t. One example would be python -m chromegpt -a auto-gpt -t "{your request}".

Selenium error on Chrome version

When starting chromeGPT, I get the following error:

selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 111
Current browser version is 113.0.5672.63 with binary path C:\Program Files\Google\Chrome\Application\chrome.exe
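
The message reports two major versions that have to match: the one ChromeDriver supports and the one the installed browser has. A small sketch for extracting both from the exception text (`parse_mismatch` is a hypothetical helper; the patterns are based only on the wording of the message above):

```python
import re
from typing import Optional, Tuple

_SUPPORTED = re.compile(r"only supports Chrome version (\d+)")
_CURRENT = re.compile(r"Current browser version is (\d+)")


def parse_mismatch(message: str) -> Optional[Tuple[int, int]]:
    """Return (driver_supported_major, browser_major), or None if not recognised."""
    supported = _SUPPORTED.search(message)
    current = _CURRENT.search(message)
    if not (supported and current):
        return None
    return int(supported.group(1)), int(current.group(1))
```

For the error above this yields driver major 111 against browser major 113, so either the driver needs updating or the browser needs to match it. Another report on this page got past the problem by reinstalling a newer Selenium (poetry remove selenium, then poetry add selenium).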

Is it possible to load extensions?

Curious what it would take to load a Chrome Extension

Non-continuous mode (asks for review/change intent)

Can you port over the semi/non-continuous mode features from the auto-gpt repo to this, so we can change the model's task/directives/direction every N responses?

How to login into web pages?

I want to automate a download process from https://sellercentral.amazon.de/payments/allstatements/index.html . This page, as a lot of other pages, needs a login. When I login manually the process still crashes.

I've seen that Selenium is used but I didn't find any good way to add logins. Does anyone have an idea on how to solve the issue?
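
One approach sometimes used with Selenium is to start Chrome with an existing user profile so that a session created by a manual login is reused. Whether this works for a given site, and where Chrome-GPT builds its Chrome options, is not established here. The sketch below only builds the Chrome command-line switches (`--user-data-dir` and `--profile-directory`); `profile_arguments` is a hypothetical helper, and each returned string would be passed to Selenium's `ChromeOptions.add_argument(...)`.

```python
import os
from typing import List


def profile_arguments(user_data_dir: str, profile: str = "Default") -> List[str]:
    """Build Chrome switches that point the browser at an existing profile."""
    return [
        f"--user-data-dir={os.path.expanduser(user_data_dir)}",
        f"--profile-directory={profile}",
    ]


if __name__ == "__main__":
    # The directory below is a placeholder, not a path used by Chrome-GPT.
    for arg in profile_arguments("/tmp/chrome-profile"):
        print(arg)
```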

Where to enter API key?

Where exactly to enter the api key, I could not find information anywhere.

Windows errors

(chrome-gpt-py3.10) PS C:\Chrome-GPT> python -m chromegpt -v -t "test"
Traceback (most recent call last):
  File "C:\Users\nsk\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\nsk\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Chrome-GPT\chromegpt\__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "C:\Chrome-GPT\chromegpt\main.py", line 3, in <module>
    from chromegpt.agent.zeroshot import BabyAGIAgent, ZeroShotAgent
  File "C:\Chrome-GPT\chromegpt\agent\zeroshot.py", line 49
    verbose=verbose,
IndentationError: unexpected indent

raise MaxRetryError

When I try to run and give a task to Chrome-GPT, I received this error

raise MaxRetryError(_pool, url, error or ResponseError(cause))

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='selenium-chrome', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1199345d0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
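
The underlying failure is a name lookup: the host `selenium-chrome` cannot be resolved. That name looks like a Docker Compose service name (an assumption based on the docker-compose instructions in the README), which would only resolve inside the Compose network and not on the host machine. A standard-library sketch for checking resolution; `can_resolve` and its `resolver` parameter are hypothetical:

```python
import socket
from typing import Callable


def can_resolve(host: str, port: int = 4444, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if `host` resolves to an address, False on a lookup failure."""
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        # Same error family as "[Errno 8] nodename nor servname provided, or not known".
        return False


if __name__ == "__main__":
    print("selenium-chrome resolves:", can_resolve("selenium-chrome"))
```

If this prints False outside Docker, the code is trying to reach a remote Selenium grid that is not running locally.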

ModuleNotFoundError: No module named 'langchain.experimental'

Any idea why I am getting the issue below?

I have installed langchain module many times.

python3 -m chromegpt --help
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/git/Chrome-GPT/chromegpt/__main__.py", line 4, in <module>
    from chromegpt.main import run_chromegpt
  File "/root/git/Chrome-GPT/chromegpt/main.py", line 1, in <module>
    from chromegpt.agent.autogpt import AutoGPTAgent
  File "/root/git/Chrome-GPT/chromegpt/agent/autogpt/__init__.py", line 2, in <module>
    from .autogpt import AutoGPTAgent  # noqa: F401
  File "/root/git/Chrome-GPT/chromegpt/agent/autogpt/autogpt.py", line 6, in <module>
    from langchain.experimental import AutoGPT
ModuleNotFoundError: No module named 'langchain.experimental'

Unable to install

Hello,
I tried on Fedora 37 without success 😞

poetry install
Creating virtualenv chrome-gpt-znLkX84X-py3.11 in XXX

  RuntimeError

  The lock file is not compatible with the current version of Poetry.
  Upgrade Poetry to be able to read the lock file or, alternatively, regenerate the lock file with the `poetry lock` command.

  at /usr/lib/python3.11/site-packages/poetry/packages/locker.py:481 in _get_lock_data
      477│                 "Upgrade Poetry to ensure the lock file is read properly or, alternatively, "
      478│                 "regenerate the lock file with the `poetry lock` command."
      479│             )
      480│         elif not lock_version_allowed:
    → 481│             raise RuntimeError(
      482│                 "The lock file is not compatible with the current version of Poetry.\n"
      483│                 "Upgrade Poetry to be able to read the lock file or, alternatively, "
      484│                 "regenerate the lock file with the `poetry lock` command."
      485│             )

I tried to lock, same error.

Poetry version 1.1.14
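
The error text itself recommends upgrading Poetry. Another report on this page ran `poetry install` with Poetry 1.4.2, which suggests (but does not prove) that a newer release can read this lock file; the minimum required version is not stated anywhere here. A sketch for comparing version strings, where `parse_poetry_version` is a hypothetical helper:

```python
import re
from typing import Optional, Tuple


def parse_poetry_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from text such as `poetry --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    return tuple(int(part) for part in match.groups()) if match else None


if __name__ == "__main__":
    installed = parse_poetry_version("Poetry version 1.1.14")
    reference = parse_poetry_version("Poetry (version 1.4.2)")
    print("older than the version seen working elsewhere:", installed < reference)
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically.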

How to login into web pages?

I am seeking to automate the download process from https://sellercentral.amazon.de/payments/allstatements/index.html. Like many other pages, this one requires a login.

However, even when logging in manually, the process continues to crash.

I have noticed that Selenium is used for this purpose, but I have not yet discovered an effective method to incorporate logins. Does anyone have suggestions on how to address this issue?

Note. If you must access openai by VPN, you may encounter the following issues.

youngfreeFJS#2

How extract data

How can I extract data from a website?

I tried python -m chromegpt -v -t "on https://www.specialized.com/fr/fr/rockhopper-elite-27-5/p/199582\?color\=319847-199582 extract bike infos"
to get all the bike information.

The agent returned:

The scroll function did not reveal the bike information. The find_form function did not return the bike information either. However, the click function successfully navigated to a page with the bike information. I will now manually extract the bike information.

Final Answer: The bike information for the Specialized Rockhopper Elite 27.5 can be found on the following page: https://www.specialized.com/fr/fr/rockhopper-elite-27-5/p/199582?color=319847-199582.

> Finished chain.

FileNotFoundError: [Errno 2] No such file or directory: 'chromedriver'

Any chance you know where I'm going wrong here. Thanks!
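
The error means that an executable named `chromedriver` could not be found when the browser session was started. A quick standard-library check of whether it is on the `PATH` of the current environment (`locate_chromedriver` is a hypothetical helper, not part of Chrome-GPT):

```python
import shutil
from typing import Callable, Optional


def locate_chromedriver(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the full path of `chromedriver` if it is on PATH, else None."""
    return which("chromedriver")


if __name__ == "__main__":
    path = locate_chromedriver()
    if path is None:
        print("chromedriver is not on PATH for this environment")
    else:
        print("chromedriver found at", path)
```

If it prints that the driver is missing, installing ChromeDriver or adding its directory to `PATH` is the usual next step.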