
Get a head start on fixing alerts with AI investigation

HolmesGPT - The Open Source On-Call/DevOps Agent

Examples | Key Features | Installation | YouTube Demo

The only AI assistant that investigates incidents like a human does - by looking at alerts and fetching missing data until it finds the root cause. Powered by OpenAI, Azure AI, AWS Bedrock, or any tool-calling LLM of your choice, including open source models.

What Can HolmesGPT Do?

  • Investigate Incidents (AIOps) from PagerDuty/OpsGenie/Prometheus/Jira/more
  • Bidirectional Integrations see investigation results inside your existing ticketing/incident management system
  • Automated Triage: Use HolmesGPT as a first responder. Flag critical alerts and prioritize them for your team to look at
  • Alert Enrichment: Automatically add context to alerts - like logs and microservice health info - to find root causes faster
  • Identify Cloud Problems by asking HolmesGPT questions about unhealthy infrastructure
  • Runbook Automation in Plain English: Speed up your response to known issues by investigating according to runbooks you provide

See it in Action

AI Alert Analysis

Examples

Kubernetes Troubleshooting
holmes ask "what pods are unhealthy in my cluster and why?"
Prometheus Alert RCA (root cause analysis)

Investigate Prometheus alerts right from Slack with the official Robusta integration.


Or run HolmesGPT from the CLI:

kubectl port-forward alertmanager-robusta-kube-prometheus-st-alertmanager-0 9093:9093 &
holmes investigate alertmanager --alertmanager-url http://localhost:9093

Note - if on macOS and using the Docker image, you will need to use http://docker.for.mac.localhost:9093 instead of http://localhost:9093

Log File Analysis

Attach files to the HolmesGPT session with -f:

sudo dmesg > dmesg.log
poetry run python3 holmes.py ask "investigate errors in this dmesg log" -f dmesg.log
Jira Ticket Investigation
holmes investigate jira --jira-url https://<PLACEHOLDER>.atlassian.net --jira-username <PLACEHOLDER_EMAIL> --jira-api-key <PLACEHOLDER_API_KEY>

By default, results are displayed in the CLI. Use --update to get the results as a comment in the Jira ticket.

GitHub Issue Investigation
holmes investigate github --github-url https://<PLACEHOLDER> --github-owner <PLACEHOLDER_OWNER_NAME> --github-repository <PLACEHOLDER_GITHUB_REPOSITORY> --github-pat <PLACEHOLDER_GITHUB_PAT>

By default results are displayed in the CLI. Use --update to get the results as a comment in the GitHub issue.

OpsGenie Alert Investigation
holmes investigate opsgenie --opsgenie-api-key <PLACEHOLDER_APIKEY>

By default, results are displayed in the CLI. Use --update --opsgenie-team-integration-key <PLACEHOLDER_TEAM_KEY> to get the results as a comment on the OpsGenie alerts. Refer to the CLI help for more info.


PagerDuty Incident Investigation
holmes investigate pagerduty --pagerduty-api-key <PLACEHOLDER_APIKEY>

By default results are displayed in the CLI. Use --update --pagerduty-user-email <PLACEHOLDER_EMAIL> to get the results as a comment in the PagerDuty issue. Refer to the CLI help for more info.


K9s Plugin

You can add HolmesGPT as a plugin for K9s to investigate why any Kubernetes resource is unhealthy.

Add the following contents to the K9s plugin file, typically ~/.config/k9s/plugins.yaml on Linux and ~/Library/Application Support/k9s/plugins.yaml on Mac. Read more about K9s plugins here and check your plugin path here.

Note: HolmesGPT must be installed and configured for the K9s plugin to work.

Basic plugin to run an investigation on any Kubernetes object, using the shortcut Shift + H

plugins:
  holmesgpt:
    shortCut: Shift-H 
    description: Ask HolmesGPT 
    scopes:
      - all 
    command: bash
    background: false
    confirm: false
    args:
      - -c
      - |
        holmes ask "why is $NAME of $RESOURCE_NAME in -n $NAMESPACE not working as expected"
        echo "Press 'q' to exit"
        while : ; do
        read -n 1 k <&1
        if [[ $k = q ]] ; then
        break
        fi
        done

Advanced plugin that lets you modify the question HolmesGPT asks the LLM, using the shortcut Shift + Q. (E.g. you can change the question to "generate an HPA for this deployment" and the AI will follow those instructions and output an HPA configuration.)

plugins:
  custom-holmesgpt:
    shortCut: Shift-Q
    description: Custom HolmesGPT Ask
    scopes:
      - all 
    command: bash
    background: false
    confirm: false
    args:
      - -c
      - |
        INSTRUCTIONS="# Edit the line below. Lines starting with '#' will be ignored."
        DEFAULT_ASK_COMMAND="why is $NAME of $RESOURCE_NAME in -n $NAMESPACE not working as expected"

        echo "$INSTRUCTIONS" > temp-ask.txt
        echo "$DEFAULT_ASK_COMMAND" >> temp-ask.txt

        # Open the line in the default text editor
        ${EDITOR:-nano} temp-ask.txt

        # Read the modified line, ignoring lines starting with '#'
        user_input=$(grep -v '^#' temp-ask.txt)

        echo running: holmes ask "\"$user_input\""
        holmes ask "$user_input"
        echo "Press 'q' to exit"
        while : ; do
        read -n 1 k <&1
        if [[ $k = q ]] ; then
        break
        fi
        done

Like what you see? Check out other use cases or get started by installing HolmesGPT.

Key Features

  • Connects to Existing Observability Data: Find correlations you didn’t know about. No need to gather new data or add instrumentation.
  • Compliance Friendly: Can be run on-premise with your own LLM (or in the cloud with OpenAI/Azure/AWS)
  • Transparent Results: See a log of the AI’s actions and what data it gathered to understand how it reached conclusions
  • Extensible Data Sources: Connect the AI to custom data by providing your own tool definitions
  • Runbook Automation: Optionally provide runbooks in plain English and the AI will follow them automatically
  • Integrates with Existing Workflows: Connect Slack and Jira to get results inside your existing tools

Installation

Prerequisite: Get an API key for a supported LLM.

Installation Methods:

Brew (Mac/Linux)
  1. Add our tap:
brew tap robusta-dev/homebrew-holmesgpt
  2. Install holmesgpt:
brew install holmesgpt
  3. Check that the installation was successful. This will take a few seconds on the first run - wait patiently:
holmes --help
  4. Run holmesgpt:
holmes ask "what issues do I have in my cluster"
Prebuilt Docker Container

Run the prebuilt Docker container us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes-dev, with extra flags to mount relevant config files (so that kubectl and other tools can access AWS/GCP resources using your local machine's credentials):

docker run -it --net=host -v ~/.holmes:/root/.holmes -v ~/.aws:/root/.aws -v ~/.config/gcloud:/root/.config/gcloud -v $HOME/.kube/config:/root/.kube/config us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes-dev ask "what pods are unhealthy and why?"
Cutting Edge (Pip and Pipx)

You can install HolmesGPT from the latest git version with pip or pipx.

We recommend using pipx because it guarantees that HolmesGPT is isolated from other python packages on your system, preventing dependency conflicts.

First install pipx (skip this step if you are using pip).

Then install HolmesGPT from git with either pip or pipx:

pipx install "https://github.com/robusta-dev/holmesgpt/archive/refs/heads/master.zip"

Verify that HolmesGPT was installed by checking the version:

holmes version

To upgrade HolmesGPT with pipx, you can run:

pipx upgrade holmesgpt
From Source (Python Poetry)

First install poetry (the python package manager)

git clone https://github.com/robusta-dev/holmesgpt.git
cd holmesgpt
poetry install --no-root
poetry run python3 holmes.py ask "what pods are unhealthy and why?"
From Source (Docker)

Clone the project from github, and then run:

cd holmesgpt
docker build -t holmes . -f Dockerfile.dev
docker run -it --net=host -v ~/.holmes:/root/.holmes -v ~/.aws:/root/.aws -v ~/.config/gcloud:/root/.config/gcloud -v $HOME/.kube/config:/root/.kube/config holmes ask "what pods are unhealthy and why?"
Run HolmesGPT in your cluster (Helm)

Most users should install Holmes using the instructions in the Robusta docs ↗ and NOT the below instructions.

By using the Robusta integration you’ll benefit from an end-to-end integration that integrates with Prometheus alerts and Slack. Using the below instructions you’ll have to build many of those components yourself.

In this mode, all the parameters should be passed to the HolmesGPT deployment, using environment variables.

We recommend pulling sensitive variables from Kubernetes secrets.
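For example, the my-holmes-secret secret referenced by the snippets below (with an openAiKey key) could be created from a manifest like this - a sketch only; adjust the secret name and keys to your setup:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-holmes-secret
stringData:
  # stringData lets you supply the value in plain text;
  # Kubernetes base64-encodes it for you
  openAiKey: <your OpenAI API key>
```

Apply it with kubectl apply -f in the same namespace as the HolmesGPT deployment.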

First, you'll need to create your holmes-values.yaml file, for example:

additionalEnvVars:
- name: MODEL
  value: gpt-4o
- name: OPENAI_API_KEY
  value: <your open ai key>

Then, install with helm:

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install holmes robusta/holmes -f holmes-values.yaml

For all LLMs you need to provide the MODEL environment variable, which specifies which model you are using.

Some LLMs require additional variables:

OpenAI

For OpenAI, only the model and API key need to be provided:

additionalEnvVars:
- name: MODEL
  value: gpt-4o
- name: OPENAI_API_KEY
  valueFrom:
    secretKeyRef:
      name: my-holmes-secret
      key: openAiKey

Note: setting MODEL to gpt-4o is optional, since it is the default model.

Azure OpenAI

To work with Azure AI, you need to provide the below variables:

additionalEnvVars:
- name: MODEL
  value: azure/my-azure-deployment         # your azure deployment name
- name: AZURE_API_VERSION
  value: 2024-02-15-preview                # azure openai api version
- name: AZURE_API_BASE
  value: https://my-org.openai.azure.com/  # base azure openai url
- name: AZURE_API_KEY
  valueFrom:
    secretKeyRef:
      name: my-holmes-secret
      key: azureOpenAiKey
AWS Bedrock
enablePostProcessing: true
additionalEnvVars:
- name: MODEL
  value: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0 
- name: AWS_REGION_NAME
  value: us-east-1
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: my-holmes-secret
      key: awsAccessKeyId
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: my-holmes-secret
      key: awsSecretAccessKey

Note: Bedrock Claude provides better results when post-processing is used to summarize the results.

Getting an API Key

HolmesGPT requires an LLM API Key to function. The most common option is OpenAI, but many LiteLLM-compatible models are supported. To use an LLM, set --model (e.g. gpt-4o or bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0) and --api-key (if necessary). Depending on the provider, you may need to set environment variables too.

Instructions for popular LLMs:

OpenAI

To work with OpenAI's GPT-3.5 or GPT-4 models you need a paid OpenAI API key.

Note: This is different from being a “ChatGPT Plus” subscriber.

Pass your API key to holmes with the --api-key cli argument. Because OpenAI is the default LLM, the --model flag is optional for OpenAI (gpt-4o is the default).

holmes ask --api-key="..." "what pods are crashing in my cluster and why?"

If you prefer not to pass secrets on the cli, set the OPENAI_API_KEY environment variable or save the API key in a HolmesGPT config file.

Azure OpenAI

To work with Azure AI, you need an Azure OpenAI resource and to set the following environment variables:

  • AZURE_API_VERSION - e.g. 2024-02-15-preview
  • AZURE_API_BASE - e.g. https://my-org.openai.azure.com/
  • AZURE_API_KEY (optional) - equivalent to the --api-key cli argument

Set those environment variables and run:

holmes ask "what pods are unhealthy and why?" --model=azure/<DEPLOYMENT_NAME> --api-key=<API_KEY>

Refer to the LiteLLM Azure docs ↗ for more details.

AWS Bedrock

Before running the command below, you must run pip install "boto3>=1.28.57" and set the following environment variables:

  • AWS_REGION_NAME
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

If the AWS cli is already configured on your machine, you may be able to find those parameters with:

cat ~/.aws/credentials ~/.aws/config

Once everything is configured, run:

holmes ask "what pods are unhealthy and why?" --model=bedrock/<MODEL_NAME>

Be sure to replace MODEL_NAME with a model you have access to - e.g. anthropic.claude-3-5-sonnet-20240620-v1:0. To list models your account can access:

aws bedrock list-foundation-models --region=us-east-1

Note that different models are available in different regions. For example, Claude Opus is only available in us-west-2.

Refer to LiteLLM Bedrock docs ↗ for more details.

Using a self-hosted LLM

You will need an LLM with support for function-calling (tool-calling).

  • Set the environment variable for your URL with OPENAI_API_BASE
  • Set the model as openai/<your-model-name> (e.g., llama3.1:latest)
  • Set your API key (if your URL doesn't require a key, then add a random value for --api-key)
export OPENAI_API_BASE=<URL_HERE>
holmes ask "what pods are unhealthy and why?" --model=openai/<MODEL_NAME> --api-key=<API_KEY_HERE>

Important: Please verify that your model and inference server support function calling! HolmesGPT is currently unable to check if the LLM it was given supports function-calling or not. Some models that lack function-calling capabilities will hallucinate answers instead of reporting that they are unable to call functions. This behaviour depends on the model.

In particular, note that vLLM does not yet support function calling, whereas llama-cpp does support it.

Other Use Cases

HolmesGPT is usually used for incident response, but it can function as a general-purpose DevOps assistant too. Here are some examples:

Ask Questions About Your Cloud
holmes ask "what services does my cluster expose externally?"
Ticket Management - Automatically Respond to Jira tickets related to DevOps tasks
holmes investigate jira --jira-url https://<PLACEHOLDER>.atlassian.net --jira-username <PLACEHOLDER_EMAIL> --jira-api-key <PLACEHOLDER_API_KEY>
Find the right configuration to change in big Helm charts

LLM uses the built-in Helm toolset to gather information.

holmes ask "what helm value should I change to increase memory request of the my-argo-cd-argocd-server-6864949974-lzp6m pod"
Optimize Docker container size

LLM uses the built-in Docker toolset to gather information.

holmes ask "Tell me what layers of my pavangudiwada/robusta-ai docker image consume the most storage and suggest some fixes to it"

Customizing HolmesGPT

HolmesGPT can investigate many issues out of the box, with no customization or training.

That said, we provide several extension points for teaching HolmesGPT to investigate your issues, according to your best practices. The two main extension points are:

  • Custom Tools - give HolmesGPT access to data that it can't otherwise access - e.g. traces, APM data, or custom APIs
  • Custom Runbooks - give HolmesGPT instructions for investigating specific issues it otherwise wouldn't know how to handle
Add Custom Tools

The more data you give HolmesGPT, the better it will perform. Give it access to more data by adding custom tools.

New tools are loaded using -t from custom toolset files or by adding them to the ~/.holmes/config.yaml with the setting custom_toolsets: ["/path/to/toolset.yaml"].
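As a rough sketch of what such a toolset file might contain - the authoritative schema is in examples/custom_toolset.yaml in the repository, and the toolset name, tool name, and command here are hypothetical:

```yaml
# Hypothetical toolset file - check examples/custom_toolset.yaml
# in the repo for the exact schema
toolsets:
  - name: "node_diagnostics"
    tools:
      - name: "node_conditions"
        description: "Show the Conditions section for a Kubernetes node"
        command: "kubectl describe node {{ node_name }} | grep -A 8 Conditions"
```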

Add Custom Runbooks

HolmesGPT can investigate by following runbooks written in plain English. Add your own runbooks to provide the LLM with specific instructions.

New runbooks are loaded using -r from custom runbook files or by adding them to the ~/.holmes/config.yaml with the setting custom_runbooks: ["/path/to/runbook.yaml"].
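As a rough sketch of what such a runbook file might contain - the authoritative schema is in examples/custom_runbooks.yaml in the repository, and the alert name and instructions here are hypothetical:

```yaml
# Hypothetical runbook file - check examples/custom_runbooks.yaml
# in the repo for the exact schema
runbooks:
  - match:
      issue_name: "KubePodCrashLooping"
    instructions: >
      Fetch the pod's current and previous logs. If the pod was OOMKilled,
      report the memory limit and recommend a higher value.
```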

Reading settings from a config file

You can customize HolmesGPT's behaviour with command line flags, or you can save common settings in a config file for re-use.

You can view an example config file with all available settings here.

By default, without specifying --config, the agent will try to read ~/.holmes/config.yaml. When a setting is present in both the config file and the CLI, the CLI option takes precedence.
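A minimal ~/.holmes/config.yaml might look like the following - the keys are taken from the sections below, and the values are placeholders:

```yaml
# ~/.holmes/config.yaml
model: "gpt-4o"
api_key: "your-secret-api-key"
alertmanager_url: "http://localhost:9093"
```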

Custom Toolsets

You can define your own custom toolsets to extend the functionality of your setup. These toolsets can include querying company-specific data, fetching logs from observability tools, and more.

# Add paths to your custom toolsets here
# Example: ["path/to/your/custom_toolset.yaml"]
#custom_toolsets: ["examples/custom_toolset.yaml"]
Alertmanager Configuration

Configure the URL for your Alertmanager instance to enable alert management and notifications.

# URL for the Alertmanager
#alertmanager_url: "http://localhost:9093"
Jira Integration

Integrate with Jira to automate issue tracking and project management tasks. Provide your Jira credentials and specify the query to fetch issues and optionally update their status.

# Jira credentials and query settings
#jira_username: "[email protected]"
#jira_api_key: "..."
#jira_url: "https://your-company.atlassian.net"
#jira_query: "project = 'Natan Test Project' and Status = 'To Do'"
  1. jira_username: The email you use to log into your Jira account. Eg: [email protected]
  2. jira_api_key: Follow these instructions to get your API key.
  3. jira_url: The URL of your workspace. For example: https://workspace.atlassian.net (Note: the scheme (https) is required)
  4. project: Name of the project you want the Jira tickets to be created in. Go to Project Settings -> Details -> Name.
  5. status: Status of a ticket. Example: To Do, In Progress
GitHub Integration

Integrate with GitHub to automate issue tracking and project management tasks. Provide your GitHub PAT (personal access token) and specify the owner/repository.

# GitHub credentials and query settings
#github_owner: "robusta-dev"
#github_pat: "..."
#github_url: "https://api.github.com" (default)
#github_repository: "holmesgpt"
#github_query: "is:issue is:open"
  1. github_owner: The repository owner. Eg: robusta-dev
  2. github_pat: Follow these instructions to get your GitHub PAT (personal access token).
  3. github_url: The URL of your GitHub API. For example: https://api.github.com (Note: the scheme (https) is required)
  4. github_repository: Name of the repository whose GitHub issues you want scanned. Eg: holmesgpt.
PagerDuty Integration

Integrate with PagerDuty to automate incident tracking and project management tasks. Provide your PagerDuty credentials and specify the user email to update the incident with findings.

pagerduty_api_key: "..."
pagerduty_user_email: "[email protected]"
pagerduty_incident_key:  "..."
  1. pagerduty_api_key: The PagerDuty API key. This can be found in the PagerDuty UI under Integrations > API Access Key.
  2. pagerduty_user_email: When --update is set, which user will be listed as the user who updated the incident. (Must be the email of a valid user in your PagerDuty account.)
  3. pagerduty_incident_key: If provided, only analyze a single PagerDuty incident matching this key
OpsGenie Integration

Integrate with OpsGenie to automate alert investigations. Provide your OpsGenie credentials and specify the query to fetch alerts.

opsgenie_api_key: "..."
opsgenie-team-integration-key: "..."
opsgenie-query: "..."
  1. opsgenie_api_key: The OpsGenie API key. Get it from Settings > API key management > Add new API key
  2. opsgenie-team-integration-key: OpsGenie Team Integration key for writing back results. (NOT a normal API Key.) Get it from Teams > YourTeamName > Integrations > Add Integration > API Key. Don't forget to turn on the integration and add the Team as Responders to the alert.
  3. opsgenie-query: E.g. 'message: Foo' (see https://support.atlassian.com/opsgenie/docs/search-queries-for-alerts/)
Slack Integration

Configure Slack to send notifications to specific channels. Provide your Slack token and the desired channel for notifications.

# Slack token and channel configuration
#slack_token: "..."
#slack_channel: "#general"
  1. slack-token: The Slack API key. You can generate one with pip install robusta-cli && robusta integrations slack
  2. slack-channel: The Slack channel where you want to receive the findings.
Custom Runbooks

Define custom runbooks to give explicit instructions to the LLM on how to investigate certain alerts. This can help in achieving better results for known alerts.

# Add paths to your custom runbooks here
# Example: ["path/to/your/custom_runbook.yaml"]
#custom_runbooks: ["examples/custom_runbooks.yaml"]

Large Language Model (LLM) Configuration

Choose between OpenAI, Azure, AWS Bedrock, and more. Provide the necessary API keys and endpoints for the selected service.

OpenAI
# Configuration for OpenAI LLM
#api_key: "your-secret-api-key"
Azure
# Configuration for Azure LLM
#api_key: "your-secret-api-key"
#model: "azure/<DEPLOYMENT_NAME>"
#you will also need to set environment variables - see above
Bedrock
# Configuration for AWS Bedrock LLM
#model: "bedrock/<MODEL_ID>"
#you will also need to set environment variables - see above

License

Distributed under the MIT License. See LICENSE.txt for more information.

Support

If you have any questions, feel free to message us on robustacommunity.slack.com

How to Contribute

To contribute to HolmesGPT, first follow the Installation instructions for running HolmesGPT from source using Poetry. Then follow an appropriate guide below, or ask us for help on Slack

Adding new runbooks

You can contribute knowledge on solving common alerts and HolmesGPT will use this knowledge to solve related issues. To do so, add a new file to ./holmes/plugins/runbooks - or edit an existing runbooks file in that same directory.

Note: if you prefer to keep your runbooks private, you can store them locally and pass them to HolmesGPT with the -r flag. However, if your runbooks relate to common problems that others may encounter, please consider opening a PR and making HolmesGPT better for everyone!

Adding new toolsets

You can define new tools in YAML and HolmesGPT will use those tools in its investigations. To do so, add a new file to ./holmes/plugins/toolsets - or edit an existing toolsets file in that same directory.

Note: if you prefer to keep your tools private, you can store them locally and pass them to HolmesGPT with the -t flag. However, please consider contributing your toolsets! At least one other community member will probably find them useful!

Modifying the default prompts (prompt engineering)

The default prompts for HolmesGPT are located in ./holmes/plugins/prompts. Most holmes commands accept a --system-prompt flag that you can use to override this.

If you find a scenario where the default prompts don't work, please consider letting us know by opening a GitHub issue or messaging us on Slack! We have an internal evaluation framework for benchmarking prompts on many troubleshooting scenarios. If you share a case where HolmesGPT doesn't work, we can add it to our test framework and fix performance on that issue and similar ones.

Adding new data sources

If you want HolmesGPT to investigate external tickets or alerts, you can add a new data source. This requires modifying the source code and opening a PR. You can see an example of such a PR here, which added support for investigating GitHub issues.


holmesgpt's Issues

[Feature request] - Operator inside Kubernetes.

This would be awesome. Having holmesgpt running inside Kubernetes to report on, and eventually self-heal (to the extent possible), any issues seen.

Would this be possible? And what about doing something together with https://k8sgpt.ai/? I see some overlap in general between K8sGPT and HolmesGPT.

Thanks

Bad environment variable in README

The instructions for using a self-hosted LLM in the README file say that you need to set the OPENAI_API_BASE variable. This should be OPENAI_BASE_URL to work properly.

Update README for use with docker to have a proper example tag for the container

README.md needs updating to include a valid tag for the docker container so we don't have to go looking for it in the releases or use "latest"

(⎈|vmware-k3s:gitlab-public)
sam@Sam-Office-Desk:~$ docker run -it --net=host -v $(pwd)/config.yaml:/app/config.yaml -v ~/.aws:/root/.aws -v ~/.config/gcloud:/root/.config/gcloud -v $HOME/.kube/config:/root/.kube/config us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes:latest ask "what pods are unhealthy and why?"
Unable to find image 'us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes:latest' locally
docker: Error response from daemon: manifest for us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes:latest not found: manifest unknown: Failed to fetch "latest".
See 'docker run --help'.
(⎈|vmware-k3s:gitlab-public)
sam@Sam-Office-Desk:~$ docker run -it --net=host -v $(pwd)/config.yaml:/app/config.yaml -v ~/.aws:/root/.aws -v ~/.config/gcloud:/root/.config/gcloud -v $HOME/.kube/config:/root/.kube/config us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes:0.1 ask "what pods are unhealthy and why?"
Unable to find image 'us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes:0.1' locally
docker: Error response from daemon: manifest for us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes:0.1 not found: manifest unknown: Failed to fetch "0.1".
See 'docker run --help'.

Improve error messages for config errors

If you don't have an OpenAI API key defined in the config.yaml holmes crashes with this error. I think it should show users a proper error that the key is missing.

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in ask:137                                                                                       │
│                                                                                                  │
│ in load_from_file:210                                                                            │
│                                                                                                  │
│ in load_model_from_file:41                                                                       │
│                                                                                                  │
│ in __init__:55                                                                                   │
│                                                                                                  │
│ in __init__:11                                                                                   │
│                                                                                                  │
│ in __init__:11                                                                                   │
│                                                                                                  │
│ in __init__:8                                                                                    │
│                                                                                                  │
│ in __init__:20                                                                                   │
│                                                                                                  │
│ in _decode_init:31                                                                               │
│                                                                                                  │
│ in _decode:53                                                                                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Invalid data type: <class 'NoneType'>, expected dict or list.
[32275] Failed to execute script 'holmes' due to unhandled exception!

Tested on Fedora Linux 40 (Workstation Edition)

New Feature: GitHub Issues

Objective

  • Just inquiring whether it's a needed / useful feature to have holmesgpt work with GitHub issues, in a similar fashion as it works with Jira tickets?
  • If it's useful, I don't mind contributing and doing a PR to add that feature here.

Got error when handling a large file

  • Log file size: 11 MB
  • When asking with a file, an error has raised:
BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'messages[1].content': string too long. Expected a string with maximum length 1048576, but got a string with length 11282561 instead.",
'type': 'invalid_request_error', 'param': 'messages[1].content', 'code': 'string_above_max_length'}}

What max size is allowed? How can I handle a large file?
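(Not an official answer, but the error above comes from OpenAI's 1,048,576-character message limit, so one workaround is truncating the log before attaching it with -f. A sketch, with big.log standing in for the real 11 MB file:)

```shell
# Create a stand-in for a log that exceeds the 1,048,576-character limit
head -c 2000000 /dev/urandom | base64 > big.log

# Keep only the last million bytes before attaching with -f;
# for dmesg-style logs the most recent lines usually matter most
tail -c 1000000 big.log > big-tail.log
```

Then run holmes ask "investigate errors in this log" -f big-tail.log as usual.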
