Comments (3)

rwynn commented on May 28, 2024

Hi, I've pushed a new release that backs off when indexing errors happen, to mitigate the log flooding.
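
For the curious, this kind of backoff can be expressed through the bulk processor options of the olivere/elastic client. A minimal sketch, not the actual release change (client setup, worker count, and timing values are illustrative):

package main

import (
	"context"
	"log"
	"time"

	elastic "github.com/olivere/elastic/v7"
)

func main() {
	client, err := elastic.NewClient(elastic.SetURL("http://localhost:9200"))
	if err != nil {
		log.Fatal(err)
	}
	// Back off exponentially between retries of failed bulk commits
	// (100ms initial, capped at 60s), and retry items rejected with
	// HTTP 429 instead of reporting each one to the error handler.
	p, err := client.BulkProcessor().
		Name("indexer").
		Workers(2).
		Backoff(elastic.NewExponentialBackoff(100*time.Millisecond, 60*time.Second)).
		RetryItemStatusCodes(429).
		Do(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	defer p.Close()
}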

mologie commented on May 28, 2024

Hi, colleague of Manuel here. The specific error message we got was:

ERROR 2023/11/24 15:43:43 Bulk response item: {"_index":"main.<col>","_id":"<id>","status":429,"error":{"type":"cluster_block_exception","reason":"index [main.<col>] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];"}}

It was repeated 24.5 million times over 10 minutes, totaling roughly 4 GiB of logs.
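
For anyone hitting the same 429: once disk space has been freed, the block can be removed by resetting index.blocks.read_only_allow_delete (newer Elasticsearch versions lift it automatically once usage drops back below the high watermark). A rough sketch, with placeholder URL and index name:

package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// Reset the read_only_allow_delete block that Elasticsearch applies
	// once the flood-stage watermark is exceeded. "main.mycol" and the
	// URL are placeholders; free disk space first or the block returns.
	body := bytes.NewBufferString(`{"index.blocks.read_only_allow_delete": null}`)
	req, err := http.NewRequest(http.MethodPut,
		"http://localhost:9200/main.mycol/_settings", body)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}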

The steps to reproduce are as follows (though we have not yet investigated whether they can be minimized):

  1. Deny access to the monstache user, so that some data is queued up
  2. Let Elasticsearch run almost full
  3. Stop monstache
  4. Restore access for monstache
  5. Restart monstache
  6. Let Elasticsearch run completely full (up to the flood-stage watermark; see the settings sketch after this list)
  7. Observe that monstache begins to rapidly generate log events (2+ million log entries per minute)
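
To trigger step 6 without actually filling a disk, the watermarks can be lowered temporarily so the flood-stage block kicks in early. A sketch using transient cluster settings (the percentages are illustrative, not what we used):

package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// Temporarily lower the disk watermarks so the flood-stage block
	// triggers on a moderately full disk. low < high < flood_stage
	// must hold; values here are for reproduction only.
	settings := `{"transient": {
		"cluster.routing.allocation.disk.watermark.low": "50%",
		"cluster.routing.allocation.disk.watermark.high": "60%",
		"cluster.routing.allocation.disk.watermark.flood_stage": "70%"
	}}`
	req, err := http.NewRequest(http.MethodPut,
		"http://localhost:9200/_cluster/settings",
		bytes.NewBufferString(settings))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}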

mologie commented on May 28, 2024

Additionally, here is a redacted copy of the config file with which we observed the issue:

mongo-url = "mongodb://monstache:<snip:url>"
elasticsearch-urls = ["http://<snip>:9200"]
direct-read-namespaces = ["main.<snip:col>"]
change-stream-namespaces = ["main.<snip:col>"]
workers = ["worker-0", "worker-1"]
gzip = false
stats = true
index-stats = true
elasticsearch-user = "monstache"
elasticsearch-password = "<snip>"
elasticsearch-max-conns = 4
elasticsearch-validate-pem-file = false
elasticsearch-healthcheck-timeout-startup = 200
elasticsearch-healthcheck-timeout = 200
dropped-collections = true
dropped-databases = true
replay = true
resume = true
resume-write-unsafe = false
resume-name = "default"
resume-strategy = 1
index-files = true
file-highlighting = true
file-namespaces = ["users.fs.files"]
verbose = false
cluster-name = 'elasticsearch'
exit-after-direct-reads = false

I'm curious and am investigating possible causes in the source code right now. A brief look suggests that the Elasticsearch client library indiscriminately calls the error handler for every item handed to it via Add(), so as long as the ingress side works and keeps providing data, we end up with one error per ingested item. It is unclear to me, however, at which point throttling would best take place.
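
One possibility would be to throttle in the bulk After callback itself: count failed items and emit a summary line at most once per interval instead of logging each one. A rough sketch against olivere/elastic (the names are mine, not monstache's; it would be registered via client.BulkProcessor().After(afterBulk)):

package indexer

import (
	"log"
	"sync/atomic"
	"time"

	elastic "github.com/olivere/elastic/v7"
)

var (
	suppressed atomic.Int64 // failed items not logged individually
	lastLog    atomic.Int64 // unix nanos of the last emitted summary
)

// afterBulk logs at most one summary per 10-second window and
// counts everything else as suppressed.
func afterBulk(id int64, reqs []elastic.BulkableRequest, resp *elastic.BulkResponse, err error) {
	if resp == nil {
		return
	}
	failed := resp.Failed()
	if len(failed) == 0 {
		return
	}
	now := time.Now().UnixNano()
	last := lastLog.Load()
	if now-last < int64(10*time.Second) || !lastLog.CompareAndSwap(last, now) {
		suppressed.Add(int64(len(failed)))
		return
	}
	n := suppressed.Swap(0)
	log.Printf("bulk: %d items failed (first: %+v); %d similar errors suppressed",
		len(failed), failed[0].Error, n)
}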
