
neo4j website down about hetionet HOT 6 CLOSED

hetio avatar hetio commented on August 11, 2024
neo4j website down

from hetionet.

Comments (6)

dhimmel avatar dhimmel commented on August 11, 2024 1

Ah, thanks for the heads up. I thought we created an uptime check in #45 (comment) that would restart the instance if it became unresponsive like this.

Tagging @falquaddoomi who helped last time. I can restart the instance, but might be good to keep this error active so we can make sure the uptime check detects it. (@falquaddoomi no rush, don't interrupt your weekend).


falquaddoomi avatar falquaddoomi commented on August 11, 2024 1

Sorry for the trouble you've been having with the service, @jromanowska. Also, hey @dhimmel; we do have an uptime check set up for the neo4j instance, but it only reports that the instance is inaccessible; it doesn't reboot it. Also, it's unfortunately very noisy, so it's hard to tell when a real outage is occurring versus a transient network issue on Google's side. I'd assumed since no one complained that these were just transient issues, but apparently not -- I'll look into them as soon as they come up now.

After looking into the logs a bit today, it seems the neo4j instance hits a series of out-of-memory exceptions that cause it to stop being able to fully service requests. Oddly, it'll still serve static resources, just with very high (30 seconds+) latency. I'm going to try bumping up the RAM on the instance, and I'll also add a daemon on the machine itself that checks if https://neo4j.het.io/browser/ is responsive and reboots the docker container if it isn't. I'll keep investigating why this is happening, since if there's a memory leak what I proposed will just delay the outages, not eliminate them.

Perhaps let's keep this issue open for a week or so to see if the issue's resolved, and after that we can close it?


falquaddoomi avatar falquaddoomi commented on August 11, 2024 1

Just FYI, I've put in a monitoring script that'll reboot the neo4j container if https://neo4j.het.io/browser/ takes longer than 30 seconds to return, or if it returns a non-200 response. I've also increased the RAM on the instance from 8GB to 12GB, and I'll be watching the logs and the uptime check for "transient" issues as well. Here's hoping that the changes I made will improve its stability, but do let me know if any of you have issues with it. 🤞
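
For readers who want a picture of what such a check might look like, here is a minimal sketch in Python. It is not the script actually deployed: only the URL, the 30-second threshold, and the non-200 rule come from the comment above. The container name `neo4j`, the function names, and the use of `docker restart` are assumptions made for illustration.

```python
import subprocess
import urllib.request

HEALTH_URL = "https://neo4j.het.io/browser/"  # URL named in the thread
TIMEOUT_SECONDS = 30                           # threshold named in the thread
CONTAINER_NAME = "neo4j"                       # hypothetical container name


def is_healthy(url=HEALTH_URL, timeout=TIMEOUT_SECONDS,
               opener=urllib.request.urlopen):
    """Return True only if the URL answers with HTTP 200 within the timeout.

    Note: urlopen's timeout applies to each blocking socket operation, so
    this approximates, rather than strictly enforces, a 30-second budget.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), and socket timeouts all derive
        # from OSError, so any of them counts as "unhealthy".
        return False


def restart_container(name=CONTAINER_NAME, runner=subprocess.run):
    """Restart the container via the docker CLI (assumed to be on PATH)."""
    runner(["docker", "restart", name], check=True)


def check_once(probe=is_healthy, restart=restart_container):
    """Run one health check and restart the container if it fails."""
    if probe():
        return "ok"
    restart()
    return "restarted"
```

The probe and restart steps are passed in as parameters so the decision logic can be exercised without touching the network or Docker, for example `check_once(probe=lambda: False, restart=lambda: None)`.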


falquaddoomi avatar falquaddoomi commented on August 11, 2024 1

Right, the outages shouldn't be more than 5 minutes (that's the current polling interval), and if necessary the entire neo4j container gets restarted, which would reset its memory usage. Fair point about it not being worth tracking down a memory leak in an older version of neo4j. I'll take a look at #33 and see if I can make progress on it.
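
To illustrate how the 5-minute polling interval bounds an outage, here is a small Python sketch of a polling loop. The thread does not say how the check is actually scheduled (it could just as well be cron or a systemd timer), so the loop, the `run_watchdog` name, and its parameters are all hypothetical.

```python
import time

POLL_INTERVAL_SECONDS = 5 * 60  # polling interval named in the thread


def run_watchdog(check, interval=POLL_INTERVAL_SECONDS,
                 sleep=time.sleep, max_cycles=None):
    """Call check() once per interval; return the number of cycles run.

    max_cycles=None loops forever, as a long-running daemon would; a finite
    value is useful for dry runs.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        check()          # e.g. one health check plus a restart on failure
        cycles += 1
        sleep(interval)  # wait before polling again
    return cycles
```

With a loop like this, a failure that starts just after one check is only noticed at the next one, so the time to recovery is roughly the polling interval plus the probe timeout and however long the container takes to come back up.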


dhimmel avatar dhimmel commented on August 11, 2024

Thanks a lot @falquaddoomi! Stoked that we're able to automate the restarts.

"I'll keep investigating why this is happening, since if there's a memory leak what I proposed will just delay the outages, not eliminate them."

But the outages will be short-lived and the reboot will reset the memory usage, right?

Since the instance is running a pretty old version of Neo4j, there's probably not a ton of value in spending much time diagnosing the memory leak. I played around with upgrading in #33, but was hitting a bunch of problems.

So in summary, don't worry too much about digging into the memory leak unless you think that will create an actionable insight.


dhimmel avatar dhimmel commented on August 11, 2024

"I'll take a look at #33 and see if I can make progress on it"

Any help appreciated, but a forewarning that there are several things that were breaking: guides, HTTPS, and more. So happy to video chat at any point and give you an overview of the hurdles if that'd be helpful.

