Comments (3)
I think the prolonged Riak start-up script doesn't solve this entirely.
When we set up a single-node Riak cluster, the DatalayerService starts successfully but can't write any data to Riak. The logs may show that it can't create enough replicas, but that error doesn't get propagated to the workflows (management and user). A write just returns false, and the workflow can only guess the reason. One has to dig into the DLService logs to find the cause. Should the platform retry failed writes? Should it maybe propagate back to the client that the workflow invocation failed, with a proper reason?
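To make the two questions concrete, here is a rough sketch of what "retry and report a reason" could look like. It is not the DLService code: `WriteResult`, `write_with_retry`, the `write` callable and the backoff values are all hypothetical names and assumptions, and the real client's error types are unknown to me.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WriteResult:
    ok: bool
    attempts: int
    reason: Optional[str] = None  # last error message if the write failed


def write_with_retry(write: Callable[[], None], retries: int = 3,
                     backoff_s: float = 0.1) -> WriteResult:
    """Call `write` up to `retries` times and keep the last error as the reason.

    `write` is a hypothetical stand-in for one storage write; it is expected
    to raise an exception on failure instead of returning False.
    """
    reason = None
    for attempt in range(1, retries + 1):
        try:
            write()
            return WriteResult(ok=True, attempts=attempt)
        except Exception as exc:  # assumption: any exception counts as a failed write
            reason = f"{type(exc).__name__}: {exc}"
            if attempt < retries:
                # Exponential backoff between attempts.
                time.sleep(backoff_s * 2 ** (attempt - 1))
    return WriteResult(ok=False, attempts=retries, reason=reason)
```

With something like this, the workflow would get back `reason` (e.g. the replica error text) instead of a bare false, and could decide whether to surface it to the client.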
from knix.
Isn't the prolonged Riak start-up essentially meant to prevent a faulty setup in the first place?
Of course, there may be other problems at runtime, and they would have to be handled, perhaps with additional measures.
from knix.
OK, the particular case had a Riak node stuck in "joining". I think @ruichuan added a 5 second sleep in #73 to address this particular case. It'd be safer to fail the pod in the else branch ... but okay, when that ensures the Riak cluster comes up working, we might still have a setup that is degraded for other reasons while the components are shown as up and running. E.g. when the DL loses connectivity to Riak, it still serves storage operations, but this is not shown to the user.
But agreed, this particular issue is about the case where a Riak node is stuck in joining, and #73 is supposed to fix this.
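For reference, the "fail the pod" idea could look roughly like the sketch below. This is not the actual start-up script: `is_ready` is a hypothetical callable standing in for whatever check tells us the Riak node has left "joining", and the function names and timeout values are assumptions as well.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool], timeout_s: float = 60.0,
                     interval_s: float = 1.0) -> bool:
    """Poll `is_ready` until it returns True or `timeout_s` has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def startup_exit_code(is_ready: Callable[[], bool], timeout_s: float = 60.0,
                      interval_s: float = 1.0) -> int:
    """Return 0 if the node became ready in time, 1 otherwise.

    A non-zero exit code lets the container fail, so the pod is restarted
    instead of being reported as up while the cluster is not usable.
    """
    return 0 if wait_until_ready(is_ready, timeout_s, interval_s) else 1
```

The start-up script would then end with something like `sys.exit(startup_exit_code(check))`, where `check` is the (hypothetical) readiness check, rather than sleeping for a fixed time and continuing regardless.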
/close
from knix.