Comments (28)

solatis avatar solatis commented on June 18, 2024 7

To steer the conversation back to the original topic, I think it's important to have at least some high-level maintenance of the project: someone who is able to merge PRs, do releases, etcetera.

We are still heavy users of Onyx, but had to fork from the mainline Onyx due to the lack of updates (e.g. Clojure 1.10 support, but also other stuff). I would be happy to volunteer to pick up day-to-day operations for the project, to make sure PRs get reviewed, releases can still be pushed out in a timely manner, etc. As both Michael and Lucas know me, it would also avoid event-stream-like situations.

from onyx.

crimeminister avatar crimeminister commented on June 18, 2024 2

I'll throw in Open Collective as an additional source of potential funding. I am a great fan of Onyx and, while I don't have it running in production, would happily contribute my pittance to help keep it live and growing.

thenonameguy avatar thenonameguy commented on June 18, 2024 2

Unrelated note, @neuromantik33: on the Aeron reliability side, it turned out that we were hit by a known issue with setting CPU limits in Kubernetes: kubernetes/kubernetes#67577
After disabling those the number of exceptions related to timeouts between Onyx and Aeron fell significantly.

crimeminister avatar crimeminister commented on June 18, 2024 2

There seems to be some more energy being invested into wrappers for Beam / Dataflow, which I appreciate, but Onyx is still the best thing I have personally used in this space.

youvere avatar youvere commented on June 18, 2024 2

I've worked with both Beam/Dataflow and Onyx in Clojure so far. The simplicity with which a pipeline can be expressed in Onyx is impressive.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024 1

Thanks for raising this, @thenonameguy. We're glad that Onyx has continued to be useful to the community. @lbradstreet and I have two thoughts:

  1. Your assessment is right. As much as we wish otherwise, @lbradstreet and I don't have the time to invest in Onyx like we used to. We're really proud of what we've built, but there are only so many hours in the day.

  2. Before you consider opening up a source of donations, does anyone have a realistic idea of who could contribute to the project and make use of those funds? Onyx is admittedly a big project, and while it's generally in good shape, anyone who maintains it needs to have a reasonable foundation in distributed systems. We're happy to have others maintain Onyx -- we'd just like to make sure it's in the hands of the right folks.

jgerman avatar jgerman commented on June 18, 2024 1

@thenonameguy I don't want to pollute this Issue so I made a new one: #890

I'm curious what reliability issues you saw; we've been fighting Aeron exceptions for a few weeks now.

arnaudbos avatar arnaudbos commented on June 18, 2024 1

@solatis if you want to spell out any guidelines for opening issues, submitting PRs, which communication channel(s) to use, the branching model, etc. that would work best for you, please advise.
I'd like to contribute here and there when I can.

solatis avatar solatis commented on June 18, 2024 1

@MichaelDrogalis @lbradstreet I think I still need additional instructions / access before I can push a release to Clojars. I see there's quite a bit of magic in a lot of places to make this happen.

I'm available over Slack if you want to reach out to me directly.

thenonameguy avatar thenonameguy commented on June 18, 2024

Thanks for the response!
I posted this discussion on the #clojure Slack channel; maybe someone will show interest in becoming a regular maintainer. As an alternative to this solution, it occurred to me that we could fund development on a per-feature/bug basis (for example via Bountysource), leaving the evaluation of solutions to your free time and keeping complete ownership of the codebase you created with you.

I think our company would fund some of the proposals that are in your backlog over time.

the2bears avatar the2bears commented on June 18, 2024

Would love to help out. I just need to figure out a way through the red tape at work.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024

As much as we'd like to, @lbradstreet and I are booked up on time. That said, we don't feel the need to have complete ownership over the codebase. We're more than happy to have other maintainers take the wheel. The only thing we'd like to avoid is passing it to someone at random (and avoid an event-stream-like situation :) )

neuromantik33 avatar neuromantik33 commented on June 18, 2024

We @ Oscaro use Onyx in production and would also like to see it continue, as we've invested considerable resources into building pipelines and IOs (for Google Cloud, which we plan to open source). We can chip in, but I'll admit that some of the codebase needs some serious documentation and/or more tests, as we (my colleague here) have noticed things that really require @MichaelDrogalis or any of the original authors to shed some light on them. The lifecycle FSM and some of the trickiness involved can be quite daunting without any annotations.
Oh, and Aeron is also up there in complexity and instability; at times we were thinking of just switching to another transport with an easier learning curve that is more cloud friendly, like gRPC. Anyhow, glad to see things finally moving around here :)

lbradstreet avatar lbradstreet commented on June 18, 2024

@neuromantik33, it's pretty fair to describe the lifecycle FSM as daunting. We only had so much time, and we needed to make performance somewhat comparable to Flink, so I'm afraid some of the readability was sacrificed in favor of performance there.

I personally found Aeron pretty reasonable to run reliably; however, it was certainly susceptible to GC issues if run in process, and I could see it being pretty easy for the clients to time out when the peers GC. It may be worth starting out by trying to tune the Aeron client and media drivers in a way that would make the behavior more tolerant of normal streaming workloads. That said, having to run a separate media driver process and peer-group does make it a little harder to set up for the first time.

Ultimately, building a full streaming platform, including plugins, with only a few people is a big job without significant community involvement. I think the important idea is the data driven, flexible DSL. It may be worth investigating whether the DSL aspects could be mapped onto something like Flink, so you can get the benefit of the runtime and community engagement, without the overhead of building all of the aspects of a distributed streaming platform.
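
To illustrate what a data-driven DSL that is separable from its runtime could look like, here is a minimal sketch in Python. It is not Onyx's API: the names `workflow`, `catalog`, `FUNCTIONS` and `run_job` are hypothetical, and the runner is a toy single-process interpreter. The point is only that the job is plain data (edges plus task descriptions), so in principle a different runtime could interpret the same description.

```python
from collections import defaultdict, deque

# Hypothetical registry: task descriptions refer to functions by name,
# so the job description below stays plain data.
FUNCTIONS = {
    "add-one": lambda seg: {**seg, "n": seg["n"] + 1},
    "double": lambda seg: {**seg, "n": seg["n"] * 2},
}

# The job itself: a list of edges and a map of task descriptions.
workflow = [("in", "inc"), ("inc", "double"), ("double", "out")]
catalog = {
    "inc": {"fn": "add-one"},
    "double": {"fn": "double"},
}


def run_job(workflow, catalog, inputs):
    """Toy runner: push every input segment along the workflow edges.

    `inputs` maps an input task name to a list of segments (dicts).
    Returns a dict mapping each leaf task name to the segments it received.
    """
    downstream = defaultdict(list)
    for src, dst in workflow:
        downstream[src].append(dst)

    outputs = defaultdict(list)
    queue = deque(
        (task, seg) for task, segs in inputs.items() for seg in segs
    )
    while queue:
        task, seg = queue.popleft()
        entry = catalog.get(task)
        if entry is not None:
            # Apply the function named in the task description.
            seg = FUNCTIONS[entry["fn"]](seg)
        targets = downstream.get(task, [])
        if not targets:
            # No outgoing edges: treat the task as an output.
            outputs[task].append(seg)
        for target in targets:
            queue.append((target, seg))
    return dict(outputs)


result = run_job(workflow, catalog, {"in": [{"n": 1}, {"n": 2}]})
```

Because `workflow` and `catalog` contain no runtime objects, swapping `run_job` for a translator that emits another engine's job graph would not require changing the job description, which is the property the DSL-mapping idea relies on.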

arnaudbos avatar arnaudbos commented on June 18, 2024

While I certainly like Flink, IMO Onyx shines not only because of its data model but also because of its "just a library" approach.
Flink is a framework and although they have a FLIP and a few JIRA tickets opened to take advantage of so-called "container modes" for resource management and auto-scaling, I think Onyx is in a really good position to offer these kinds of features a la carte (i.e. not baked into onyx core), am I right?

That said, I've been in and out of exploring onyx core and plugins for a few weeks and I would be glad to help people with more experience in the distsys field.
A "beginner friendly" tag on issues and a guided tour of the code base would be really nice.
EDIT: I notice there's a "newbie" label, fair enough.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024

@arnaudbos In the same vein as @lbradstreet's suggestion, I always thought it would be interesting to map the Onyx information model onto Kafka Streams, since that one truly is just a library.

arnaudbos avatar arnaudbos commented on June 18, 2024

@MichaelDrogalis indeed, describing a Kafka Streams topology as an Onyx job would allow us to decomplect (sorry) Processor nodes, especially with lifecycles.

  1. What about peer capabilities though?

Storm has a "tag-aware scheduler" and in Flink, the API provides a way to assign operators to specific slots (I think they're called slots) with specific parallelism.
AFAIK (unless it's somewhere in a KIP), Kafka Streams doesn't have a way to specify a custom Stream Thread "assigner" (don't know which word to use: partitioner, scheduler, they're all overloaded...) and every instance runs all the processors.
Onyx has peer and task tags, which are really valuable (in the Clojure community at least, as illustrated in the docs by a Datomic license, but also for uneven task loads).

  2. What about savepoints?

Onyx and Flink both use ABS for state checkpointing and Flink has implemented user-triggered savepoints on top of it.
I still have to wrap my head around that concept (and oh boy, dig a lot to understand the implementation details...), but something tells me it would be feasible in Onyx, while in Kafka Streams there's an open ticket somewhere but it is not implemented yet.

  3. Quick question:

Does Onyx currently support what Flink calls "End-to-End Exactly-Once Processing"? I think not. Their two-phase commit abstraction is interesting, but I'm not sure how it could be mapped to Onyx's masterless design, since Flink relies on the Job manager for that purpose.

  4. Thank you

I think many of us, in the Clojure/Onyx "community", know you have a lot to do at Confluent, so thank you for taking the time to discuss here.

Mapping the Onyx information model on top of Flink/Kafka Streams or trying to go forward is a matter of tradeoffs it seems.
People relying on it for production will probably be driven to keep it going, I personally like the learning aspect of it.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024

Full disclosure before I dig in -- a large part of my current role at Confluent is managing Kafka Streams and KSQL. I do have a mild vested interest when I suggest this idea, though I do think it's a good one anyway. :)

  1. KStreams leans on orchestration tools like K8s to manage which applications are participating in particular flows. I think this is ultimately the right solution. Had K8s been more mature when Onyx was being developed, I probably wouldn't have implemented tags, and instead documented some recipes about how to do this with other tools that are more capable.

  2. KStreams doesn't use savepoints, but it does back up its state to underlying Kafka changelog topics. In practice, this yields about the same result that matters: being able to restore state across applications.

  3. I can't remember where we left off with the implementation, but any end-to-end exactly-once semantics would need to depend on the sources and sinks providing some notion of exactly-once, too. We definitely didn't cover that for all supported plugins.

  4. Thanks so much for saying that and supporting Onyx. This project has been a brilliant piece of my life, largely thanks to everyone who took an interest in it. It would be great to see it continue in any form since I do think there are a lot of good, small ideas that make up the whole project.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024

@solatis I'd be very happy to have you aboard. I can set up full access if @lbradstreet agrees.

lbradstreet avatar lbradstreet commented on June 18, 2024

@solatis, that sounds great! I can walk you through the CI and release process some time too.

solatis avatar solatis commented on June 18, 2024

Great, let's discuss the logistics over direct message.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024

@solatis I have invited you as an owner of the onyx-platform GitHub organization.

MichaelDrogalis avatar MichaelDrogalis commented on June 18, 2024

Hey @solatis. I've added you to the Clojars org. You should be able to trigger releases now with the release scripts (they're under script/) in each repo. Can still answer any questions as needed though.

solatis avatar solatis commented on June 18, 2024

My Clojars username is also @solatis.

What is the normal release process like? I manually updated relevant files and tagged branches, and it seems like a deploy of 0.14.5 to Clojars did succeed, but CircleCI generated a permission / connection failed error => https://circleci.com/gh/onyx-platform/onyx/6868

It could be an unfortunate network glitch, though.

thenonameguy avatar thenonameguy commented on June 18, 2024

Hey @solatis! 👋
Would you be interested in pushing your internal fork of Onyx with your improvements (clj 1.10 support, etc.)? We are also pretty close to forking the repo for our own needs, it would be nice if we had a more up-to-date base, even if it contains some breakage.

solatis avatar solatis commented on June 18, 2024

@thenonameguy are there any specific features you are looking for? I've merged most of the things, and we're using Onyx + clojure 1.10 ourselves without any problems at this point.

matanox avatar matanox commented on June 18, 2024

Experience with similar acquihires shows that the project simply dies over the course of a few years, regardless of whatever faintly reassuring statements are made early on, while a new (differently commercial) similar project from the acquiring company does not always emerge. It might be good to assume something along these lines, but this is just my two cents from coming here to check on the progress of this framework.

RBerkheimer avatar RBerkheimer commented on June 18, 2024

Like Mike said last year(?) or when he and Lucas left, he believes the information model Onyx built was great, and that's what should be persistent. As it stands, those who use Onyx, what would be needed to continue to ensure Onyx maintains a growth and security posture? Extension to other languages? Improved plugins? Improved security posture? What is needed to make this project considered 'active'?
