aporia-ai / mlops.toys
🎲 A curated list of MLOps projects, tools and resources
Home Page: https://mlops.toys
License: Creative Commons Attribution 4.0 International
LinkedIn has now released its feature store as open source; it should be added to the feature store section.
Streamlit seems appropriate for at least the Model Serving category.
New tool to add: Wallaroo.AI, a platform to deploy, manage, and observe any model at scale across any environment, from cloud to edge. It lets you go from a Python notebook to inferencing in minutes.
MLOps Platform
Would love to see lakeFS added to this cool project!
Name: lakeFS
Category: Data Versioning
URL: https://lakefs.io
Description:
lakeFS is an open-source data lake management platform that transforms your object storage into a Git-like repository. lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data.
lakeFS features:
Getting started video: https://www.youtube.com/watch?v=xThorxDzmrw&t=5s
Suggestion to include TerminusDB in the list of MLOps tooling:
TerminusDB is an open-source graph platform and document store. It is designed for building data-intensive applications and knowledge graphs.
Link to introductory video: https://youtu.be/RNeYYvYIZbs
Hey there team Aporia!
Would love for our stack to be featured on the list, but I don't think there's a good category.
Our stack comprises Activeloop Hub, our open-source dataset format for AI (which allows streaming, version control, and querying of data in a tensor-based format), as well as our platform that helps visualize, version-control, and query image, video, and audio data and plug it into TF/PT/other frameworks.
Would you be able to point me in the right direction?
MarkovML is an easy-to-use no-code platform to understand data, streamline AI workflows, and build apps to get from data to actionable AI faster. It allows you to:
No-Code Auto EDA: Unlock deep insights from your data using our Auto Data Analyzers powered by AI. Identify data gaps, outliers, and patterns to make informed modeling decisions.
Collaborative Reporting: Together, create and share comprehensive visual reports to eliminate scattered information, siloed knowledge, and disconnected communication.
Build Hosted AI Application: Effortlessly build interactive AI & GenAI Apps from your data using a drag-n-drop interface.
Adopt GenAI With Ease: Boost your speed of innovation by streamlining Generative AI development with our no-code, intuitive drag-and-drop interface - Mizzen.
Seamlessly Versatile: Build a wide range of applications, from summarization and classification to semantic search and Q&A, with just a few clicks.
Build Confidently From Day One: Ensure robust data governance, privacy, and uncompromising security for your AI applications, freeing your team to focus on building better AI applications.
Build Custom Workflows: Boost team productivity by effortlessly building complex data workflows from scratch with our intuitive, drag-and-drop interface.
Reusable Workflows: Save effort by using our pre-built templates or reusing and sharing your previously created workflows.
Automate Manual Data Tasks: Save time and effort by automating monotonous data tasks, auto scheduling workflows and eliminating human errors.
Click here to sign up for free
Hi folks
As I'm still in the MLOps domain, I found myself revisiting the site and repo to add more project(s) (from iterative.ai), but I see open PRs that are more than a year old...
Is this project/website dead? It would be great to either revive or archive it (and kill the website if so) for clarity since some people may still use this to discover MLOps tools.
If you still want to keep this alive but could use assistance with PR reviews/curation every now and then, let me know.
(CC @SnirShechter @alongubkin)
Thanks!
Hello,
Many thanks for including Bodywork!
Is it possible to add Training Orchestration in addition to Model Serving, as we cover both in equal measure?
Many thanks,
Alex
Datapipe is a real-time, incremental Python ETL library for machine learning with record-level dependency tracking.
The library is designed for describing data processing pipelines and is capable of tracking dependencies for each record in the pipeline. This ensures that tasks within the pipeline receive only the data that has been modified, thereby improving the overall efficiency of data handling.
Incremental Processing: datapipe processes only new or modified data, significantly reducing computation time and resource usage.
Real-time ETL: The library supports real-time data extraction, transformation, and loading.
Dependency Tracking: Automatic tracking of data dependencies and processing states.
Python Integration: Seamlessly integrates with Python applications, offering a Pythonic way to describe data pipelines.
Projects with complex ML pipelines with a human-in-the-loop component
ML projects that require real-time model retraining based on newly labeled data
Projects that require content moderation
https://github.com/epoch8/datapipe – Datapipe Core
https://github.com/epoch8/datapipe-examples/ – Usage examples
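The record-level incremental processing described above can be illustrated with a minimal sketch. This is not Datapipe's API: the `IncrementalStep` class, the `record_fingerprint` helper, and the sample records are all hypothetical, and change detection by content hash is just one assumed way to decide which records need reprocessing.

```python
import hashlib
import json


def record_fingerprint(record):
    """Return a stable hash of a record's content (hypothetical helper)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class IncrementalStep:
    """Hypothetical pipeline step that reprocesses only new or changed records."""

    def __init__(self, transform):
        self.transform = transform
        self.seen = {}     # record id -> fingerprint of the last processed input
        self.outputs = {}  # record id -> last computed output

    def run(self, records):
        """Process `records` (dict of id -> record); return the ids recomputed."""
        recomputed = []
        for record_id, record in records.items():
            fingerprint = record_fingerprint(record)
            if self.seen.get(record_id) == fingerprint:
                continue  # unchanged since the last run, so skip it
            self.outputs[record_id] = self.transform(record)
            self.seen[record_id] = fingerprint
            recomputed.append(record_id)
        # Drop state for records that no longer exist upstream.
        for record_id in list(self.seen):
            if record_id not in records:
                del self.seen[record_id]
                self.outputs.pop(record_id, None)
        return recomputed


# Illustrative data: the second run touches only the modified record.
step = IncrementalStep(lambda r: {"label": r["text"].upper()})
data = {"a": {"text": "cat"}, "b": {"text": "dog"}}
first = step.run(data)
data["b"] = {"text": "bird"}
second = step.run(data)
```

On the first run both records are processed; on the second, only `b` is recomputed because its content changed. A real library would persist this state and propagate changes across multiple steps, which this sketch does not attempt.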
Hi, I can't help but notice that many of the tools listed can fulfill more than one aspect of MLOps. Should this be broken down for each product so it's more obvious?
Hi there, thanks for this amazing collection of tools and platforms. I was wondering if you could also add FuseML. It's a new project that I and three others are running as an incubation project at SUSE, and we are looking for help from other contributors.
We have a micro-website (https://fuseml.github.io) and, of course, a GitHub repo (https://github.com/fuseml).
A brief description of the project:
FuseML allows you to re-use existing components and tools to create and manage the end-to-end ML lifecycle.
Optimize AI workload with a low-code approach and an extensible framework.
Allows teams to quickly iterate and re-use existing, well known tools for rapid experimentation and fast production releases.
It's all about Open Source, MLOps and fast delivery.
Hope this is enough. We also have a YouTube channel here.
name: MLRun
buttonText: Try MLRun
link: 'https://mlrun.org'
category: Feature Store, Model Monitoring, Model Serving, Training Orchestration, Experiment Tracking
description: >-
MLRun is an end-to-end open-source MLOps solution to manage and automate your entire analytics and machine learning lifecycle, from data ingestion, through model development to full pipeline deployment. MLRun eases the development of machine learning pipelines at scale and helps ML teams build a robust process for moving from the research phase to fully operational production deployments.
Feature and Artifact Store: handles the ingestion, processing, metadata, and storage of data and features across multiple repositories and technologies.
Elastic Serverless Runtimes: converts simple code to scalable and managed microservices with workload-specific runtime engines (such as Kubernetes jobs, Nuclio, Dask, Spark, and Horovod).
ML Pipeline Automation: automates data preparation, model training and testing, deployment of real-time production pipelines, and end-to-end monitoring.
Central Management: provides a unified portal for managing the entire MLOps workflow. The portal includes a UI, a CLI, and an SDK, which are accessible from anywhere.
gitHubRepoName: mlrun/mlrun
youTubeVideoId: _3mxz3zMPpw
Hi maintainers,
We would like a way to add projects to multiple categories, which is not possible today (I'm sure MLRun is not the only one).
I feel like this is a pretty basic need 😄
This is in the context of #9
For now I'll add a PR with MLRun in one category
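As a rough illustration of the request, here is a sketch of how a comma-separated `category` field, like the one in the MLRun listing above, could be split so a project appears under every category it names. The `parse_categories` and `group_by_category` functions are hypothetical, and how the site actually loads its listings is not shown in this thread; the Bodywork categories are taken from its request elsewhere in these issues.

```python
from collections import defaultdict


def parse_categories(raw):
    """Split a comma-separated category string into clean category names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def group_by_category(projects):
    """Map each category to the names of the projects that list it."""
    grouped = defaultdict(list)
    for project in projects:
        for category in parse_categories(project["category"]):
            grouped[category].append(project["name"])
    return dict(grouped)


# Illustrative listings using the field names from the MLRun submission.
projects = [
    {
        "name": "MLRun",
        "category": "Feature Store, Model Monitoring, Model Serving, "
                    "Training Orchestration, Experiment Tracking",
    },
    {"name": "Bodywork", "category": "Model Serving, Training Orchestration"},
]
grouped = group_by_category(projects)
```

With this shape, a single-category listing still works unchanged, since a string without commas parses to a one-element list.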
Please add 'Feathr' from LinkedIn to the list of open source tools.
Hi, great idea!
Would it be possible to add a filter so that viewers can choose to display only open source projects?
Cheers,
D.
Check out KitOps, we just launched it to help ease model handoffs between data scientists and app devs or devops folk.
Hi there,
Please help add the PrimeHub project. Many thanks.
PrimeHub, a Kubernetes-based collaborative ML platform for teams of data scientists and administrators.
It equips administrators with group-centric management and eases MLOps for data scientists with pluggable capabilities.
Katonic MLOps Platform is a collaborative platform with a Unified UI to manage all data science activities in one place and introduce MLOps practice into the production systems of customers and developers. It is a collection of cloud-native tools for all of these stages of MLOps:
- Data exploration
- Feature preparation
- Model training/tuning
- Model serving, testing and versioning
Katonic is for both data scientists and data engineers looking to build production-grade machine learning implementations and can be run either locally in your development environment or on a production cluster. Katonic provides a unified system—leveraging Kubernetes for containerization and scalability for the portability and repeatability of its pipelines.
It would be great if you could list it.
Website: https://katonic.ai/
Katonic One Pager.pdf
I'd like to suggest our company as an addition to the MLOps list.
Name: OctoML
Suggested new category: ML model optimization and acceleration
URL: https://octoml.ai/
Description:
OctoML automatically optimizes machine learning models to deliver up to 30x faster inference or prediction time, without sacrificing accuracy.
Deep learning models optimized with our open source Apache TVM technology have less user-perceived lag, maximize hardware utilization, save deployment costs, and are energy efficient for edge/IoT devices.
We also comprehensively benchmark customers’ models across CPU, GPU and Accelerator chips to help select the ideal hardware, balancing cost and performance.
How does OctoML speed up your machine learning predictions automatically?
Built on Apache TVM, the OctoML platform does the hard work of automatically making a model production-ready. Our technology uses machine learning to search the space of possible optimizations for a given model, freeing machine learning engineers from having to do it manually using specialized vendor/kernel libraries. It works by running experiments against the target hardware (CPU, GPU etc) to learn how the hardware behaves when certain automatically chosen optimizations are applied. We explore thousands to millions of permutations of a model. When the process is finished, we deliver a fast, energy efficient and accurate model ready to be pushed to production.
Explainer video: https://www.youtube.com/watch?v=gpO4y1mPMWA
Hi,
We are making it super easy for data scientists to deploy AI at inferrd.com. We'd love to be included on your website. Can I create a pull request?
Hello, I would like to make the following text changes to Aim and to add a demo video to our listing https://www.youtube.com/watch?v=g_rxmOiphgw&t=303s&ab_channel=DataTalksClub. Thank you very much!
An easy-to-use & supercharged open-source experiment tracker. Aim logs your training runs, enables a beautiful UI to compare them and an API to query them programmatically.
Why use Aim?
Compare runs easily to build models faster. Group and aggregate 100s of metrics. Analyze and learn correlations. Query with easy pythonic search.
Deep dive into details of each run for easy debugging. Explore hparams, metrics, images, distributions, audio, text, etc. Track plotly and matplotlib plots. Analyze system resource usage.
Have all relevant information centralized for easy governance. Centralized dashboard to view all your runs. Use SDK to query/access tracked runs. You own your data - Aim is open source and self hosted.
Name: Modzy
buttonText: Try Modzy
link: https://www.modzy.com/try-free/
category: Model Monitoring, Model Serving, Experiment Tracking, Explainability
description: >- Modzy is an MLOps platform that accelerates the deployment, integration, and monitoring of production-ready AI.
Features:
gitHubRepo: https://github.com/modzy
YouTube demo link: https://youtu.be/TluT0ZG-QRM
Hello! We have a new MLOps tool we'd love to add to mlops.toys!
You can find the repo here: https://github.com/iterative/terraform-provider-iterative
Read the blog post: https://dvc.org/blog/terraform-provider
Watch the video: https://youtu.be/2fEgO8SazSE
Let me know if you need anything else or would like to collaborate in some way!
I'm not 100% sure it fits your criteria, but I think Kedro would be of interest to people who land here