
aporia-ai / mlops.toys


🎲 A curated list of MLOps projects, tools and resources

Home Page: https://mlops.toys

License: Creative Commons Attribution 4.0 International

Languages: JavaScript 28.69%, SCSS 2.17%, Vue 69.14%
Topics: awesome, awesome-list, data-science, list, machine-learning, mlops

mlops.toys's People

Contributors

alex000kim, alexiguazio, alongubkin, aporia-oncall, bmunday3-zz, deanp70, dleybz, gblpedia, idonov8, kwj2104, marcin-laskowski, ncilfone, nicarod, omesser, peacing, rasapala, shourysharma, skogstrom, snyk-bot, weiloon-datature, yanivzoh, yannickperrenet


mlops.toys's Issues

New Tool: Wallaroo.AI

New tool to add: Wallaroo.AI, a platform to deploy, manage, and observe any model at scale across any environment, from cloud to edge. It lets you go from a Python notebook to inference in minutes.

Add lakeFS

Would love to see lakeFS added to this cool project!

Name: lakeFS
Category: Data Versioning
URL: https://lakefs.io

Description:
lakeFS is an open-source data lake management platform that transforms your object storage into a Git-like repository. lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data.

lakeFS features:

  • Exabyte-scale version control
  • Git-like operations: branch, commit, merge, revert
  • Zero-copy branching for frictionless experiments
  • Full reproducibility of data and code
  • Pre-commit/merge hooks for data CI/CD
  • Instantly revert changes to data
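
To make the Git-like model in the list above more concrete, here is a toy in-memory sketch of zero-copy branching, commit, merge, and revert. It is not the lakeFS API: `ToyRepo`, `Commit`, and every method name are hypothetical and exist only for illustration. The point it tries to show is that a branch can be a pointer to an immutable commit, so creating one copies no data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot mapping object paths to content."""
    objects: Dict[str, str]
    parent: Optional["Commit"] = None


class ToyRepo:
    """Hypothetical toy model of branch/commit/merge/revert; NOT the lakeFS API."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit(objects={})}

    def branch(self, name: str, source: str = "main") -> None:
        # Zero-copy: the new branch points at the same commit object.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, changes: Dict[str, str]) -> None:
        # A commit is a new snapshot whose parent is the previous head.
        head = self.branches[branch]
        self.branches[branch] = Commit({**head.objects, **changes}, parent=head)

    def merge(self, source: str, dest: str) -> None:
        # Naive merge: values from the source branch win on conflict.
        self.commit(dest, self.branches[source].objects)

    def revert(self, branch: str) -> None:
        # Reverting moves the branch pointer back to the parent commit.
        head = self.branches[branch]
        if head.parent is not None:
            self.branches[branch] = head.parent


repo = ToyRepo()
repo.commit("main", {"data/a.csv": "v1"})
repo.branch("experiment")                        # no data is copied here
repo.commit("experiment", {"data/a.csv": "v2"})  # main is unaffected
```

A real system also has to handle conflicts, garbage collection, and object storage; none of that is modeled here.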

Getting started video: https://www.youtube.com/watch?v=xThorxDzmrw&t=5s

Add TerminusDB

Suggestion to include TerminusDB in the list of MLOps tooling:

TerminusDB is an open-source graph platform and document store. It is designed for building data-intensive applications and knowledge graphs.

  • Quickly build versioned, bitemporal data products and give access to your domain teams
  • Visually construct data models, which are easy, maintainable, and enforced
  • Share your work and collaborate with colleagues
  • Versioning first - a full audit log with commit history. You can see how data has changed, query diffs, and roll back errors. You can also time-travel to any point in the data's history, so you always know what happened.
  • Data lineage - where data comes from and how it got here

Link to introductory video: https://youtu.be/RNeYYvYIZbs

Try TerminusDB

Add Activeloop

Hey there team Aporia!

Would love for our stack to be featured on the list, but I don't think there's a good category.

Our stack comprises Activeloop Hub, our open-source dataset format for AI (it allows streaming, version control, and querying of data in a tensor-based format), as well as our platform, which helps visualize, version-control, and query image, video, and audio data and plug it into TF/PT/other frameworks.

Would you be able to point me in the right direction?

MarkovML - Data to Gen AI Faster

MarkovML is an easy-to-use no-code platform for understanding data, streamlining AI workflows, and building apps, so you get from data to actionable AI faster. It allows you to:

  • Perform Data Analysis: Quickly analyze text-based datasets in just a few clicks, without writing code.
      • No-Code Auto EDA: Unlock deep insights from your data using our Auto Data Analyzers powered by AI. Identify data gaps, outliers, and patterns to make informed modeling decisions.
      • Collaborative Reporting: Create and share comprehensive visual reports together to eliminate scattered information, siloed knowledge, and disconnected communication.

  • Easy Organization And Discovery: Use our Intelligent Data Catalog to organize AI data, metrics, and insights from all your ML workflows in one centralized place for seamless discovery, traceability, and lineage.

  • Build Hosted AI Application: Build interactive AI & GenAI apps from your data using a drag-and-drop interface.
      • Adopt GenAI With Ease: Boost your speed of innovation by streamlining Generative AI development with Mizzen, our no-code, intuitive drag-and-drop interface.
      • Seamlessly Versatile: Build a wide range of applications, from summarization and classification to semantic search and Q&A, with just a few clicks.
      • Build Confidently From Day One: Ensure robust data governance, privacy, and uncompromising security for your AI applications, freeing your team to focus on building better AI applications.

  • Automate Workflows: Create automated workflows using our intuitive no-code workflow builder.
      • Build Custom Workflows: Boost team productivity by building complex data workflows from scratch with our intuitive drag-and-drop interface.
      • Reusable Workflows: Save effort by using our pre-built templates or by reusing and sharing your previously created workflows.
      • Automate Manual Data Tasks: Save time and effort by automating monotonous data tasks, auto-scheduling workflows, and eliminating human errors.

Click here to Sign-up for free


Hello? Is this abandoned? 😅

Hi folks,
As I'm still in the MLOps domain, I found myself revisiting the site and repo to add more projects (from iterative.ai), but I see open PRs that are over a year old...

Is this project/website dead? It would be great to either revive or archive it (and kill the website if so) for clarity since some people may still use this to discover MLOps tools.
If you still want to keep this alive but could use assistance with PR reviews/curation every now and then, let me know.

(CC @SnirShechter @alongubkin)
Thanks!

Additional label/tag for Bodywork

Hello,

Many thanks for including Bodywork!

Is it possible to add Training Orchestration in addition to Model Serving, as we cover both in equal measure?

Many thanks,

Alex

Datapipe

Datapipe is a real-time, incremental Python ETL library for machine learning with record-level dependency tracking.

The library is designed for describing data processing pipelines and is capable of tracking dependencies for each record in the pipeline. This ensures that tasks within the pipeline receive only the data that has been modified, thereby improving the overall efficiency of data handling.

https://datapipe.dev/

Key Features:

  • Incremental Processing: datapipe processes only new or modified data, significantly reducing computation time and resource usage.

  • Real-time ETL: The library supports real-time data extraction, transformation, and loading.

  • Dependency Tracking: Automatic tracking of data dependencies and processing states.

  • Python Integration: Seamlessly integrates with Python applications, offering a Pythonic way to describe data pipelines.
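
The incremental-processing idea in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Datapipe's API: `IncrementalStep` and its methods are hypothetical names, and change detection by content hash is one possible approach chosen here for simplicity. The step remembers a hash of each record's last processed input and re-runs the transformation only for records that are new or whose input changed.

```python
import hashlib
from typing import Callable, Dict, List


class IncrementalStep:
    """Hypothetical helper (not part of Datapipe): re-runs `fn` only for changed records."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn
        self._seen: Dict[str, str] = {}    # record id -> hash of last processed input
        self.outputs: Dict[str, str] = {}  # record id -> last computed output

    def run(self, records: Dict[str, str]) -> List[str]:
        """Process new or modified records and return the ids that were recomputed."""
        recomputed = []
        for rec_id, value in records.items():
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            if self._seen.get(rec_id) != digest:
                self.outputs[rec_id] = self.fn(value)
                self._seen[rec_id] = digest
                recomputed.append(rec_id)
        return recomputed


step = IncrementalStep(str.upper)
step.run({"a": "cat", "b": "dog"})   # first run: both records are processed
step.run({"a": "cat", "b": "wolf"})  # second run: only "b" is reprocessed
```

In a multi-step pipeline the same bookkeeping would be kept per step, so that a change to one record propagates only through the steps that depend on it.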

Ideal projects for Datapipe

  • Projects with complex ML pipelines with a human-in-the-loop component

  • ML projects that require real-time model retraining based on newly labeled data

  • Projects that require content moderation

GitHub

https://github.com/epoch8/datapipe – Datapipe Core

https://github.com/epoch8/datapipe-examples/ – Usage examples


Overlapping features

Hi, I can't help but notice that many of the tools listed can fulfill more than one aspect of MLOps. Should this be broken down for each product so it's more obvious?

[Request] FuseML - Open Source AI Orchestrator

Hi there, thanks for this amazing collection of tools and platforms. I was wondering whether you could also add FuseML. It's a new project that I and three others are running as an incubation project at SUSE, and we are looking for help from other contributors.
We have a micro-website (https://fuseml.github.io) and, of course, a GitHub repo (https://github.com/fuseml).

A brief description of the project:
FuseML allows you to re-use existing components and tools to create and manage the end-to-end ML lifecycle.
Optimize AI workloads with a low-code approach and an extensible framework.
It allows teams to quickly iterate and re-use existing, well-known tools for rapid experimentation and fast production releases.
It's all about Open Source, MLOps and fast delivery.

I hope this is enough. We also have a YouTube channel.

Adding MLRun

name: MLRun
buttonText: Try MLRun
link: 'https://mlrun.org'
category: Feature Store, Model Monitoring, Model Serving, Training Orchestration, Experiment Tracking
description: >-
MLRun is an end-to-end open-source MLOps solution to manage and automate your entire analytics and machine learning lifecycle, from data ingestion, through model development to full pipeline deployment. MLRun eases the development of machine learning pipelines at scale and helps ML teams build a robust process for moving from the research phase to fully operational production deployments.

  • Feature and Artifact Store: handles the ingestion, processing, metadata, and storage of data and features across multiple repositories and technologies.

  • Elastic Serverless Runtimes: converts simple code to scalable and managed microservices with workload-specific runtime engines (such as Kubernetes jobs, Nuclio, Dask, Spark, and Horovod).

  • ML Pipeline Automation: automates data preparation, model training and testing, deployment of real-time production pipelines, and end-to-end monitoring.

  • Central Management: provides a unified portal for managing the entire MLOps workflow. The portal includes a UI, a CLI, and an SDK, which are accessible from anywhere.

gitHubRepoName: mlrun/mlrun
youTubeVideoId: _3mxz3zMPpw


Add support for multiple categories per project

Hi maintainers,
We would like a way to add projects to multiple categories which is not possible today (I'm sure MLRun is not the only one).
I feel like this is a pretty basic need 😄

This is in the context of #9
For now I'll add a PR with MLRun in one category
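
One way the request above could be modeled is to store each project's category as a list and filter on membership. The sketch below is only an illustration: the `name` and `category` field names follow the listing format used in the issues, but storing `category` as a list, the `PROJECTS` sample data, and the `projects_in` helper are assumptions, not the site's actual implementation.

```python
from typing import Dict, List

# Assumption: `category` holds a list of category names instead of a single string.
PROJECTS: List[Dict] = [
    {
        "name": "MLRun",
        "category": [
            "Feature Store",
            "Model Monitoring",
            "Model Serving",
            "Training Orchestration",
            "Experiment Tracking",
        ],
    },
    {"name": "lakeFS", "category": ["Data Versioning"]},
    {"name": "Bodywork", "category": ["Model Serving", "Training Orchestration"]},
]


def projects_in(category: str, projects: List[Dict]) -> List[str]:
    """Return the names of all projects tagged with `category`, in listing order."""
    return [p["name"] for p in projects if category in p["category"]]


serving = projects_in("Model Serving", PROJECTS)  # MLRun and Bodywork both match
```

With this shape, a project appears under every category it lists, and single-category projects need no special handling.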

Filter by open source

Hi, great idea!

Would it be possible to add a filter so that viewers can choose to display only open source projects?

Cheers,
D.

[Request] Missing project - PrimeHub

Hi there,

Please help add the project PrimeHub. Many thanks.

PrimeHub, a Kubernetes-based collaborative ML platform for teams of data scientists and administrators.

  • Cluster Computing
  • One-Click Notebook Environments
  • Group-centric Dataset Management / Resources Management / Access-control Management
  • Custom Machine Learning Environments
  • Model Tracking and Deployment
  • Capability Augmentation with 3rd-party Apps

It equips administrators with group-centric management and eases MLOps for data scientists with pluggable capabilities.

Try PrimeHub CE
Try PrimeHub

Automate your cycle of Intelligence

Katonic MLOps Platform is a collaborative platform with a Unified UI to manage all data science activities in one place and introduce MLOps practice into the production systems of customers and developers. It is a collection of cloud-native tools for all of these stages of MLOps:

  • Data exploration
  • Feature preparation
  • Model training/tuning
  • Model serving, testing and versioning

Katonic is for both data scientists and data engineers looking to build production-grade machine learning implementations, and it can run either locally in your development environment or on a production cluster. Katonic provides a unified system that leverages Kubernetes for containerization and scalability, which makes its pipelines portable and repeatable.

It would be great if you could add it to your list.

Website: https://katonic.ai/
Attachment: Katonic One Pager.pdf

New Project/Company recommendation

I'd like to suggest our company as an addition to the MLOps list.

Name: OctoML
Suggested new category: ML model optimization and acceleration

URL: https://octoml.ai/

Description:
OctoML automatically optimizes machine learning models to deliver up to 30x faster inference or prediction time, without sacrificing accuracy.

Deep learning models optimized with our open-source Apache TVM technology have less user-perceived lag, maximize hardware utilization, reduce deployment costs, and are energy efficient on edge/IoT devices.

We also comprehensively benchmark customers’ models across CPU, GPU and Accelerator chips to help select the ideal hardware, balancing cost and performance.

How does OctoML speed up your machine learning predictions automatically?
Built on Apache TVM, the OctoML platform does the hard work of automatically making a model production-ready. Our technology uses machine learning to search the space of possible optimizations for a given model, freeing machine learning engineers from having to do it manually using specialized vendor/kernel libraries. It works by running experiments against the target hardware (CPU, GPU etc) to learn how the hardware behaves when certain automatically chosen optimizations are applied. We explore thousands to millions of permutations of a model. When the process is finished, we deliver a fast, energy efficient and accurate model ready to be pushed to production.
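
The search-by-measurement loop described above can be illustrated with a deliberately simplified sketch. It is not the OctoML or Apache TVM API, and it swaps the ML-guided search from the description for a plain exhaustive search: the `tune` function, the `tile` and `unroll` parameters, and the `fake_latency` cost function are all hypothetical. In a real setting, `measure` would run the model with the given configuration on the target hardware and return its latency.

```python
import itertools
from typing import Callable, Dict, Iterable, Optional, Tuple


def tune(
    space: Dict[str, Iterable],
    measure: Callable[[Dict], float],
) -> Tuple[Optional[Dict], float]:
    """Try every combination in `space` and return the config with the lowest cost."""
    best_cfg, best_cost = None, float("inf")
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        cost = measure(cfg)  # stands in for an experiment on the target hardware
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost


def fake_latency(cfg: Dict) -> float:
    # Hypothetical cost model used only to make the example deterministic.
    return abs(cfg["tile"] - 16) + (0.0 if cfg["unroll"] else 1.0)


best, cost = tune({"tile": [4, 8, 16, 32], "unroll": [False, True]}, fake_latency)
```

Exhaustive search stops being practical once the space reaches the thousands to millions of candidates mentioned above, which is why a learned model is used to decide which configurations are worth measuring.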

Explainer video: https://www.youtube.com/watch?v=gpO4y1mPMWA

Missing project request!

Hi,

We are making it super easy for data scientists to deploy AI at inferrd.com. We'd love to be included on your website. Can I create a pull request?

Changes to Aim listing

Hello, I would like to make the following text changes to Aim and to add a demo video to our listing https://www.youtube.com/watch?v=g_rxmOiphgw&t=303s&ab_channel=DataTalksClub. Thank you very much!


An easy-to-use & supercharged open-source experiment tracker. Aim logs your training runs, enables a beautiful UI to compare them and an API to query them programmatically.

Why use Aim?

Compare runs easily to build models faster. Group and aggregate 100s of metrics. Analyze and learn correlations. Query with easy pythonic search.

Deep dive into details of each run for easy debugging. Explore hparams, metrics, images, distributions, audio, text, etc. Track plotly and matplotlib plots. Analyze system resource usage.

Have all relevant information centralized for easy governance. Centralized dashboard to view all your runs. Use SDK to query/access tracked runs. You own your data - Aim is open source and self hosted.

Add Syndicai

Hey, Great initiative!

Would be amazing to add Syndicai there. I already prepared a PR, so you can just have a look!

Adding Modzy

Name: Modzy
buttonText: Try Modzy
link: https://www.modzy.com/try-free/
category: Model Monitoring, Model Serving, Experiment Tracking, Explainability
description: >- Modzy is an MLOps platform that accelerates the deployment, integration, and monitoring of production-ready AI.

Features:

  • Easy model deployment and monitoring for data scientists, with drift detection and explainability
  • APIs and SDKs in Python, Java, JavaScript, and Go for developers to integrate AI models into any application
  • Support for cloud, on-premise, edge or hybrid deployments, with military-grade security

gitHubRepo: https://github.com/modzy
YouTube demo link: https://youtu.be/TluT0ZG-QRM

