Tools

Introduction to the tools used in this guide.

What are the tools used in this guide?

In this guide, you will use the following tools to demonstrate the MLOps process:

Using another cloud provider? Read this!

This guide was written with Google Cloud in mind. Contributions that add support for other platforms, such as Amazon Web Services, Exoscale, Microsoft Azure, or self-hosted Kubernetes, are welcome, but we might not officially support them.

If you want to contribute, please open an issue or a pull request on the GitHub repository. Your help is greatly appreciated!

Each tool is covered in detail in the following parts of this guide.

This guide covers only the setup and use of the tools mentioned here, but alternative tools exist for each stage of the workflow.

The related tools listed below can be explored as alternatives. Another useful compilation of tools is available at https://mlops.toys.

Data management

These are alternatives to DVC.

  • LakeFS - Transform your data lake into a Git-like repository
  • DagsHub - Open Source Data Science Collaboration
  • DoltHub - DoltHub is where people collaboratively build, manage, and distribute structured data
  • Delta Lake - An open-source storage framework that enables building a Lakehouse architecture with compute engines

Monitoring/tracking

These are alternatives to CML.

  • Guild AI - An open source experiment tracking toolkit. Use it to build better machine learning models faster
  • Aim - An open-source, self-hosted ML experiment tracking tool
  • Evidently AI - A first-of-its-kind monitoring tool that makes debugging machine learning models simple and interactive

Data annotation

At the moment, Label Studio is the only solution that can annotate many kinds of data; competing tools each support only a specific kind of data. See the awesome-data-labeling Git repository for specific alternatives.

Model management/deployment

These are alternatives to BentoML.

  • Kubeflow - The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable
  • MLEM - The open-source tool to simplify your ML model deployments
  • Cog - An open-source tool that lets you package machine learning models in a standard, production-ready container

End-to-end

These tools can manage the entire lifecycle of an ML experiment. They were considered when work on this guide began, but they were omitted because most of them are opinionated and may lack the flexibility needed for the scope of this project.

  • MLflow - An open source platform for the machine learning lifecycle
  • MLRun - An open source framework to orchestrate MLOps from the research stage to production-ready AI applications