Chapter 3.3 - Build and publish the model with BentoML and Docker locally¶
Introduction¶
Serving the model locally is great for testing purposes, but it is not sufficient for production. In this chapter, you will learn how to build and publish the model with BentoML and Docker.
This will allow you to share the model with others and deploy it on a Kubernetes cluster in a later chapter.
In this chapter, you will learn how to:
- Create a BentoML model artifact
- Containerize the model artifact with BentoML and Docker
- Test the containerized model artifact by serving it locally with Docker
- Create a container registry that will serve as your model registry
- Publish the containerized model artifact Docker image to the container registry
The following diagram illustrates the control flow of the experiment at the end of this chapter:
```mermaid
flowchart TB
dot_dvc[(.dvc)] <-->|dvc pull
dvc push| s3_storage[(S3 Storage)]
dot_git[(.git)] <-->|git pull
git push| gitGraph[Git Remote]
workspaceGraph <-....-> dot_git
data[data/raw]
subgraph cacheGraph[CACHE]
dot_dvc
dot_git
bento_artifact[(Containerized
artifact)]
end
subgraph remoteGraph[REMOTE]
s3_storage
subgraph gitGraph[Git Remote]
action[Action] <--> |...|repository[(Repository)]
end
registry[(Container
registry)]
end
subgraph workspaceGraph[WORKSPACE]
data --> code[*.py]
subgraph dvcGraph["dvc.yaml"]
code
end
params[params.yaml] -.- code
code <--> bento_model[classifier.bentomodel]
subgraph bentoGraph[bentofile.yaml]
bento_model
serve[serve.py] <--> bento_model
fastapi[FastAPI] <--> |bento serve|serve
end
bentoGraph -->|bento build
bento containerize| bento_artifact
bento_model <-.-> dot_dvc
bento_artifact -->|docker tag
docker push| registry
end
subgraph browserGraph[BROWSER]
localhost <--> |docker run|bento_artifact
localhost <--> |bento serve| fastapi
end
style workspaceGraph opacity:0.4,color:#7f7f7f80
style dvcGraph opacity:0.4,color:#7f7f7f80
style cacheGraph opacity:0.4,color:#7f7f7f80
style data opacity:0.4,color:#7f7f7f80
style dot_git opacity:0.4,color:#7f7f7f80
style dot_dvc opacity:0.4,color:#7f7f7f80
style code opacity:0.4,color:#7f7f7f80
style serve opacity:0.4,color:#7f7f7f80
style bento_model opacity:0.4,color:#7f7f7f80
style fastapi opacity:0.4,color:#7f7f7f80
style params opacity:0.4,color:#7f7f7f80
style s3_storage opacity:0.4,color:#7f7f7f80
style repository opacity:0.4,color:#7f7f7f80
style action opacity:0.4,color:#7f7f7f80
style remoteGraph opacity:0.4,color:#7f7f7f80
style gitGraph opacity:0.4,color:#7f7f7f80
linkStyle 0 opacity:0.4,color:#7f7f7f80
linkStyle 1 opacity:0.4,color:#7f7f7f80
linkStyle 2 opacity:0.4,color:#7f7f7f80
linkStyle 3 opacity:0.4,color:#7f7f7f80
linkStyle 4 opacity:0.4,color:#7f7f7f80
linkStyle 5 opacity:0.4,color:#7f7f7f80
linkStyle 6 opacity:0.4,color:#7f7f7f80
linkStyle 7 opacity:0.4,color:#7f7f7f80
linkStyle 8 opacity:0.4,color:#7f7f7f80
linkStyle 10 opacity:0.4,color:#7f7f7f80
```
Steps¶
Create a BentoML model artifact¶
A BentoML model artifact (called "Bento" in the documentation) packages your model, code, and environment dependencies into a single file. It is the standard format for saving and sharing ML models.
The BentoML model artifact is described in a `bentofile.yaml` file. It contains the following information:
- The service filename and class name
- The Python packages required to run the service
- The Docker configuration, such as the Python version to use
Create a new `bentofile.yaml` file in the `src` directory with the following content:
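A plausible version of the file is shown below; the service class name, the package list, and the Python version are assumptions to adapt to your own project:

```yaml
# src/bentofile.yaml — hypothetical example
service: "serve:CelestialBodiesClassifier"  # service filename and class name (assumed)
include:
  - "serve.py"  # the FastAPI serving code from the previous chapter
python:
  packages:  # packages needed to run the service only (no DVC or training deps)
    - fastapi
    - tensorflow
docker:
  python_version: "3.12"  # pin the Python version used in the image
```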
Do not forget to include the `serve.py` file in the BentoML model artifact. This file contains the code to serve the model with FastAPI, as you have seen in the previous chapter.
The `python` section lists the Python packages required to run the service. It does not include DVC or the other packages used to build the model, as they are not needed at serving time.
The `docker` section contains the Python version to use. It is important to specify the Python version to ensure the service runs correctly.
Now that the `bentofile.yaml` file is created, you can serve the model with the following command:
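A plausible command, assuming the hypothetical service class from the file above and that you run it from the project root:

```sh
# Serve the service defined in src/serve.py (class name is an assumption)
bentoml serve serve:CelestialBodiesClassifier --working-dir src
```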
Build the BentoML model artifact¶
Before containerizing the BentoML model artifact with Docker, you need to build it.
A BentoML model artifact can be built with the following command:
Execute the following command(s) in a terminal:
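A minimal sketch, assuming the `bentofile.yaml` lives in the `src` directory as created above:

```sh
# Build the Bento described by src/bentofile.yaml
bentoml build src
```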
The output should be similar to this:
All Bentos can be listed with the following command:
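```sh
# List all Bentos in the local Bento store
bentoml list
```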
The output should be similar to this:
Containerize the BentoML model artifact with Docker¶
Now that the BentoML model artifact is built, you can containerize it with the following command:
Execute the following command(s) in a terminal:
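A plausible version, assuming the Bento is named `celestial_bodies_classifier` and the resulting Docker image is tagged `celestial-bodies-classifier`:

```sh
# Build a Docker image from the latest version of the Bento
bentoml containerize celestial_bodies_classifier:latest \
    --image-tag celestial-bodies-classifier:latest
```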
The first `:latest` is the tag of the BentoML model artifact. It is a symlink to the latest version of the BentoML model artifact.
The output should be similar to this:
Test the containerized BentoML model artifact locally¶
The BentoML model artifact is now containerized. To verify its behavior, serve the model artifact locally by running the Docker image:
Execute the following command(s) in a terminal:
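A plausible command, assuming the image tag from the previous step; BentoML services listen on port 3000 by default:

```sh
# Serve the containerized model on http://localhost:3000
docker run --rm -p 3000:3000 celestial-bodies-classifier:latest
```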
Congrats! You have successfully containerized the BentoML model artifact using Docker. You have also tested the container by running it locally. The model is now ready to be shared on a container registry.
Create a container registry¶
A container registry is a crucial component that provides a centralized system to manage Docker images. It serves as a repository for storing, versioning, and tracking the Docker images of the models built with BentoML, as each version comes with essential metadata, including training data, hyperparameters, and performance metrics.
This comprehensive information ensures reproducibility by preserving historical model versions, which aids in debugging and auditing. Additionally, it promotes transparency and simplifies model comparison and selection for deployment, allowing for seamless integration into production environments.
The model registry also facilitates collaboration among team members, enabling standardized model formats and easy sharing of access. Its support for automated deployment pipelines ensures consistent and reliable model deployment, allowing for efficient model management.
To improve the deployment process on the Kubernetes server, you will use Google Artifact Registry as the ML model registry to publish and pull Docker images.
Enable the Google Artifact Registry API
You must enable the Google Artifact Registry API before you can create a container registry on Google Cloud.
Tip
You can display the available services in your project with the following command:
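```sh
# Display every service that can be enabled in the project
gcloud services list --available
```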
Execute the following command(s) in a terminal:
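```sh
# Enable the Google Artifact Registry API
gcloud services enable artifactregistry.googleapis.com
```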
Create the Google Container Registry
Export the repository name as an environment variable. Replace `<my_repository_name>` with a registry name of your choice. It must be lowercase, with words separated by hyphens.
Warning

The container registry name must be unique across all Google Cloud projects and users. For example, use `mlops-<surname>-registry`, where `surname` is based on your name. Change the container registry name if the command fails.
Execute the following command(s) in a terminal:
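A minimal sketch; the variable name `GCP_CONTAINER_REGISTRY_NAME` is an assumption:

```sh
# Replace <my_repository_name> with your lowercase, hyphen-separated name
export GCP_CONTAINER_REGISTRY_NAME=<my_repository_name>
```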
Export the repository location as an environment variable. You can view the available locations at Cloud locations. You should ideally select a location close to where most of the expected traffic will come from. Replace `<my_repository_location>` with your own location. For example, use `europe-west6` for Switzerland (Zurich):
Execute the following command(s) in a terminal:
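```sh
# Variable name is an assumption, consistent with the previous step
export GCP_CONTAINER_REGISTRY_LOCATION=<my_repository_location>
```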
Lastly, when creating the repository, remember to specify the repository format as `docker`.
Execute the following command(s) in a terminal:
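```sh
# Create the repository with the docker format in the chosen location
gcloud artifacts repositories create $GCP_CONTAINER_REGISTRY_NAME \
    --repository-format=docker \
    --location=$GCP_CONTAINER_REGISTRY_LOCATION
```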
The output should be similar to this:
This guide has been written with Google Cloud in mind. We are open to contributions to add support for other cloud providers such as Amazon Web Services, Exoscale, Microsoft Azure or Self-hosted Kubernetes but we might not officially support them.
If you want to contribute, please open an issue or a pull request on the GitHub repository. Your help is greatly appreciated!
Log in to the remote Container Registry¶
Authenticate with the Google Artifact Registry
Configure gcloud to use the Google Artifact Registry as a Docker credential helper.
Execute the following command(s) in a terminal:
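```sh
# Register gcloud as the Docker credential helper for the registry host
gcloud auth configure-docker ${GCP_CONTAINER_REGISTRY_LOCATION}-docker.pkg.dev
```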
Press Y to validate the changes.
Export the container registry host:
Execute the following command(s) in a terminal:
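A minimal sketch; the variable name is an assumption and the host follows Artifact Registry's `<location>-docker.pkg.dev` pattern:

```sh
export GCP_CONTAINER_REGISTRY_HOST=${GCP_CONTAINER_REGISTRY_LOCATION}-docker.pkg.dev
```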
Tip
To get the ID of your project, you can use the Google Cloud CLI.
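```sh
# List your projects and their IDs
gcloud projects list
```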
The output should be similar to this:
Copy the `PROJECT_ID` and export it as an environment variable. Replace `<my_project_id>` with your own project ID:
Execute the following command(s) in a terminal:
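```sh
# Variable name is an assumption
export GCP_PROJECT_ID=<my_project_id>
```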
This guide has been written with Google Cloud in mind. We are open to contributions to add support for other cloud providers such as Amazon Web Services, Exoscale, Microsoft Azure or Self-hosted Kubernetes but we might not officially support them.
If you want to contribute, please open an issue or a pull request on the GitHub repository. Your help is greatly appreciated!
Publish the BentoML model artifact Docker image to the container registry¶
The BentoML model artifact Docker image can be published to the container registry with the following commands:
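A plausible version of these commands; the image name and the registry path layout (host/project/repository/image) are assumptions based on the earlier steps:

```sh
# Tag the local image with the full remote registry path
docker tag celestial-bodies-classifier:latest \
    $GCP_CONTAINER_REGISTRY_HOST/$GCP_PROJECT_ID/$GCP_CONTAINER_REGISTRY_NAME/celestial-bodies-classifier:latest

# Push the tagged image to the container registry
docker push \
    $GCP_CONTAINER_REGISTRY_HOST/$GCP_PROJECT_ID/$GCP_CONTAINER_REGISTRY_NAME/celestial-bodies-classifier:latest
```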
The image is now available in the container registry. You can use it from anywhere using Docker or Kubernetes.
Open the container registry interface on the cloud provider and check that the artifact files have been uploaded.
Open the Artifact Registry on the Google Cloud interface and click on your registry to access the details.
This guide has been written with Google Cloud in mind. We are open to contributions to add support for other cloud providers such as Amazon Web Services, Exoscale, Microsoft Azure or Self-hosted Kubernetes but we might not officially support them.
If you want to contribute, please open an issue or a pull request on the GitHub repository. Your help is greatly appreciated!
Check the changes¶
Check the changes with Git to ensure that all the necessary files are tracked:
Execute the following command(s) in a terminal:
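For example:

```sh
# Stage the new files and review what will be committed
git add .
git status
```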
The output should look similar to this:
Commit the changes to Git¶
Commit the changes to Git.
Execute the following command(s) in a terminal:
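```sh
# The commit message is only a suggestion
git commit -m "BentoML can containerize the model with Docker"
```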
Summary¶
Congratulations! You have successfully prepared the model for deployment in a production environment.
In this chapter, you have successfully:
- Created and containerized a BentoML model artifact
- Published the BentoML model artifact Docker image to the container registry
State of the MLOps process¶
- Notebook has been transformed into scripts for production
- Codebase and dataset are versioned
- Steps used to create the model are documented and can be re-executed
- Changes done to a model can be visualized with parameters, metrics and plots to identify differences between iterations
- Codebase can be shared and improved by multiple developers
- Dataset can be shared among the developers and is placed in the right directory in order to run the experiment
- Experiment can be executed on a clean machine with the help of a CI/CD pipeline
- CI/CD pipeline is triggered on pull requests and reports the results of the experiment
- Changes to model can be thoroughly reviewed and discussed before integrating them into the codebase
- Model can be saved and loaded with all required artifacts for future usage
- Model can be easily used outside of the experiment context
- Model requires manual publication to the artifact registry
- Model is accessible from the Internet and can be used anywhere
- Model requires manual deployment on the cluster
- Model cannot be trained on hardware other than the local machine
- Model cannot be trained on custom hardware for specific use-cases
You will address these issues in the next chapters for improved efficiency and collaboration. Continue the guide to learn how.