Chapter 2.1 - Move the ML experiment code to the cloud

Introduction

Now that you have configured DVC and can reproduce the experiment, let's set up a remote repository for sharing the code with the team.

By linking your local project to a remote repository on platforms like GitHub or GitLab, you can easily push, pull, and synchronize changes with your team.

The following diagram illustrates the control flow of the experiment at the end of this chapter:

flowchart TB
    dot_dvc[(.dvc)]
    dot_git[(.git)] <-->|git push
                         git pull| gitGraph[Git Remote]
    workspaceGraph <-....-> dot_git
    data[data/raw] <-.-> dot_dvc
    subgraph remoteGraph[REMOTE]
        subgraph gitGraph[Git Remote]
            repository[(Repository)]
        end
    end
    subgraph cacheGraph[CACHE]
        dot_dvc
        dot_git
    end
    subgraph workspaceGraph[WORKSPACE]
        prepare[prepare.py] <-.-> dot_dvc
        train[train.py] <-.-> dot_dvc
        evaluate[evaluate.py] <-.-> dot_dvc
        data --> prepare
        subgraph dvcGraph["dvc.yaml (dvc repro)"]
            prepare --> train
            train --> evaluate
        end
        params[params.yaml] -.- prepare
        params -.- train
        params <-.-> dot_dvc
    end
    style workspaceGraph opacity:0.4,color:#7f7f7f80
    style dvcGraph opacity:0.4,color:#7f7f7f80
    style cacheGraph opacity:0.4,color:#7f7f7f80
    style dot_dvc opacity:0.4,color:#7f7f7f80
    style data opacity:0.4,color:#7f7f7f80
    style prepare opacity:0.4,color:#7f7f7f80
    style train opacity:0.4,color:#7f7f7f80
    style evaluate opacity:0.4,color:#7f7f7f80
    style params opacity:0.4,color:#7f7f7f80
    linkStyle 1 opacity:0.4,color:#7f7f7f80
    linkStyle 2 opacity:0.4,color:#7f7f7f80
    linkStyle 3 opacity:0.4,color:#7f7f7f80
    linkStyle 4 opacity:0.4,color:#7f7f7f80
    linkStyle 5 opacity:0.4,color:#7f7f7f80
    linkStyle 6 opacity:0.4,color:#7f7f7f80
    linkStyle 7 opacity:0.4,color:#7f7f7f80
    linkStyle 8 opacity:0.4,color:#7f7f7f80
    linkStyle 9 opacity:0.4,color:#7f7f7f80
    linkStyle 10 opacity:0.4,color:#7f7f7f80
    linkStyle 11 opacity:0.4,color:#7f7f7f80

Create a remote Git repository

Create a Git repository on your preferred service to collaborate with peers. For example, choose mlops-guide as the repository name.

GitHub

Important

Configure the repository as you wish, but do not check the "Add a README file", "Add .gitignore", or "Choose a license" boxes. Keeping the remote empty lets you push your existing local history without conflicts.

Create a new GitHub repository for this chapter by accessing https://github.com/new.

GitLab

Important

Configure the repository as you wish, but do not check the "Initialize repository with a README" box. Keeping the remote empty lets you push your existing local history without conflicts.

Create a new GitLab blank project for this chapter by accessing https://gitlab.com/projects/new.

Configure Git for the remote branch

Using the SSH protocol, you can connect and authenticate to your Git service provider without supplying your username and personal access token each time you want to share your changes.
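
As a minimal sketch of the key-generation step, the following uses ssh-keygen to create an Ed25519 key pair. The email comment, the throwaway directory, and the empty passphrase are assumptions made so the example runs non-interactively. For real use, follow your provider's guide, keep the default key location, and set a passphrase.

```shell
# Demo only: write the key pair to a throwaway directory instead of ~/.ssh.
key_dir="$(mktemp -d)"

# -t ed25519: key type; -C: comment label (hypothetical email);
# -f: output file; -N "": empty passphrase (demo only); -q: quiet.
ssh-keygen -t ed25519 -C "you@example.com" -f "$key_dir/id_ed25519" -N "" -q

# The public half (.pub) is the part you register with GitHub or GitLab.
cat "$key_dir/id_ed25519.pub"
```

Once the public key is registered with your provider, running ssh -T git@github.com or ssh -T git@gitlab.com is a common way to check that the connection works.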

GitHub

Generate an SSH key pair and configure your SSH access by following the guide "Connecting to GitHub with SSH".

Add the remote origin to your Git repository using SSH. For example, replace <my_git_repository_url> with git@github.com:<my_username>/<my_repository_name>.git.

GitLab

Generate an SSH key pair and configure your SSH access by following the guide "Use SSH keys to communicate with GitLab".

Add the remote origin to your Git repository using SSH. For example, replace <my_git_repository_url> with git@gitlab.com:<my_username>/<my_repository_name>.git.

Execute the following command(s) in a terminal
# Add the remote origin
git remote add origin <my_git_repository_url>
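
To confirm the remote was registered as intended, you can read it back. The sketch below runs in a throwaway repository so it does not touch your project, and the URL (my_username, mlops-guide) is a hypothetical placeholder to substitute with your own.

```shell
# Throwaway repository so the example does not touch your project.
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"

# Hypothetical SSH URL; substitute your own username and repository name.
git remote add origin git@github.com:my_username/mlops-guide.git

# Print the URL registered for "origin".
remote_url="$(git remote get-url origin)"
echo "$remote_url"
```

If the URL turns out to be wrong, git remote set-url origin <my_git_repository_url> replaces it without removing the remote.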

Push the changes to Git

Push your changes and set the remote branch as the upstream of your local main branch:

Execute the following command(s) in a terminal
# Push main to origin and set it as the upstream branch
git push -u origin main

After setting the upstream branch, you can simply use git push and git pull without additional arguments to interact with the remote branch.
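
The effect of -u can be checked from the terminal. The following is a sketch under stated assumptions: a local bare repository stands in for GitHub or GitLab, and the commit author and file contents are made up for the demo.

```shell
# A local bare repository stands in for the hosted remote (demo assumption).
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

# Create a project with one commit on "main".
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main
echo "demo" > README.md
git add README.md
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial commit"

# Same two steps as in this chapter: add the remote, then push with -u.
git remote add origin "$work/remote.git"
git push -q -u origin main

# Show which remote branch the local "main" now tracks.
upstream="$(git rev-parse --abbrev-ref '@{upstream}')"
echo "$upstream"
```

Because the tracking relationship is stored in the local repository configuration, it only needs to be set once per branch.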

Check the results

Open your repository on GitHub or GitLab. The files you pushed should now be listed there.

This chapter is now complete. Please review the summary for a recap of the key points.

Summary

Congratulations! You now have a codebase that can be used and shared among the team.

In this chapter, you have successfully:

Set up a remote Git repository
Added the remote to your local Git repository
Pushed your changes to the remote Git repository

You fixed some of the previous issues:

The codebase no longer needs to be downloaded manually and is versioned

Another member of your team can easily clone the experiment with the following command:

Execute the following command(s) in a terminal
# Clone the Git repository
git clone <my_git_repository_url>
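
A clone already has origin configured, so a teammate can use git pull and git push right away. The sketch below illustrates this round trip with a local bare repository standing in for the hosted remote. The directory names (author, teammate) and the commit details are assumptions for the demo.

```shell
# A local bare repository stands in for the hosted remote (demo assumption).
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
# Point the bare repository's HEAD at "main" so clones check it out.
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/main

# Author side: one commit pushed to the remote.
git init -q "$work/author"
cd "$work/author"
git symbolic-ref HEAD refs/heads/main
echo "demo" > README.md
git add README.md
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q -u origin main

# Teammate side: clone, then confirm "origin" and the commit match.
git clone -q "$work/remote.git" "$work/teammate"
cd "$work/teammate"
clone_remote="$(git remote get-url origin)"
clone_head="$(git rev-parse HEAD)"
author_head="$(git -C "$work/author" rev-parse HEAD)"
echo "$clone_remote"
```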

You can now safely continue to the next chapter.

State of the MLOps process

You will address these issues in the next chapters for improved efficiency and collaboration. Continue the guide to learn how.
