Bring an existing SageMaker Managed MLFlow into Amazon SageMaker Unified Studio

Jan 22, 2025

TL;DR: You have already created a SageMaker-managed MLFlow Tracking Server and now you want to use it from SageMaker Unified Studio. Two CLI commands are all it takes!

Image courtesy of mlflow.org

What is Amazon SageMaker Managed MLFlow?

Amazon SageMaker’s fully managed MLflow capability, which became generally available in June 2024, offers a streamlined solution for managing the complete machine learning lifecycle. The service allows data scientists and ML developers to easily set up and manage MLflow Tracking Servers with minimal effort. The offering consists of three core components:

  • an MLflow Tracking Server that serves the REST API endpoints used to track experiments,
  • a metadata store for persisting experiment-related information,
  • and an artifact store using Amazon S3 for secure storage of ML artifacts.

This managed service eliminates the undifferentiated heavy lifting of infrastructure management, provides comprehensive experiment tracking across various environments, supports full MLflow capabilities, and offers enhanced security through AWS IAM integration. The service is designed to scale efficiently, with a “Small” tracking server supporting teams of up to 25 users.

What is SageMaker Unified Studio?

Amazon SageMaker Unified Studio is a comprehensive environment that integrates data and AI tools for complete development workflows. It provides a unified experience for model development, generative AI app development, data processing, and SQL analytics in a single governed environment. For ML workflows specifically, it leverages SageMaker AI to offer fully managed infrastructure, tools, and workflows for each step of the model lifecycle. This includes data preparation, training, governance, MLOps, inference, experimentation, pipelines, and model monitoring and evaluation. Users can access their data stored in various sources like Amazon S3, Amazon Redshift, and other data sources through the Amazon SageMaker Lakehouse. The platform also integrates with Amazon Q Developer to assist with tasks across the development lifecycle, including data discovery, ML model building, and code authoring, making the entire ML workflow more streamlined and efficient.

How to create/connect MLFlow Tracking Server in SageMaker Unified Studio

There are two ways to make sure you can use MLFlow Tracking Server in SageMaker Unified Studio. Both methods assume you’ve already created your Project using your preferred project profile. To learn how to do so, check out the SageMaker Unified Studio documentation.

1. Create a new MLFlow Tracking Server directly in SageMaker Unified Studio

Super easy: the clicks take less than 10 seconds!

If you have the right permissions (that is, your project profile allows it), you can request a new MLFlow Tracking Server directly from the SageMaker Unified Studio UI in less than 10 seconds; the server itself then takes a few more minutes to provision:

  1. Open your Project
  2. Click on Compute
  3. Select MLFlow Tracking Servers
  4. Click Create MLFlow Tracking Server
  5. Select your configuration and provide a name
  6. Click on Create MLFlow Tracking Server
  7. Wait ~10–15 minutes for the server to be provisioned

Once provisioning completes, your MLFlow Tracking Server is ready to go! All you need to do is add the right code to your notebook or training job to track your experiments. Before getting into the code, retrieve the MLFlow Tracking Server ARN by clicking the Copy ARN button in the Compute tab.

Select this button
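
If you would rather not click through the UI, the ARN can probably also be looked up programmatically. This is a minimal sketch, assuming boto3 is installed and configured for the account and Region that host the tracking server; find_tracking_server_arn is a hypothetical helper name, not part of any AWS SDK.

```python
def find_tracking_server_arn(sagemaker_client, name):
    """Return the ARN of the tracking server called `name`, or None if absent.

    `sagemaker_client` is assumed to behave like boto3.client("sagemaker").
    """
    next_token = None
    while True:
        # Only pass NextToken once the service has handed one back.
        kwargs = {"NextToken": next_token} if next_token else {}
        page = sagemaker_client.list_mlflow_tracking_servers(**kwargs)
        for summary in page.get("TrackingServerSummaries", []):
            if summary.get("TrackingServerName") == name:
                return summary.get("TrackingServerArn")
        next_token = page.get("NextToken")
        if not next_token:
            return None


# Usage (needs AWS credentials; the server name is a placeholder):
#   import boto3
#   arn = find_tracking_server_arn(boto3.client("sagemaker"), "my-tracking-server")
```

The client is passed in as an argument so the lookup can be tried against a stub before pointing it at a real account.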

2. Import an existing MLFlow Tracking Server into SageMaker Unified Studio

Let’s assume you’ve created your MLFlow Tracking Server elsewhere, and now you want to access it from SageMaker Unified Studio. For example, let’s say you’ve created the MLFlow Tracking Server using the CLI command:

aws sagemaker create-mlflow-tracking-server \
--tracking-server-name my-manually-created-tracking-server \
--artifact-store-uri s3://[YOUR-BUCKET]/[YOUR-PATH]/mlflow \
--role-arn arn:aws:iam::[ACCOUNT-ID]:role/[YOUR-ROLE]
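
For readers who script this step in Python, the same call can be sketched with boto3, followed by a simple polling loop, since the server is not usable until creation finishes. This is an illustration under assumptions: create_tracking_server_and_wait is a hypothetical helper, the polling interval and limit are arbitrary, and the status strings ("Created", and values ending in "Failed") reflect my reading of the SageMaker API rather than anything stated in this post.

```python
import time


def create_tracking_server_and_wait(sagemaker_client, name, artifact_store_uri,
                                    role_arn, poll_seconds=30, max_polls=60,
                                    sleep=time.sleep):
    """Create a tracking server and poll until its status is 'Created'.

    `sagemaker_client` is assumed to behave like boto3.client("sagemaker").
    Returns the ARN of the new tracking server.
    """
    sagemaker_client.create_mlflow_tracking_server(
        TrackingServerName=name,
        ArtifactStoreUri=artifact_store_uri,
        RoleArn=role_arn,
    )
    for _ in range(max_polls):
        description = sagemaker_client.describe_mlflow_tracking_server(
            TrackingServerName=name
        )
        status = description["TrackingServerStatus"]
        if status == "Created":
            return description["TrackingServerArn"]
        if status.endswith("Failed"):
            raise RuntimeError(f"Tracking server {name} ended in status {status}")
        sleep(poll_seconds)
    raise TimeoutError(f"Tracking server {name} was not ready after {max_polls} polls")


# Usage (needs AWS credentials; bucket, path and role are placeholders):
#   import boto3
#   arn = create_tracking_server_and_wait(
#       boto3.client("sagemaker"),
#       "my-manually-created-tracking-server",
#       "s3://[YOUR-BUCKET]/[YOUR-PATH]/mlflow",
#       "arn:aws:iam::[ACCOUNT-ID]:role/[YOUR-ROLE]",
#   )
```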

To make it visible in SageMaker Unified Studio, you need to tag it with three tags: Environment ID, Project ID, and Domain ID. Two of them, Project ID and Domain ID, are available on the Project overview page, as indicated in the image below.

Where to find Project ID and Domain ID

Getting the Environment ID is a bit more involved. First of all, make sure your Project has the MLExperiments Blueprint attached to it.

Then run this command, using the Project ID and Domain ID obtained in the previous step (it requires jq to be installed):

aws datazone list-environments \
--domain-identifier [DOMAIN-ID-#3] \
--project-identifier [PROJECT-ID-#2] \
| jq -r '.items[] | select(.name == "MLExperiments") | .id'
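
If jq is not available, the same filter can be sketched in Python. This assumes a boto3 DataZone client and the same response shape the jq expression relies on (an items list whose entries carry name and id); find_environment_id is a hypothetical helper name.

```python
def find_environment_id(datazone_client, domain_id, project_id,
                        environment_name="MLExperiments"):
    """Return the ID of the project environment named `environment_name`.

    `datazone_client` is assumed to behave like boto3.client("datazone").
    Returns None when no environment with that name exists.
    """
    kwargs = {"domainIdentifier": domain_id, "projectIdentifier": project_id}
    while True:
        page = datazone_client.list_environments(**kwargs)
        for environment in page.get("items", []):
            if environment.get("name") == environment_name:
                return environment.get("id")
        next_token = page.get("nextToken")
        if not next_token:
            return None
        # Ask for the next page of environments.
        kwargs["nextToken"] = next_token


# Usage (needs AWS credentials; IDs are placeholders):
#   import boto3
#   env_id = find_environment_id(boto3.client("datazone"), "[DOMAIN-ID-#3]", "[PROJECT-ID-#2]")
```

A None result most likely means the MLExperiments Blueprint is not attached to the Project yet.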

Copy this value and use it, together with the Project ID and Domain ID, in this CLI command:

aws sagemaker add-tags \
--resource-arn arn:aws:sagemaker:[YOUR-REGION]:[YOUR-ACCOUNT-ID]:mlflow-tracking-server/[TRACKING-SERVER-NAME] \
--tags \
Key=AmazonDataZoneEnvironment,Value=[ENVIRONMENT-ID-#1] \
Key=AmazonDataZoneProject,Value=[PROJECT-ID-#2] \
Key=AmazonDataZoneDomain,Value=[DOMAIN-ID-#3]
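
The tagging step can also be sketched with boto3, which makes it easy to read the tags back and confirm they were applied. The tag keys and the ARN layout are taken from the command above; the helper names are hypothetical, and the "aws" partition is hard-coded as in the CLI example.

```python
def build_tracking_server_arn(region, account_id, name):
    """Assemble the tracking server ARN in the layout used by the CLI command."""
    return f"arn:aws:sagemaker:{region}:{account_id}:mlflow-tracking-server/{name}"


def build_datazone_tags(environment_id, project_id, domain_id):
    """Return the three tags in the shape boto3's add_tags expects."""
    return [
        {"Key": "AmazonDataZoneEnvironment", "Value": environment_id},
        {"Key": "AmazonDataZoneProject", "Value": project_id},
        {"Key": "AmazonDataZoneDomain", "Value": domain_id},
    ]


def tag_tracking_server(sagemaker_client, arn, tags):
    """Apply `tags`, then read them back; True if every tag is present.

    `sagemaker_client` is assumed to behave like boto3.client("sagemaker").
    Only the first page of list_tags results is checked in this sketch.
    """
    sagemaker_client.add_tags(ResourceArn=arn, Tags=tags)
    current = sagemaker_client.list_tags(ResourceArn=arn).get("Tags", [])
    applied = {tag["Key"]: tag["Value"] for tag in current}
    return all(applied.get(tag["Key"]) == tag["Value"] for tag in tags)


# Usage (needs AWS credentials; all values are placeholders):
#   import boto3
#   arn = build_tracking_server_arn("[YOUR-REGION]", "[YOUR-ACCOUNT-ID]", "[TRACKING-SERVER-NAME]")
#   tags = build_datazone_tags("[ENVIRONMENT-ID-#1]", "[PROJECT-ID-#2]", "[DOMAIN-ID-#3]")
#   ok = tag_tracking_server(boto3.client("sagemaker"), arn, tags)
```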

That’s it! Now you should be able to see your MLFlow Tracking Server in the Compute tab of your Project.

How to use MLFlow Tracking Server in SageMaker Unified Studio

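In your code, make sure you’ve added these lines. Passing the ARN to set_tracking_uri relies on the sagemaker-mlflow plugin being installed alongside mlflow (for example with pip install mlflow sagemaker-mlflow):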

import mlflow
from mlflow.models import infer_signature

# ARN copied from the Compute tab of your Project
mlflow_arn = "[REPLACE-WITH-THE-ARN-YOU-HAVE-JUST-COPIED]"
mlflow.set_tracking_uri(mlflow_arn)
mlflow.set_experiment("my-experiment-name")

with mlflow.start_run() as run:
    # [.... training code goes here; it is expected to define params, X and a fitted scikit-learn model named model ....]
    signature = infer_signature(X, model.predict(X))
    mlflow.log_params(params)
    mlflow.sklearn.log_model(model, artifact_path="sklearn-model", signature=signature)
    # Optional - use to register the model; the URI must end with the artifact_path used above
    model_uri = f"runs:/{run.info.run_id}/sklearn-model"
    mv = mlflow.register_model(model_uri, "[YOUR-MODEL-NAME]")
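
A common slip is to leave the placeholder in place, or to paste the server name instead of the full ARN. As a small optional guard, the value can be checked before it is handed to mlflow.set_tracking_uri. This is a sketch: looks_like_tracking_server_arn is a hypothetical helper, and the pattern is an assumption derived from the ARN layout used in the add-tags command, not an official validation rule.

```python
import re

# Assumed layout: arn:<partition>:sagemaker:<region>:<12-digit account>:mlflow-tracking-server/<name>
_TRACKING_SERVER_ARN = re.compile(
    r"arn:aws[a-z-]*:sagemaker:[a-z0-9-]+:\d{12}:mlflow-tracking-server/[A-Za-z0-9-]+"
)


def looks_like_tracking_server_arn(value):
    """Return True if `value` has the shape of an MLflow tracking server ARN."""
    return bool(_TRACKING_SERVER_ARN.fullmatch(value))


# Usage, before configuring MLflow:
#   if not looks_like_tracking_server_arn(mlflow_arn):
#       raise ValueError("mlflow_arn does not look like a tracking server ARN")
```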

Then, you can open MLFlow by clicking on Build in the top menu bar, then MLFlow, choosing your tracking server (if you have more than one), and finally selecting the experiment you’ve created. The MLFlow UI will open in another tab.

Accessing MLFlow in SageMaker Unified Studio

Happy coding! 🚀 If this content has been useful, please leave a clap 👏 or a comment 🗯. This will let us know that our work has been appreciated! 😄

Written by Davide Gallitelli

A young Data Engineer, tech passionate and proud geek.