Logging API
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.
Logging
This notebook showcases various ways to use the Azure Machine Learning service run logging APIs, and view the results in the Azure portal.
Table of Contents
- Introduction
- Setup
- Validate Azure ML SDK installation
- Initialize workspace
- Set experiment
- Logging
- Starting a run
- Viewing a run in the portal
- Viewing the experiment in the portal
- Logging metrics
- Logging string metrics
- Logging numeric metrics
- Logging vectors
- Logging tables
- Logging when additional Metric Names are required
- Uploading files
- Starting a run
- Analyzing results
- Tagging a run
- Next steps
Introduction
Logging metrics from runs in your experiments allows you to track results from one run to another, determine trends in your outputs, and understand how your inputs correspond to your model and script performance. Azure Machine Learning (Azure ML) allows you to track various types of metrics, including images and arbitrary files, in order to understand, analyze, and audit your experimental progress.
Typically you should log all parameters for your experiment and all numerical and string outputs of your experiment. This will allow you to analyze the performance of your experiments across multiple runs, correlate inputs to outputs, and filter runs based on interesting criteria.
The experiment's Run History page automatically creates a report that can be customized to show the KPIs, charts, and column sets that interest you.
(Screenshots: the Run Details page and the Run History report.)
Setup
If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the configuration notebook first, if you haven't already, to establish your connection to the Azure ML workspace. Also make sure you have tqdm and matplotlib installed in the current kernel.
(myenv) $ conda install -y tqdm matplotlib
Validate Azure ML SDK installation and get version number for debugging purposes
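A minimal sketch of this check, assuming the azureml-sdk package is installed in the current kernel:

```python
# Print the installed Azure ML SDK version; useful when filing issues or debugging.
import azureml.core

print("Azure ML SDK version:", azureml.core.VERSION)
```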
Initialize workspace
Initialize a workspace object from persisted configuration.
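A minimal sketch, assuming the configuration notebook has already written a config.json for your workspace:

```python
from azureml.core import Workspace

# Loads workspace details from the config.json created by the configuration notebook.
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, sep="\n")
```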
Set experiment
Create a new experiment (or get the one with the specified name). An experiment is a container for an arbitrary set of runs.
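A sketch of creating or fetching the experiment; the experiment name below is an example, and ws is the Workspace object from the previous step:

```python
from azureml.core import Experiment

# Creates the experiment if it does not exist, otherwise returns the existing one.
exp = Experiment(workspace=ws, name="logging-api-demo")
```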
Logging
In this section we will explore the various logging mechanisms.
Starting a run
A run is a singular experimental trial. In this notebook we will create a run directly on the experiment by calling run = exp.start_logging(). If you were experimenting by submitting a script file as an experiment using experiment.submit(), you would call run = Run.get_context() in your script to access the run context of your code. In either case, the logging methods on the returned run object work the same.
This cell also stores the run id for use later in this notebook. The run_id is not necessary for logging.
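A sketch of the cell described above, assuming exp is the Experiment created earlier:

```python
# Start an interactive run directly on the experiment.
run = exp.start_logging()

# Store the run id so the run can be fetched again later in this notebook.
run_id = run.id
print("Run id:", run_id)
```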
Viewing a run in the Portal
Once a run is started you can see the run in the portal by simply typing run. Clicking on the "Link to Portal" link will take you to the Run Details page that shows the metrics you have logged and other run properties. You can refresh this page after each logging statement to see the updated results.
Viewing an experiment in the portal
You can also view an experiment similarly by typing experiment. The portal link will take you to the experiment's Run History page, which shows all runs and allows you to analyze trends across multiple runs.
Logging metrics
Metrics are visible on the Run Details page in the Azure ML portal and can also be analyzed in experiment reports. The Run Details page is shown below and contains tabs for Details, Outputs, Logs, and Snapshot.
- The Details page displays attributes about the run, plus logged metrics and images. Metrics that are vectors appear as charts.
- The Outputs page contains any files, such as models, that your run uploaded into the "outputs" directory in storage. If you place files in the "outputs" directory locally, they are automatically uploaded on your behalf when the run is completed.
- The Logs page allows you to view any log files created by your run. Logging runs created in notebooks typically do not generate log files.
- The Snapshot page contains a snapshot of the directory specified in the start_logging statement, plus the notebook at the time of the start_logging call. This snapshot and notebook can be downloaded from the Run Details page to continue or reproduce an experiment.
Logging string metrics
The following cell logs a string metric. A string metric is simply a string value associated with a name. String metrics are useful for labeling runs and organizing your data. Typically you should log all string parameters as metrics for later analysis - even information such as paths can help you understand how individual experiments perform differently.
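A sketch of logging a string metric on the run started earlier; the metric name and value here are illustrative:

```python
# Associate a string value with a metric name on the current run.
run.log("Category", "Fruit")
```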
String metrics can be used in the following ways:
- Plotting in histograms
- Grouping numerical plots by indicator
- Filtering runs
String metrics appear in the Tracked Metrics section of the Run Details page and can be added as a column in Run History reports.
Logging numerical metrics
The following cell logs some numerical metrics. Numerical metrics can include metrics such as AUC or MSE. You should log any parameter or significant output measure in order to understand trends across multiple experiments. Numerical metrics appear in the Tracked Metrics section of the Run Details page, and can be used in charts or KPI's in experiment Run History reports.
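A sketch of the numeric logging described above; the metric names and values are examples only:

```python
# Log an input parameter and an output measure on the current run.
run.log("scale_factor", 2)   # an input parameter of the experiment
run.log("accuracy", 0.95)    # an output measure such as accuracy, AUC, or MSE
```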
Logging vectors
Vectors are good for recording information such as loss curves. You can log a vector by creating a list of numbers, calling log_list() and supplying a name and the list, or by repeatedly logging a value using the same name.
Vectors are presented in Run Details as a chart, and are directly comparable in experiment reports when placed in a chart.
Note: vectors logged into the run are expected to be relatively small. Logging very large vectors into Azure ML can result in reduced performance. If you need to store large amounts of data associated with the run, you can write the data to a file that will be uploaded.
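A sketch of both approaches on the current run; the metric names and values are illustrative:

```python
import math

# Log a whole vector at once with log_list()...
run.log_list("Fibonacci", [0, 1, 1, 2, 3, 5, 8, 13])

# ...or build a vector up by repeatedly logging values under the same name.
for i in range(10):
    run.log("Sine", math.sin(i / 10.0))
```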
Logging tables
Tables are good for recording related sets of information such as accuracy tables, confusion matrices, etc.
You can log a table in two ways:
- Create a dictionary of lists, where each list represents a column in the table, and call log_table()
- Repeatedly call log_row(), providing the same table name with a consistent set of named args as the column values

Tables are presented in Run Details as a chart using the first two columns of the table.
Note: tables logged into the run are expected to be relatively small. Logging very large tables into Azure ML can result in reduced performance. If you need to store large amounts of data associated with the run, you can write the data to a file that will be uploaded.
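A sketch of both table-logging approaches on the current run; the table names, columns, and values are illustrative:

```python
import random

# Option 1: log a whole table from a dictionary of columns.
run.log_table("Sines", {"angle": [0.0, 0.5, 1.0],
                        "sine":  [0.0, 0.479, 0.841]})

# Option 2: build a table row by row with a consistent set of named args.
for i in range(5):
    run.log_row("Samples", x=i, y=random.random())
```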
Logging images
You can directly log matplotlib plots and arbitrary images to your run record. This code logs a matplotlib pyplot object. Images show up in the run details page in the Azure ML Portal.
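A sketch of logging a matplotlib plot, assuming matplotlib and numpy are installed in the kernel:

```python
import numpy as np
import matplotlib.pyplot as plt

# Build a simple plot and attach it to the run record as an image.
angles = np.linspace(0, 2 * np.pi, 100)
plt.plot(angles, np.sin(angles))
run.log_image("Sine curve", plot=plt)
```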
Logging when additional Metric Names are required
Limits on logging are enforced internally to ensure a smooth experience, but they can sometimes be restrictive, particularly the limit on the number of metric names.
The earlier "Logging vectors" and "Logging tables" examples can be expanded to use up to 15 columns per table, effectively raising this limit; the information is still presented in Run Details as a chart and remains directly comparable in experiment reports.
Note: see the Azure Machine Learning limits documentation for more information on service limits.
Note: tables logged into the run are expected to be relatively small. Logging very large tables into Azure ML can result in reduced performance. If you need to store large amounts of data associated with the run, you can write the data to a file that will be uploaded.
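A sketch of packing several related series into one multi-column table, so they count as a single metric name; the table name, columns, and values are illustrative:

```python
# One table with up to 15 columns uses a single metric name,
# instead of one metric name per series.
losses = {"epoch":      [0, 1, 2, 3, 4],
          "train_loss": [0.9, 0.7, 0.5, 0.4, 0.35],
          "val_loss":   [1.0, 0.8, 0.6, 0.55, 0.5]}
run.log_table("Loss curves", losses)
```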
Uploading files
Files can also be uploaded explicitly and stored as artifacts along with the run record. These files are also visible in the Outputs tab of the Run Details page.
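A sketch of an explicit upload; the file name and contents are examples only:

```python
# Write an arbitrary file locally, then attach it to the run as an artifact.
with open("outputs_sample.txt", "w") as f:
    f.write("Some output produced by the run.")

run.upload_file(name="outputs/outputs_sample.txt",
                path_or_stream="outputs_sample.txt")
```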
Completing the run
Calling run.complete() marks the run as completed and triggers the output file collection. If for any reason you need to indicate the run failed or simply need to cancel the run you can call run.fail() or run.cancel().
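A sketch of ending the run normally:

```python
# Mark the run as completed; this also triggers output file collection.
# Use run.fail() or run.cancel() instead if the run did not succeed.
run.complete()
```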
Analyzing results
You can refresh the run in the Azure portal to see all of your results. In many cases you will want to analyze runs that were performed previously to inspect the contents or compare results. Runs can be fetched from their parent Experiment object using the Run() constructor or the experiment.get_runs() method.
Call run.get_metrics() to retrieve all the metrics from a run.
Call run.get_metrics(name = <metric name>) to retrieve a metric value by name. Retrieving a single metric can be faster, especially if the run contains many metrics.
See the files uploaded for this run by calling run.get_file_names()
Once you know the file names in a run, you can download the files using the run.download_file() method
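A sketch of the retrieval steps above, assuming exp and run_id from earlier in the notebook; the metric name and file name below are examples:

```python
from azureml.core import Run

# Fetch a previously completed run from its parent experiment by id.
fetched_run = Run(exp, run_id)

# Retrieve all metrics, or a single metric by name (often faster).
all_metrics = fetched_run.get_metrics()
one_metric = fetched_run.get_metrics(name="accuracy")

# List the files uploaded with the run, then download one of them.
print(fetched_run.get_file_names())
fetched_run.download_file("outputs/outputs_sample.txt",
                          output_file_path="downloaded_sample.txt")
```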
Tagging a run
Often when you analyze the results of a run, you may need to tag that run with important personal or external information. You can add a tag to a run using the run.tag() method. AzureML supports valueless and valued tags.
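A sketch of both tag styles; the tag names and values are illustrative:

```python
# A valueless tag acts as a simple flag on the run.
run.tag("Reviewed")

# A valued tag carries extra information.
run.tag("quality", "good run")
```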
Next steps
To experiment more with logging and to understand how metrics can be visualized, go back to the Starting a run section, try changing the category and scale_factor values, and go through the notebook several times. Play with the KPI, charting, and column selection options on the experiment's Run History page to see how the various metrics can be combined and visualized.
After learning about all of the logging options, go to the train on remote vm notebook and experiment with logging from remote compute contexts.