Ensembles In The W&B Model Registry

Storing Ensemble Models with W&B Model Registry

This colab assumes you are familiar with Artifacts, Collections, and the Model Registry. See https://docs.wandb.ai/guides/models for details on these topics.

ensemble - noun: "a group of items viewed as a whole rather than individually."

Ensemble modeling is the class of solutions in which multiple sub-models together make up a composite prediction "model". It is helpful to think of an ensemble model as a DAG (Directed Acyclic Graph) of models. A common pattern is to have individual engineers or teams work on different pieces of the composite, especially if each piece is complex. For example:

[Image: dataset_card_overview]
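
To make the DAG view concrete, the sketch below writes an ensemble's dependencies as a plain Python mapping and asks the standard library for a valid training order. The model names are the registered models created in Step 0; the edges between them are an assumption made for illustration, since the text does not spell out which piece feeds which.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency structure: each key depends on the models in its set.
# These edges are illustrative assumptions, not taken from the colab.
ENSEMBLE_DAG = {
    "ensemble_m1": set(),
    "ensemble_m2": {"ensemble_m1"},
    "ensemble_m3": {"ensemble_m1"},
    "ensemble_all": {"ensemble_m1", "ensemble_m2", "ensemble_m3"},
}


def training_order(dag):
    """Return model names so that every model comes after its dependencies."""
    return list(TopologicalSorter(dag).static_order())


order = training_order(ENSEMBLE_DAG)
print(order)
```

Under these assumed edges, ensemble_m1 always comes first and ensemble_all always comes last; ensemble_m2 and ensemble_m3 have no edge between them, which is what lets separate teams work on them independently.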

There are many considerations and challenges when training such models:

  • Tracking and versioning training data for child models generated by the parent models
  • Evaluating & optimizing sub-components in isolation
  • Evaluating & fine-tuning the composite ensemble
  • Facilitating independent workflows for each piece

In the interest of keeping the code simple and runtimes quick, we will use dummy data, models, and training functions so that the reader can focus on the mechanics / workflow.

Step 0: Create Registered Models

Before we train our models, we will create a Registered Model for each of our sub-models, as well as one for the entire ensemble.

  • Navigate to the model registry and click "Create Registered Model".
  • Set the entity to the team you want this registered model to be visible to and the project to "model-registry". Do this for:
    • ensemble_m1
    • ensemble_m2
    • ensemble_m3
    • ensemble_all

See the screenshots below:

[Image: dataset_card_overview]
[Image: dataset_card_overview]
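
Once the registered models exist, code refers to each one by a path. The helper below is a minimal sketch of how those paths could be built. It assumes the "entity/project/name" layout that run.link_artifact takes as its target path; the entity "my-team" and the helper name registry_path are placeholders, not names from this colab.

```python
# The four registered models created in Step 0.
REGISTERED_MODELS = ["ensemble_m1", "ensemble_m2", "ensemble_m3", "ensemble_all"]


def registry_path(entity, name, project="model-registry"):
    """Build the target path for linking an artifact to a registered model."""
    if name not in REGISTERED_MODELS:
        raise ValueError(f"unknown registered model: {name}")
    return f"{entity}/{project}/{name}"


# "my-team" is a placeholder; use the entity chosen in the registry UI.
paths = {name: registry_path("my-team", name) for name in REGISTERED_MODELS}
print(paths["ensemble_all"])
```

Keeping the names in one list means a typo in a registered model name fails loudly before any run is started.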

Step 2: Train each piece of the ensemble and link them to the Registry

  • Each piece can be trained independently. The benefit of the registry is that it tracks the state of these independent workstreams asynchronously.

Generate Data

[ ]
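
The contents of this cell are not shown here, so the following is only a sketch of what a data-generation step could look like: it creates dummy rows, writes them to JSON, and logs the file as a dataset artifact. The function names, the project name passed in, the artifact name "ensemble_data", and the file name data.json are all assumptions.

```python
import json
import random


def generate_dummy_data(n_rows=100, seed=0):
    """Create reproducible dummy rows with one feature `x` and a binary label `y`."""
    rng = random.Random(seed)
    return [{"x": rng.random(), "y": rng.randint(0, 1)} for _ in range(n_rows)]


def save_json(obj, path):
    """Write `obj` to `path` as JSON and return the path."""
    with open(path, "w") as f:
        json.dump(obj, f)
    return path


def log_dataset(project, data, artifact_name="ensemble_data"):
    """Log the dummy data as a W&B dataset artifact (assumed names throughout)."""
    import wandb  # imported lazily so the helpers above run without wandb

    with wandb.init(project=project, job_type="generate-data") as run:
        art = wandb.Artifact(artifact_name, type="dataset")
        art.add_file(save_json(data, "data.json"))
        run.log_artifact(art)


data = generate_dummy_data(n_rows=5)
```

Calling log_dataset with a project of your choice and the generated rows would create the first node of the DAG.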

Train M1

[ ]
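
The training cell itself is not shown here. As a hedged sketch, training and linking one piece could look like the code below: the run consumes the dataset artifact, "trains" a dummy model stored as JSON, logs it, and links it to its registered model. The dummy training rule, the file names data.json and model.json, and the artifact naming pattern "name-RUN_ID" are assumptions; the naming pattern mirrors the one used under "Accessing Ensemble Pieces".

```python
import json


def train_dummy_model(rows, name):
    """Stand-in for training: the 'model' is just the mean label of the rows.

    Each row is expected to be a dict with a numeric "y" entry.
    """
    mean_y = sum(row["y"] for row in rows) / len(rows)
    return {"name": name, "bias": mean_y}


def train_and_link(project, entity, data_artifact, model_name):
    """Train one piece and link it to its registered model (assumed names)."""
    import wandb  # imported lazily so train_dummy_model runs without wandb

    with wandb.init(project=project, job_type="train") as run:
        # Declaring the dataset as an input is what adds the edge to the DAG.
        data_art = run.use_artifact(data_artifact)
        with open(data_art.get_path("data.json").download(), "r") as f:
            rows = json.load(f)

        with open("model.json", "w") as f:
            json.dump(train_dummy_model(rows, model_name), f)

        model_art = wandb.Artifact(f"{model_name}-{run.id}", type="model")
        model_art.add_file("model.json")
        run.log_artifact(model_art)
        run.link_artifact(model_art, f"{entity}/model-registry/{model_name}")


model = train_dummy_model([{"x": 0.1, "y": 1}, {"x": 0.9, "y": 0}], "ensemble_m1")
```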

After the above two cells, W&B will have artifacts for both a dataset and a model, and the implicit DAG will begin to grow:

[Image: Screen Shot 2022-07-29 at 10.11.13 AM.png]

Train M2

[ ]

Train M3

[ ]
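
One of the challenges listed earlier is versioning training data that a parent model generates for a child model. The sketch below shows one way that could look for M2 or M3: the run declares both the raw data and the parent model as inputs, then logs the derived rows as a new dataset artifact so that the lineage is recorded. Whether M2 and M3 actually consume M1's output in this colab is not stated, so the dependency, the artifact names, and the file names (data.json, model.json, child_data.json) are assumptions.

```python
import json


def derive_child_rows(parent_model, rows):
    """Attach the parent's (dummy) prediction to every row as a new feature."""
    return [{**row, "parent_pred": parent_model["bias"]} for row in rows]


def load_json(entry):
    """Download an artifact entry and parse it as JSON."""
    with open(entry.download(), "r") as f:
        return json.load(f)


def log_child_dataset(run, data_name, parent_name, child_name):
    """Log a dataset for a child model, derived from a parent model's output."""
    import wandb  # imported lazily so the pure helpers run without wandb

    # Both inputs are declared with use_artifact, so both appear upstream
    # of the child dataset in the artifact graph.
    rows = load_json(run.use_artifact(data_name).get_path("data.json"))
    parent = load_json(run.use_artifact(parent_name).get_path("model.json"))

    with open("child_data.json", "w") as f:
        json.dump(derive_child_rows(parent, rows), f)

    child_art = wandb.Artifact(child_name, type="dataset")
    child_art.add_file("child_data.json")
    run.log_artifact(child_art)
    return child_art


child_rows = derive_child_rows({"name": "ensemble_m1", "bias": 0.5}, [{"x": 1, "y": 0}])
```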

Step 3: Assemble the Ensemble and link to Registry

Assemble the Ensemble

You can create composite artifacts by adding references to other W&B artifacts into a single artifact. This allows you to create complex dependency structures among model collections.

ref = art.get_path(file), called on an artifact instance, retrieves an internal reference to a file in that W&B artifact. We can pass this reference around to other collections that might use this model in some way.

In this example, load_obj_from_artifact gives us the model and its W&B reference, which we then add to the composite ensemble artifact.

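import json

def load_obj_from_artifact(run, name, file):
  # Declare the artifact as an input to this run
  art = run.use_artifact(name)
  # Reference to `file` inside the artifact
  ref = art.get_path(file)
  # Download the file and parse it as JSON
  with open(ref.download(), 'r') as f:
    obj = json.load(f)
  return obj, ref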
[ ]
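
The assembly cell is not shown here, so the following is a sketch of how the references could be combined into one composite artifact. It relies on add_reference accepting the entry returned by get_path; the helper names, the default file name model.json, and the artifact name "ensemble_all-RUN_ID" are assumptions.

```python
def entry_name(piece_name):
    """Turn 'team/project/ensemble_m1:latest' into 'ensemble_m1.json'."""
    collection = piece_name.split(":")[0].split("/")[-1]
    return f"{collection}.json"


def collect_references(run, piece_names, file_name="model.json"):
    """Map an entry name to the W&B reference for each piece's model file."""
    refs = {}
    for piece in piece_names:
        art = run.use_artifact(piece)  # records the piece as an input of this run
        refs[entry_name(piece)] = art.get_path(file_name)
    return refs


def assemble_ensemble(run, piece_names, registry_target):
    """Build a composite artifact made only of references, then log and link it."""
    import wandb  # imported lazily so the helpers above run without wandb

    ensemble = wandb.Artifact(f"ensemble_all-{run.id}", type="model")
    for name, ref in collect_references(run, piece_names).items():
        # A reference points at the sub-model's file instead of copying it.
        ensemble.add_reference(ref, name=name)
    run.log_artifact(ensemble)
    run.link_artifact(ensemble, registry_target)
    return ensemble
```

Because the composite holds references rather than copies, each entry keeps pointing at the exact sub-model version it was built from.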

Accessing Ensemble Pieces

A sub-model version logged by a specific run can be accessed with:

  • run.use_artifact("sub_model_collection_name-RUN_ID:ALIAS")

A sub-model version in a collection can be accessed with:

  • run.use_artifact("sub_model_collection_name:ALIAS")

A composite ensemble version can be accessed with:

  • run.use_artifact("ensemble_name:ALIAS")
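
The three naming patterns above differ only in an optional run ID suffix, so a small helper can build all of them. This is a sketch: artifact_name and fetch_piece are hypothetical helper names, and "latest" as the default alias is an assumption. The lookup uses run.use_artifact, the same call that load_obj_from_artifact uses.

```python
def artifact_name(collection, alias="latest", run_id=None):
    """Build 'collection:alias', or 'collection-RUN_ID:alias' for one run's model."""
    base = f"{collection}-{run_id}" if run_id else collection
    return f"{base}:{alias}"


def fetch_piece(run, collection, alias="latest", run_id=None):
    """Look up a sub-model or composite ensemble and mark it as a run input."""
    return run.use_artifact(artifact_name(collection, alias, run_id))


# RUN_ID and alias values here are placeholders.
print(artifact_name("ensemble_m1"))
print(artifact_name("ensemble_m1", alias="v0", run_id="RUN_ID"))
```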

Finally, W&B reflects the underlying structure of your models:

[Image: Screen Shot 2022-07-29 at 11.15.31 AM.png]