Ensembles In The W&B Model Registry
Storing Ensemble Models with W&B Model Registry
This colab assumes you are familiar with Artifacts, Collections, and the Model Registry. See https://docs.wandb.ai/guides/models for details on these topics.
ensemble - noun: "a group of items viewed as a whole rather than individually."
Ensemble modeling is a class of solutions in which multiple sub-models together form a composite prediction "model". It is helpful to think of an ensemble model as a DAG (Directed Acyclic Graph) of models. A common pattern is to have individual engineers or teams work on different pieces of the composite, especially if each piece is complex. For example,
Now, there are many considerations and challenges in training such models:
- Tracking and versioning training data for child models generated by the parent models
- Evaluating & optimizing sub-components in isolation
- Evaluating & fine-tuning the composite ensemble
- Facilitating independent workflows for each piece
In the interest of keeping the code simple and runtimes quick, we will use dummy data, models, and training functions so that the reader can focus on the mechanics / workflow.
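To make the DAG view of an ensemble concrete, here is a minimal sketch that encodes sub-model dependencies as a graph and derives an order in which the pieces could be trained. The node names and edges are hypothetical (the real topology depends on your ensemble), and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical topology: each key depends on the models listed in its value.
# Here m1 produces training data for m2 and m3, and the ensemble needs all three.
ENSEMBLE_DAG = {
    "m1": set(),
    "m2": {"m1"},
    "m3": {"m1"},
    "ensemble": {"m1", "m2", "m3"},
}


def training_order(dag):
    """Return model names ordered so that every parent precedes its children."""
    return list(TopologicalSorter(dag).static_order())


order = training_order(ENSEMBLE_DAG)
print(order)
```

Pieces that do not depend on each other (m2 and m3 in this sketch) can be trained in either order, or in parallel by different teams.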
Step 0: Create Registered Models
Before we train our models, we need to create a Registered Model for each sub-model, as well as one for the entire ensemble.
- Navigate to the model registry and click "Create Registered Model".
- Set the entity to the team you want this registered model to be visible to and the project to "model-registry". Do this for:
- ensemble_m1
- ensemble_m2
- ensemble_m3
- ensemble_all
See screenshots below:
Step 2: Train each piece of the ensemble and link them to the Registry
- Each piece can be trained independently. A benefit of the registry is that it tracks the state of independent workstreams asynchronously.
Generate Data
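As a minimal sketch of this step, the code below builds a small synthetic dataset and logs it to W&B as a dataset artifact. The function names, the artifact name "ensemble_data", the file name, and the shape of the data are all assumptions made for illustration, not the notebook's original cell.

```python
import json
import random


def make_dummy_data(n_rows=100, seed=0):
    """Create synthetic rows where y is a noisy linear function of x."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x = rng.uniform(-1.0, 1.0)
        rows.append({"x": x, "y": 2.0 * x + rng.gauss(0.0, 0.1)})
    return rows


def log_dataset(run, rows, name="ensemble_data", path="data.json"):
    """Write rows to a JSON file and log it as a W&B dataset artifact."""
    import wandb  # imported lazily so make_dummy_data works without wandb

    with open(path, "w") as f:
        json.dump(rows, f)
    artifact = wandb.Artifact(name, type="dataset")
    artifact.add_file(path)
    return run.log_artifact(artifact)
```

A typical call would be log_dataset(run, make_dummy_data()) inside a run started with wandb.init(...), followed by run.finish().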
Train M1
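The sketch below shows one way this step could look: a toy "training" function, followed by logging the result as a model artifact and linking it to a Registered Model. The helper names, the JSON model format, and the file name are assumptions; the "model-registry" project comes from Step 0.

```python
import json


def train_dummy_model(rows):
    """'Train' by fitting y = w * x with least squares through the origin."""
    sxx = sum(r["x"] * r["x"] for r in rows)
    sxy = sum(r["x"] * r["y"] for r in rows)
    return {"w": sxy / sxx if sxx else 0.0}


def log_and_link_model(run, model, name, registered_model, path="model.json"):
    """Save the model as JSON, log it as an artifact, and link it to the registry."""
    import wandb  # imported lazily so train_dummy_model works without wandb

    with open(path, "w") as f:
        json.dump(model, f)
    artifact = wandb.Artifact(name, type="model")
    artifact.add_file(path)
    run.log_artifact(artifact)
    # Assumes the Registered Model lives in the "model-registry" project.
    run.link_artifact(artifact, f"model-registry/{registered_model}")
    return artifact
```

For M1 this might be called as log_and_link_model(run, train_dummy_model(rows), "m1", "ensemble_m1"). If the run first calls run.use_artifact on the dataset it trains on, W&B records the dataset as an input of the run and the model as an output.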
After the above two cells, W&B will have artifacts for both a dataset and a model, and the implicit DAG will begin to grow:
Train M2
Train M3
Step 3: Assemble the Ensemble and link to Registry
Assemble the Ensemble
You can create composite artifacts by adding references to other W&B artifacts into a single artifact. This allows you to create complex dependency structures among model collections.
ref = wandb.Artifact.get_path(file) retrieves an internal reference URI for a file in the W&B artifact. We can pass this reference to other collections that use the model in some way.
In this example, load_obj_from_artifact returns the model and its W&B reference URI, which we then add to the composite ensemble artifact.
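import json

def load_obj_from_artifact(run, name, file):
    # Declare the artifact as an input of this run and look up the file's entry
    art = run.use_artifact(name)
    ref = art.get_path(file)
    # Download the file, then parse it as JSON
    with open(ref.download(), 'r') as f:
        obj = json.load(f)
    return obj, ref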
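To show how the references could be combined, here is a minimal sketch that builds a composite artifact pointing at each sub-model's file and links it to the ensemble's Registered Model. The function name, the labels, the "model.json" file name, and the registry paths are assumptions. It relies on ref_url() being available on the entry returned by get_path, which should be checked against your wandb version.

```python
def build_ensemble_artifact(run, sub_models, name="ensemble", file="model.json"):
    """Create a composite artifact that references each sub-model's file.

    sub_models maps a short label (for example "m1") to an artifact path such
    as "model-registry/ensemble_m1:latest". These paths are assumptions.
    """
    import wandb  # imported lazily; only needed when the function is called

    ensemble = wandb.Artifact(name, type="model")
    for label, artifact_name in sub_models.items():
        entry = run.use_artifact(artifact_name).get_path(file)
        # Store a pointer to the sub-model's file rather than a copy of it.
        ensemble.add_reference(entry.ref_url(), name=f"{label}/{file}")
    run.log_artifact(ensemble)
    run.link_artifact(ensemble, "model-registry/ensemble_all")
    return ensemble
```

Because the composite holds references rather than copies, each sub-model stays versioned in its own collection, and the ensemble version records exactly which sub-model versions it was built from.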
Accessing Ensemble Pieces
A sub-model version produced by a specific run can be accessed with:
run.use_artifact("sub_model_collection_name-RUN_ID:ALIAS")
A specific sub-model version can be accessed with:
run.use_artifact("sub_model_collection_name:ALIAS")
A specific composite ensemble version can be accessed with:
run.use_artifact("ensemble_name:ALIAS")
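As a usage sketch, the helper below fetches a composite ensemble and loads every JSON sub-model found in the download directory. The function name and the default artifact path are hypothetical, and it assumes the sub-models are stored as JSON files, as in load_obj_from_artifact. It also assumes that downloading the composite resolves its references to the other W&B artifacts.

```python
import json
import os


def load_ensemble(run, ensemble_name="model-registry/ensemble_all:latest"):
    """Download a composite artifact and load each JSON sub-model it contains."""
    artifact = run.use_artifact(ensemble_name)
    # Assumption: referenced files from other W&B artifacts are fetched too.
    root = artifact.download()
    models = {}
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if fname.endswith(".json"):
                full_path = os.path.join(dirpath, fname)
                # Key each model by its path relative to the download root.
                rel_path = os.path.relpath(full_path, root)
                with open(full_path, "r") as f:
                    models[rel_path] = json.load(f)
    return models
```

The returned dictionary maps each file's relative path inside the artifact to the loaded object, so the pieces can be wired together into a composite prediction function.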
Finally, W&B reflects the underlying structure of your models: