Vertex AI Vision
This notebook shows how to deploy a vision model from 🤗 Transformers (written in TensorFlow) to Vertex AI. This is beneficial in many ways:
- Vertex AI provides support for autoscaling, authorization, and authentication out of the box.
- One can maintain multiple versions of a model and easily control the traffic split among them.
- The deployment is purely serverless.
This notebook uses code from this official GCP example.
This tutorial uses the following billable components of Google Cloud:
- Vertex AI
- Cloud Storage
Learn about Vertex AI pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
Initial setup
First, authenticate yourself so that Colab can access your GCP resources.
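The cell below is a minimal sketch of what this step might look like; the bucket name and region are assumptions inferred from the outputs further down.

```python
from google.colab import auth

# Authenticate the Colab runtime against GCP.
auth.authenticate_user()

# Assumed bucket name and region, inferred from the logs in this notebook.
GCS_BUCKET = "gs://hf-tf-vision"
REGION = "us-central1"

# Create the bucket that will hold the SavedModel artifacts.
!gsutil mb -l {REGION} {GCS_BUCKET}
```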
Creating gs://hf-tf-vision/...
Initial imports
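A plausible set of imports for this notebook is sketched below; the version string in the output corresponds to TensorFlow 2.9.0-rc2 and Transformers 4.20.1.

```python
import tensorflow as tf
import transformers

# Print the library versions (compare with the output below).
print(tf.__version__, transformers.__version__)
```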
2022-07-17 05:01:47.593465: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2.9.0-rc2 4.20.1
Save the model locally
We will work with a Vision Transformer B-16 model provided by 🤗 Transformers. We will first initialize it, load the model weights, and then save it locally as a SavedModel resource.
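A sketch of how this might be done with the `transformers` API; `save_pretrained(..., saved_model=True)` writes the SavedModel under `vit/saved_model/1`, matching the logs below.

```python
from transformers import TFViTForImageClassification

# Load the ViT B/16 checkpoint fine-tuned on ImageNet-1k.
model = TFViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

# Serialize as a SavedModel resource (written to vit/saved_model/1).
model.save_pretrained("vit", saved_model=True)
```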
All model checkpoint layers were used when initializing TFViTForImageClassification.
All the layers of TFViTForImageClassification were initialized from the model checkpoint at google/vit-base-patch16-224. If your task is similar to the task the model of the checkpoint was trained on, you can already use TFViTForImageClassification for predictions without further training.
WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, encoder_layer_call_fn, encoder_layer_call_and_return_conditional_losses, layernorm_layer_call_fn while saving (showing 5 of 421). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: vit/saved_model/1/assets
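The exported signatures can be inspected with the `saved_model_cli` utility; the command below is a sketch of what likely produced the output that follows.

```python
!saved_model_cli show --dir vit/saved_model/1 --all
```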
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['__saved_model_init_op']:
The given SavedModel SignatureDef contains the following input(s):
The given SavedModel SignatureDef contains the following output(s):
outputs['__saved_model_init_op'] tensor_info:
dtype: DT_INVALID
shape: unknown_rank
name: NoOp
Method name is:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['pixel_values'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, -1, -1)
name: serving_default_pixel_values:0
The given SavedModel SignatureDef contains the following output(s):
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1000)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
Concrete Functions:
Function Name: '__call__'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: bool
Value: True
Option #2
Callable with:
Argument #1
DType: dict
Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: bool
Value: False
Function Name: '_default_save_signature'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values')}
Function Name: 'call_and_return_all_conditional_losses'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: bool
Value: False
Option #2
Callable with:
Argument #1
DType: dict
Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: bool
Value: True
Function Name: 'serving'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'pixel_values': TensorSpec(shape=(None, None, None, None), dtype=tf.float32, name='pixel_values')}
Embedding pre- and post-processing ops inside the model
ML models usually require some pre-processing of the input data and post-processing of the predictions, so it's a good idea to ship a model with these operations built in. Doing so also helps reduce training/serving skew.
For our model we need:
- Data normalization, resizing, and transposition as the pre-processing ops.
- Mapping the predicted logits to ImageNet-1k classes as the post-processing ops (a sketch follows this list).
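Below is a minimal sketch of such a serving function, assuming the `ViTFeatureExtractor` configuration printed afterwards (mean/std of 0.5 and a target size of 224); the actual notebook cell may differ, and the next subsection shows a string-input variant that the deployed endpoint exposes.

```python
import tensorflow as tf
from transformers import TFViTForImageClassification, ViTFeatureExtractor

ckpt = "google/vit-base-patch16-224"
feature_extractor = ViTFeatureExtractor.from_pretrained(ckpt)
model = TFViTForImageClassification.from_pretrained(ckpt)
size = feature_extractor.size  # 224


def preprocess(image):
    # Resize, scale to [0, 1], normalize, and move channels first,
    # mirroring what ViTFeatureExtractor does in NumPy.
    image = tf.image.resize(image, (size, size)) / 255.0
    image = (image - feature_extractor.image_mean) / feature_extractor.image_std
    return tf.transpose(image, (2, 0, 1))


@tf.function(input_signature=[tf.TensorSpec([None, None, None, 3], tf.uint8)])
def serving_fn(images):
    images = tf.map_fn(
        preprocess, tf.cast(images, tf.float32), fn_output_signature=tf.float32
    )
    logits = model(images, training=False).logits
    # Post-processing: map logits to human-readable ImageNet-1k labels.
    labels = tf.constant(
        [model.config.id2label[i] for i in range(model.config.num_labels)]
    )
    return {
        "label": tf.gather(labels, tf.argmax(logits, axis=-1)),
        "confidence": tf.reduce_max(tf.nn.softmax(logits, axis=-1), axis=-1),
    }


tf.saved_model.save(
    model, "gs://hf-tf-vision/vit", signatures={"serving_default": serving_fn}
)
```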
ViTFeatureExtractor {
  "do_normalize": true,
  "do_resize": true,
  "feature_extractor_type": "ViTFeatureExtractor",
  "image_mean": [0.5, 0.5, 0.5],
  "image_std": [0.5, 0.5, 0.5],
  "resample": 2,
  "size": 224
}
WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py:458: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with back_prop=False is deprecated and will be removed in a future version. Instructions for updating: back_prop=False is deprecated. Consider using tf.stop_gradient instead. Instead of: results = tf.map_fn(fn, elems, back_prop=False) Use: results = tf.nest.map_structure(tf.stop_gradient, tf.map_fn(fn, elems))
WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py:629: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version. Instructions for updating: Use fn_output_signature instead
WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, encoder_layer_call_fn, encoder_layer_call_and_return_conditional_losses, layernorm_layer_call_fn while saving (showing 5 of 421). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: gs://hf-tf-vision/vit/assets
Notes on making the model accept string inputs:
When sending images via REST or gRPC requests, the size of the payload grows quickly with image resolution. It is therefore good practice to compress the images reliably (e.g., as JPEG bytes) and build the request payload from the compressed representation.
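A hedged sketch of a serving function that accepts compressed image bytes (the deployed model's signature, as the `string_input` name in the prediction output below suggests); JPEG encoding is assumed here, and this decoding step replaces the raw-tensor input of the previous sketch. When the REST payload wraps the bytes as `{"b64": ...}`, the serving layer base64-decodes them before the string tensor is populated.

```python
import tensorflow as tf
from transformers import TFViTForImageClassification

model = TFViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
size, mean, std = 224, 0.5, 0.5  # from the ViTFeatureExtractor config above


def decode_and_preprocess(image_bytes):
    # Decode compressed JPEG bytes, then resize, normalize, and
    # transpose to channels-first.
    image = tf.io.decode_jpeg(image_bytes, channels=3)
    image = tf.image.resize(tf.cast(image, tf.float32), (size, size)) / 255.0
    return tf.transpose((image - mean) / std, (2, 0, 1))


@tf.function(input_signature=[tf.TensorSpec([None], tf.string, name="string_input")])
def serving_fn(string_input):
    images = tf.map_fn(
        decode_and_preprocess, string_input, fn_output_signature=tf.float32
    )
    logits = model(images, training=False).logits
    labels = tf.constant(
        [model.config.id2label[i] for i in range(model.config.num_labels)]
    )
    return {
        "label": tf.gather(labels, tf.argmax(logits, axis=-1)),
        "confidence": tf.reduce_max(tf.nn.softmax(logits, axis=-1), axis=-1),
    }
```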
Deployment on Vertex AI
This resource covers some relevant Vertex AI concepts.
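A sketch of the upload-and-deploy flow with the `google-cloud-aiplatform` SDK; the project ID is a placeholder, and the container image tag and machine configuration are assumptions (the display name matches the prediction output below).

```python
from google.cloud import aiplatform

# "your-gcp-project" is a placeholder; the region matches the resource
# names printed below.
aiplatform.init(project="your-gcp-project", location="us-central1")

# Upload the SavedModel from GCS with a TF2 GPU prebuilt serving container.
model = aiplatform.Model.upload(
    display_name="ViT Base TF2.8 GPU model",
    artifact_uri="gs://hf-tf-vision/vit",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/tf2-gpu.2-8:latest"
    ),
)

# Create an endpoint and deploy the model to it.
endpoint = aiplatform.Endpoint.create(display_name="ViT endpoint")
model.deploy(
    endpoint=endpoint,
    machine_type="n1-standard-8",        # assumed machine type
    accelerator_type="NVIDIA_TESLA_T4",  # assumed accelerator
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=1,
)
```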
'projects/29880397572/locations/us-central1/models/7235960789184544768'
'projects/29880397572/locations/us-central1/endpoints/7116769330687115264'
deployed_model {
  id: "5163311002082607104"
}
Make a prediction request
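A sketch of a request against the endpoint; the image path is hypothetical, and `endpoint` refers to the object created above. Note that the instance key must match the serving signature's input name (`string_input`, as echoed in the output below).

```python
import base64

import tensorflow as tf

# Read a local image (hypothetical path) and base64-encode its bytes.
image_bytes = tf.io.read_file("image.jpeg")
b64_str = base64.b64encode(image_bytes.numpy()).decode("utf-8")

# Vertex AI unwraps the {"b64": ...} envelope into the string tensor.
prediction = endpoint.predict(instances=[{"string_input": {"b64": b64_str}}])
print(prediction)
```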
Serving function input: string_input
predictions {
struct_value {
fields {
key: "confidence"
value {
number_value: 0.896659553
}
}
fields {
key: "label"
value {
string_value: "Egyptian cat"
}
}
}
}
deployed_model_id: "5163311002082607104"
model: "projects/29880397572/locations/us-central1/models/7235960789184544768"
model_display_name: "ViT Base TF2.8 GPU model"
Cleaning up resources
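A sketch of the teardown, assuming `endpoint` and `model` are the objects created above; the final command removes the GCS bucket as well.

```python
# Undeploy the model and delete the Vertex AI resources.
endpoint.undeploy_all()
endpoint.delete()
model.delete()

# Remove the SavedModel artifacts and the bucket itself.
!gsutil rm -r gs://hf-tf-vision
```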
running undeploy_model operation: projects/29880397572/locations/us-central1/endpoints/7116769330687115264/operations/6837774371172384768
running delete_endpoint operation: projects/29880397572/locations/us-central1/operations/7182299742666227712
running delete_model operation: projects/29880397572/locations/us-central1/operations/1269073431928766464
Removing gs://hf-tf-vision/vit/#1658034189039614...
Removing gs://hf-tf-vision/vit/assets/#1658034196731689...
Removing gs://hf-tf-vision/vit/saved_model.pb#1658034197598270...
Removing gs://hf-tf-vision/vit/variables/#1658034189325867...
/ [4 objects]
==> NOTE: You are performing a sequence of gsutil operations that may run significantly faster if you instead use gsutil -m rm ... Please see the -m section under "gsutil help options" for further information about when gsutil -m can be advantageous.
Removing gs://hf-tf-vision/vit/variables/variables.data-00000-of-00001#1658034195624888...
Removing gs://hf-tf-vision/vit/variables/variables.index#1658034195904828...
/ [6 objects]
Operation completed over 6 objects.
Removing gs://hf-tf-vision/...