Hugging Face Multimodal Inference (Visual Question Answering) with vilt-b32-finetuned-vqa



Overview

This notebook demonstrates how to deploy the Hugging Face multimodal model vilt-b32-finetuned-vqa on Amazon SageMaker and run visual question answering inference against it.

Visual Question Answering (VQA) is a task where a model answers questions about an image. The input consists of an image and a textual question about the image. The output is the model's answer to the question, bridging the gap between computer vision and natural language understanding.

vilt-b32-finetuned-vqa is a Vision-and-Language Transformer (ViLT) model fine-tuned on the VQAv2 dataset. See the model card on the Hugging Face Hub for more information.

Setup

Install or update the SageMaker Python SDK

First, we need to make sure the latest version of the SageMaker Python SDK is installed.

[ ]
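A minimal sketch of this step, assuming the notebook kernel's own interpreter should receive the upgrade. The helper names `pip_upgrade_command` and `upgrade` are hypothetical; in a notebook cell the same effect is usually achieved with a `pip install --upgrade sagemaker` line.

```python
import subprocess
import sys


def pip_upgrade_command(package="sagemaker"):
    """Build the pip command that upgrades `package` for the running interpreter."""
    # Using sys.executable targets the kernel's environment rather than
    # whichever `pip` happens to be first on PATH.
    return [sys.executable, "-m", "pip", "install", "--upgrade", "--quiet", package]


def upgrade(package="sagemaker"):
    """Run the upgrade; raises subprocess.CalledProcessError if pip fails."""
    subprocess.check_call(pip_upgrade_command(package))


# Call upgrade() once, then restart the kernel if the SDK was already imported.
```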

Set up Python modules and roles

Then we import the SageMaker Python SDK and instantiate a sagemaker_session, which we use to determine the current region and execution role.

[ ]
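A sketch of what this setup might look like, assuming the code runs somewhere an execution role can be resolved (for example a SageMaker notebook instance or Studio). The function name `get_session_context` is hypothetical; the SDK import is deferred into the function so that defining it does not require the SDK.

```python
def get_session_context():
    """Return (session, region, role) for the current SageMaker environment."""
    import sagemaker  # deferred import: only needed when the function is called

    session = sagemaker.Session()
    region = session.boto_region_name  # region the session is bound to
    # Resolves the IAM role attached to the notebook; outside SageMaker this
    # may fail, in which case a role ARN has to be supplied by hand.
    role = sagemaker.get_execution_role(sagemaker_session=session)
    return session, region, role


# sagemaker_session, region, role = get_session_context()
```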

Create the Hugging Face model

Next we configure the HuggingFaceModel object by specifying a unique model name, the transformers_version, pytorch_version, and py_version, and the execution role for the endpoint. We also set two environment variables: HF_MODEL_ID, which identifies the model on the Hugging Face Hub, and HF_TASK, which selects the inference task to perform.

[ ]
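The configuration described above can be sketched as follows. The version strings are assumptions and should be checked against the Hugging Face containers available in your region; the Hub ID `dandelin/vilt-b32-finetuned-vqa` and the task name `visual-question-answering` are the values this model is normally published and served under. The helpers `unique_model_name` and `create_model` are hypothetical names.

```python
import time

# Environment variables read by the Hugging Face inference container.
HUB_ENV = {
    "HF_MODEL_ID": "dandelin/vilt-b32-finetuned-vqa",  # model on the Hugging Face Hub
    "HF_TASK": "visual-question-answering",            # pipeline task to serve
}


def unique_model_name(prefix="vilt-b32-finetuned-vqa"):
    """Append a UTC timestamp so repeated runs do not collide on the name."""
    return f"{prefix}-{time.strftime('%Y-%m-%d-%H-%M-%S', time.gmtime())}"


def create_model(role, name=None):
    """Build the HuggingFaceModel; nothing is deployed until deploy() is called."""
    from sagemaker.huggingface import HuggingFaceModel  # deferred import

    return HuggingFaceModel(
        name=name or unique_model_name(),
        transformers_version="4.26.0",  # assumption: pick a supported combination
        pytorch_version="1.13.1",       # assumption
        py_version="py39",              # assumption
        env=HUB_ENV,
        role=role,
    )
```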

Creating a SageMaker Endpoint

Next we deploy the model by invoking the deploy() function. Here we use an ml.m5.xlarge instance, which has 4 vCPUs and 16 GiB of memory.

[ ]
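A sketch of the deployment call, assuming a single instance behind the endpoint (the instance count is not stated in the text). `deploy_model` is a hypothetical wrapper; `model` is expected to be the HuggingFaceModel configured in the previous step.

```python
INSTANCE_TYPE = "ml.m5.xlarge"  # from the text: 4 vCPUs, 16 GiB of memory
INSTANCE_COUNT = 1              # assumption: one instance


def deploy_model(model, instance_type=INSTANCE_TYPE, instance_count=INSTANCE_COUNT):
    """Create a real-time endpoint and return the predictor bound to it."""
    # deploy() blocks until the endpoint is in service, which can take
    # several minutes, and the endpoint is billed until it is deleted.
    return model.deploy(
        initial_instance_count=instance_count,
        instance_type=instance_type,
    )


# predictor = deploy_model(huggingface_model)
```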

Run Inference

To run inference with the visual question answering model, we first need to prepare the input. The input consists of an image and a question (a text string). The image can be stored in S3 and supplied through an S3 presigned URL.

Please replace BUCKET_NAME, IMAGE_NAME, and QUESTION_INPUT with your S3 bucket, image name, and question.

[ ]
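One way to prepare the input, sketched with boto3's `generate_presigned_url`. The placeholder values and the one-hour expiry are assumptions, and `presigned_image_url` is a hypothetical helper; the caller's credentials need read access to the object for the URL to work.

```python
BUCKET_NAME = "REPLACE_WITH_YOUR_BUCKET"      # placeholder
IMAGE_NAME = "REPLACE_WITH_YOUR_IMAGE_KEY"    # placeholder, e.g. an object key ending in .jpg
QUESTION_INPUT = "REPLACE_WITH_YOUR_QUESTION"  # placeholder


def presigned_image_url(bucket, key, expires_in=3600, s3_client=None):
    """Return a time-limited HTTPS URL that grants read access to one S3 object."""
    if s3_client is None:
        import boto3  # deferred import: only needed when no client is passed in

        s3_client = boto3.client("s3")
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds; assumption: one hour is enough
    )


# input_image_url = presigned_image_url(BUCKET_NAME, IMAGE_NAME)
```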

Next we can call the SageMaker endpoint we created in this notebook, providing the image URL and the question for inference.

[ ]
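A sketch of the inference call. The payload shape, a dictionary with `image` and `question` keys, is an assumption based on how the visual question answering pipeline accepts its inputs; `build_vqa_payload` and `ask` are hypothetical helpers, and `predictor` is the object returned by deploy().

```python
def build_vqa_payload(image_url, question):
    """Assemble the request body: one image reference and one question."""
    if not image_url:
        raise ValueError("image_url must not be empty")
    if not question or not question.strip():
        raise ValueError("question must not be empty")
    # Assumed request shape for the visual-question-answering task.
    return {"image": image_url, "question": question.strip()}


def ask(predictor, image_url, question):
    """Send the image URL and question to the endpoint and return its response."""
    return predictor.predict(build_vqa_payload(image_url, question))


# result = ask(predictor, input_image_url, QUESTION_INPUT)
# print(result)
```

Validating the inputs before the call keeps an empty question from costing a round trip to the endpoint.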

Cleanup

After you've finished testing the endpoint, it's important to delete the model and endpoint resources to avoid incurring charges.

[ ]
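Cleanup can be sketched as below, assuming `predictor` is the object returned by deploy(). The wrapper name `cleanup` is hypothetical; `delete_model` and `delete_endpoint` are methods of the SageMaker predictor.

```python
def cleanup(predictor):
    """Delete the model and the endpoint so they stop incurring charges."""
    predictor.delete_model()     # removes the SageMaker model resource
    predictor.delete_endpoint()  # removes the endpoint (and, by default, its configuration)


# cleanup(predictor)
```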

Conclusion

In this tutorial, we deployed the Hugging Face multimodal model vilt-b32-finetuned-vqa to an Amazon SageMaker real-time endpoint.

With SageMaker Hosting, you can host multimodal models and run inference against them.
