Huggingface Sagemaker-sdk - Deploy 🤗 Transformers for inference

Welcome to this getting started guide. We will use the new Hugging Face Inference DLCs and the Amazon SageMaker Python SDK to deploy a transformer model for inference.
In this example we deploy one of the 10 000+ Hugging Face Transformers models directly from the Hub to Amazon SageMaker for inference.

The inference API is built on the transformers pipelines, so you can benefit from all pipeline features. It is modeled on the 🤗 Accelerated Inference API: your inputs need to be defined in the inputs key, and any additional supported pipeline parameters go in the parameters key. Below you can find example requests.

text-classification request body

{
	"inputs": "Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days."
}

question-answering request body

{
	"inputs": {
		"question": "What is used for inference?",
		"context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference."
	}
}

zero-shot classification request body

{
	"inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
	"parameters": {
		"candidate_labels": [
			"refund",
			"legal",
			"faq"
		]
	}
}
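The request bodies above can also be assembled in Python before they are sent to an endpoint. The following is a minimal sketch: `build_request` is a hypothetical helper name, not part of the SageMaker SDK, and it only mirrors the inputs/parameters layout described above.

```python
import json


def build_request(inputs, parameters=None):
    """Return a request body with an `inputs` key and an optional `parameters` key."""
    body = {"inputs": inputs}
    if parameters:
        # Only include `parameters` when extra pipeline arguments are supplied.
        body["parameters"] = parameters
    return body


# zero-shot classification: the candidate labels are passed as pipeline parameters
zero_shot = build_request(
    "Hi, I recently bought a device from your company but it is not working "
    "as advertised and I would like to get reimbursed!",
    parameters={"candidate_labels": ["refund", "legal", "faq"]},
)

# question-answering: `inputs` is itself an object with question and context
question_answering = build_request(
    {
        "question": "What is used for inference?",
        "context": "My Name is Philipp and I live in Nuremberg. "
        "This model is used with sagemaker for inference.",
    }
)

print(json.dumps(zero_shot, indent=2))
```

Keeping `parameters` out of the body when it is empty matches the text-classification and question-answering examples, which send only `inputs`.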

Deploy one of the 10 000+ Hugging Face Transformers to Amazon SageMaker for Inference

This is an experimental feature: the model is loaded only after the endpoint has been created, which can lead to errors, for example with models larger than 10 GB.

To deploy a model directly from the Hub to SageMaker, we need to define two environment variables when creating the HuggingFaceModel:

  • HF_MODEL_ID: defines the model id, which is automatically loaded from huggingface.co/models when the SageMaker endpoint is created. The 🤗 Hub provides 10 000+ models, all available through this environment variable.
  • HF_TASK: defines the task for the 🤗 Transformers pipeline that is used. A full list of tasks can be found in the 🤗 Transformers pipelines documentation.
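The two environment variables above could be wired into a deployment roughly as follows. This is a sketch under stated assumptions, not the notebook's original cell: the model id, the container versions and the instance type are illustrative, and `hub_env` and `deploy_from_hub` are hypothetical helper names. Running the deployment itself requires the `sagemaker` package, AWS credentials and an execution role.

```python
def hub_env(model_id, task):
    """Environment variables that tell the container which Hub model and task to load."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_from_hub(role, env, instance_type="ml.m5.xlarge"):
    """Create a HuggingFaceModel from the Hub settings and deploy it to an endpoint."""
    # Imported lazily so hub_env can be used without the SageMaker SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(
        env=env,
        role=role,
        # Assumed versions; choose a combination for which a Hugging Face DLC exists.
        transformers_version="4.6",
        pytorch_version="1.7",
        py_version="py36",
    )
    # Returns a predictor; the endpoint incurs cost until it is deleted.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Assumed example model and task (a question-answering model from the Hub).
env = hub_env("distilbert-base-uncased-distilled-squad", "question-answering")

# Usage (not executed here, needs an AWS account):
# predictor = deploy_from_hub(role, env)
# predictor.predict({"inputs": {"question": "...", "context": "..."}})
# predictor.delete_endpoint()
```

Deleting the endpoint after testing avoids paying for an idle instance.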
Looking up the SageMaker execution role prints its ARN:

sagemaker role arn: arn:aws:iam::558105141721:role/sagemaker_execution_role

Sending the question-answering request shown earlier to the deployed endpoint returns:

{'score': 0.9987204670906067, 'start': 68, 'end': 77, 'answer': 'sagemaker'}
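The start and end values in this response can be checked against the request. The sketch below assumes they are character offsets into the context string of the question-answering request, which is consistent with the values shown.

```python
# Context from the question-answering request body
context = (
    "My Name is Philipp and I live in Nuremberg. "
    "This model is used with sagemaker for inference."
)

# Response returned by the endpoint
result = {"score": 0.9987204670906067, "start": 68, "end": 77, "answer": "sagemaker"}

# Slice the context with the returned offsets to recover the answer span
span = context[result["start"]:result["end"]]
print(span)  # prints: sagemaker
```

The score is the model's confidence in the extracted span.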