Stable Diffusion Webui Async Inference Sagemaker Studio
Generative Fill example on Amazon SageMaker using DLC container.
In this notebook, we explore how to build a generative fill application and host Stable Diffusion, ControlNet, and Segment Anything models on a SageMaker asynchronous endpoint using BYOC (Bring Your Own Container).
Under the hood, this notebook uses stable-diffusion-webui and its extensions to generate images.
Note - Amazon Web Services has no control or authority over the third-party generative AI service referenced in this Workshop, and does not make any representations or warranties that the third-party generative AI service is secure, virus-free, operational, or compatible with your production environment and standards. You are responsible for making your own independent assessment of the content provided in this Workshop, and take measures to ensure that you comply with your own specific quality control practices and standards, and the local rules, laws, regulations, licenses and terms of use that apply to you, your content, and the third-party generative AI service referenced in this Workshop. The content of this Workshop: (a) is for informational purposes only, (b) represents current Amazon Web Services product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from Beijing Sinnet Technology Co., Ltd. (“Sinnet”), Ningxia Western Cloud Data Technology Co., Ltd. (“NWCD”), Amazon Connect Technology Services (Beijing) Co., Ltd. (“Amazon”), or their respective affiliates, suppliers or licensors. Amazon Web Services’ content, products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of Sinnet, NWCD or Amazon to their respective customers are controlled by the applicable customer agreements.
This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.
Build Docker image and push to ECR.
Initialize the variables for the SageMaker default bucket, execution role, AWS account ID, and current AWS region.
Execute the script to build the Docker image for the SageMaker endpoint and push it to Amazon ECR.
Upload a dummy file to S3 to satisfy the SageMaker endpoint's requirement for model data.
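SageMaker requires a `ModelDataUrl` pointing at a `model.tar.gz` even when, as here, the container downloads the real model weights itself at startup, so a placeholder archive is enough. A minimal sketch (the S3 bucket and key in the commented upload are placeholders, not the notebook's actual values):

```python
import tarfile
import tempfile
from pathlib import Path

def make_dummy_model_archive(out_dir: str) -> str:
    """Create a placeholder model.tar.gz so the endpoint's ModelDataUrl
    requirement is satisfied; the container fetches real weights itself."""
    out_dir = Path(out_dir)
    dummy = out_dir / "dummy"
    dummy.write_text("placeholder")  # content is irrelevant, it is never read
    archive = out_dir / "model.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(dummy, arcname="dummy")
    return str(archive)

workdir = tempfile.mkdtemp()
archive_path = make_dummy_model_archive(workdir)

# Upload with boto3 (requires AWS credentials; names below are placeholders):
# import boto3
# boto3.client("s3").upload_file(
#     archive_path, "<sagemaker-default-bucket>",
#     "stable-diffusion-webui/data/model.tar.gz")
```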
Deploy to SageMaker Asynchronous Endpoint
Initialize the variable for the URI of the Docker inference image.
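The image URI follows the standard ECR naming convention, which can be assembled from the account ID and region resolved earlier. A small helper (the account ID, region, and repository name below are hypothetical):

```python
def ecr_image_uri(account_id: str, region: str, repository: str,
                  tag: str = "latest") -> str:
    """Build the ECR image URI used as the endpoint's inference image."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"

# Hypothetical values for illustration:
inference_image_uri = ecr_image_uri("123456789012", "us-west-2",
                                    "sd-webui-inference")
```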
Define the models configuration so that the models can be downloaded from one of the supported sources: HTTP, S3, or Hugging Face. Note: as an example, the LoRA model 2bNierAutomataLora_v2b.safetensors and the ControlNet model control_sd15_canny.pth are downloaded directly from Civitai and Hugging Face once the SageMaker endpoint is created.
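The shape of such a configuration can be sketched as below. The schema is illustrative only: the key names and the placeholder URLs/repo IDs are assumptions, not the exact format the container expects.

```python
import json

# Illustrative schema only; the exact keys the container expects may differ.
models_config = {
    "2bNierAutomataLora_v2b.safetensors": {
        "source": "http",                    # one of: http, s3, huggingface
        "url": "<civitai-download-url>",     # placeholder, not a real URL
        "target_dir": "models/Lora",
    },
    "control_sd15_canny.pth": {
        "source": "huggingface",
        "repo_id": "<huggingface-repo-id>",  # placeholder
        "target_dir": "models/ControlNet",
    },
}

# The config is typically serialized and handed to the container, e.g. via an
# environment variable or an S3 object it reads at startup.
models_config_json = json.dumps(models_config)
```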
Define the model, instance type, and initial instance count for the SageMaker endpoint.
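These settings map onto the request shapes of the SageMaker `CreateModel` and `CreateEndpointConfig` APIs. A sketch with hypothetical names; the role ARN, image URI, bucket, and instance type are placeholders/assumptions:

```python
endpoint_name = "sd-webui-async"        # hypothetical endpoint name
model_name = f"{endpoint_name}-model"

create_model_args = {
    "ModelName": model_name,
    "ExecutionRoleArn": "<sagemaker-execution-role-arn>",  # placeholder
    "PrimaryContainer": {
        "Image": "<ecr-image-uri>",                        # placeholder
        "ModelDataUrl": "s3://<bucket>/stable-diffusion-webui/data/model.tar.gz",
    },
}

create_endpoint_config_args = {
    "EndpointConfigName": f"{endpoint_name}-config",
    "ProductionVariants": [{
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "InstanceType": "ml.g4dn.xlarge",  # assumption: a GPU instance type
        "InitialInstanceCount": 1,
    }],
}

# With AWS credentials configured, these would be passed to boto3:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_model(**create_model_args)
# sm.create_endpoint_config(**create_endpoint_config_args)
```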
Define the SageMaker asynchronous inference config.
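The asynchronous inference config tells SageMaker where to write results and how many concurrent invocations each instance may handle. A sketch of the structure passed as `AsyncInferenceConfig` to `CreateEndpointConfig`; the bucket path and concurrency value are assumptions:

```python
async_inference_config = {
    "OutputConfig": {
        # Completed results are written here; clients poll this location.
        "S3OutputPath": "s3://<bucket>/stable-diffusion-webui/asyncinvoke/out/",
    },
    "ClientConfig": {
        # Cap concurrent requests per instance to avoid GPU OOM (assumed value).
        "MaxConcurrentInvocationsPerInstance": 1,
    },
}

# Passed alongside the endpoint config, e.g.:
# sm.create_endpoint_config(..., AsyncInferenceConfig=async_inference_config)
```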
We use asynchronous inference here because it is better suited to workloads with large payload sizes and long inference processing times.
Generate initial image using text prompt
Helper function for S3.
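A typical pair of helpers splits an `s3://` URI into bucket and key, then downloads the object. A sketch; `boto3` is imported lazily so the pure parsing helper has no AWS dependency:

```python
from urllib.parse import urlparse

def parse_s3_uri(uri: str):
    """Split an s3:// URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

def download_s3_object(uri: str, local_path: str) -> None:
    """Download an S3 object to a local file (requires AWS credentials)."""
    import boto3  # lazy import keeps parse_s3_uri dependency-free
    bucket, key = parse_s3_uri(uri)
    boto3.client("s3").download_file(bucket, key, local_path)
```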
Wait until the asynchronous inference is done, since we use asynchronous inference for image generation.
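Because `invoke_endpoint_async` returns immediately with an `OutputLocation`, the client must poll S3 until the result object appears. A sketch of such a loop; the existence check is injected so the loop itself needs no AWS access:

```python
import time

def wait_for_async_result(exists_fn, poll_seconds=5, timeout_seconds=600):
    """Poll until exists_fn() is truthy (the async output object appeared).

    exists_fn is injected for testability; in the notebook it would be a
    head_object check against the OutputLocation returned by
    invoke_endpoint_async.
    """
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        if exists_fn():
            return True
        time.sleep(poll_seconds)
    raise TimeoutError("async inference result did not appear in time")

# Example exists_fn using boto3 (sketch; bucket/key are placeholders):
# import boto3, botocore
# def s3_object_exists():
#     try:
#         boto3.client("s3").head_object(Bucket="<bucket>", Key="<output-key>")
#         return True
#     except botocore.exceptions.ClientError:
#         return False
```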
Process the generated images from the asynchronous inference result.
Expand initial image using text prompt and ControlNet models
ControlNet is a neural network structure to control diffusion models by adding extra conditions.
Define the payload for the SageMaker inference request.
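The payload roughly follows the stable-diffusion-webui img2img API with the ControlNet extension; the field names below are assumptions and may differ by webui version, and the base64 image is a placeholder. Asynchronous invocation then uploads the payload to S3 and passes its location to the endpoint:

```python
import json

# Sketch of an outpainting payload in the style of the stable-diffusion-webui
# API with the ControlNet extension; exact field names may differ by version.
payload = {
    "task": "image-to-image",                   # assumed routing field
    "prompt": "a sunny beach, photorealistic",
    "negative_prompt": "blurry, low quality",
    "steps": 30,
    "image": "<base64-encoded-initial-image>",  # placeholder
    "controlnet": {
        "model": "control_sd15_canny",
        "module": "canny",
    },
}
body = json.dumps(payload)

# Async invocation references the payload by S3 location rather than inline:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# resp = runtime.invoke_endpoint_async(
#     EndpointName="<endpoint-name>",
#     InputLocation="s3://<bucket>/asyncinvoke/in/payload.json",
# )
# output_location = resp["OutputLocation"]
```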
Wait until the asynchronous inference is done, since we use asynchronous inference for image generation.
Process the generated images from the asynchronous inference result.
Run generative fill application built with Gradio framework
[Optional] Create an auto-scaling policy for the SageMaker endpoint in case you want it to scale automatically based on specific metrics.
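Asynchronous endpoints are commonly scaled with Application Auto Scaling on the `ApproximateBacklogSizePerInstance` metric, which also allows scaling to zero when the queue is empty. A sketch of the request dicts; the endpoint name, capacities, and target value are assumptions:

```python
endpoint_name = "sd-webui-async"   # hypothetical
variant_name = "AllTraffic"
resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"

register_args = {
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "MinCapacity": 0,   # async endpoints can scale to zero
    "MaxCapacity": 2,   # assumed ceiling
}

scaling_policy_args = {
    "PolicyName": f"{endpoint_name}-backlog-scaling",
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "TargetValue": 5.0,   # assumed backlog target per instance
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateBacklogSizePerInstance",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
            "Statistic": "Average",
        },
    },
}

# With AWS credentials configured:
# import boto3
# aas = boto3.client("application-autoscaling")
# aas.register_scalable_target(**register_args)
# aas.put_scaling_policy(**scaling_policy_args)
```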
Resource cleanup.
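Cleanup deletes the endpoint, then its endpoint config, then the model, in that dependency order. A sketch; the `-config`/`-model` naming suffixes are assumptions matching hypothetical names used earlier, and the client is passed in so the function can be exercised without AWS:

```python
def cleanup(sm_client, endpoint_name: str) -> None:
    """Delete the endpoint, its config, and the model, in dependency order."""
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(
        EndpointConfigName=f"{endpoint_name}-config")
    sm_client.delete_model(ModelName=f"{endpoint_name}-model")

# Usage (requires AWS credentials):
# import boto3
# cleanup(boto3.client("sagemaker"), "sd-webui-async")
```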
Notebook CI Test Results
This notebook was tested in multiple regions. The test results are as follows, except for us-west-2, which is shown at the top of the notebook.