Stable Diffusion v2.1 using OpenVINO TorchDynamo backend#
This Jupyter notebook can be launched after a local installation only.
Stable Diffusion v2 is the next generation of the Stable Diffusion model, a Text-to-Image latent diffusion model created by researchers and engineers from Stability AI and LAION.
General diffusion models are machine learning systems trained to denoise random Gaussian noise step by step to arrive at a sample of interest, such as an image. Diffusion models have been shown to achieve state-of-the-art results for generating image data. One downside, however, is that the reverse denoising process is slow. In addition, these models consume a lot of memory because they operate in pixel space, which becomes unreasonably expensive when generating high-resolution images. It is therefore challenging both to train these models and to use them for inference. OpenVINO brings capabilities to run model inference on Intel hardware and opens the door to the fantastic world of diffusion models for everyone!
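To put the pixel-space cost in perspective, here is a back-of-the-envelope sketch comparing the number of values in a pixel-space image tensor with the latent tensor that a latent diffusion model denoises instead. The sizes are assumptions that are not stated in this notebook: a 512x512 RGB output, a VAE downsampling factor of 8, and 4 latent channels.

```python
# Assumed sizes (not taken from this notebook): 512x512 RGB output,
# VAE downsampling factor of 8, and 4 latent channels.
IMAGE_SIZE = 512
RGB_CHANNELS = 3
VAE_SCALE_FACTOR = 8
LATENT_CHANNELS = 4

# Number of values the denoiser would process per step in pixel space.
pixel_elements = IMAGE_SIZE * IMAGE_SIZE * RGB_CHANNELS

# Number of values it processes per step in latent space.
latent_size = IMAGE_SIZE // VAE_SCALE_FACTOR
latent_elements = latent_size * latent_size * LATENT_CHANNELS

reduction = pixel_elements / latent_elements

print(f"pixel-space tensor:  {pixel_elements} values")
print(f"latent-space tensor: {latent_elements} values")
print(f"reduction factor:    {reduction:.0f}x")
```

Under these assumptions, each denoising step operates on far fewer values in latent space, which is the main reason latent diffusion is cheaper than denoising full-resolution images directly.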
This notebook demonstrates how to run a Stable Diffusion model using the Diffusers library and the OpenVINO TorchDynamo backend for Text-to-Image and Image-to-Image generation tasks.
The notebook contains the following steps:
Create pipeline with PyTorch models.
Add OpenVINO optimization using OpenVINO TorchDynamo backend.
Run Stable Diffusion pipeline with OpenVINO.
Installation Instructions#
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.
Prerequisites#
%pip install -q "torch>=2.2" transformers diffusers "gradio>=4.19" ipywidgets --extra-index-url https://download.pytorch.org/whl/cpu
%pip install -q "openvino>=2024.1.0"
import torch
from diffusers import StableDiffusionPipeline
Stable Diffusion with Diffusers library#
To work with Stable Diffusion v2.1, we will use the Hugging Face Diffusers library. To experiment with Stable Diffusion models, Diffusers exposes the StableDiffusionPipeline and StableDiffusionImg2ImgPipeline, similar to the other Diffusers pipelines.
The code below demonstrates how to create the StableDiffusionPipeline using the stable-diffusion-2-1-base model:
model_id = "stabilityai/stable-diffusion-2-1-base"
# Pipeline for text-to-image generation
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32)
OpenVINO TorchDynamo backend#
The OpenVINO TorchDynamo backend lets you enable OpenVINO support for PyTorch models with minimal changes to the original PyTorch script. It speeds up PyTorch code by JIT-compiling it into optimized kernels. By default, PyTorch code runs in eager mode, but when torch.compile is used it goes through the following steps:
Graph acquisition - the model is rewritten as blocks of subgraphs that are either:
compiled by TorchDynamo and “flattened”,
left to fall back to eager mode because of unsupported Python constructs (such as control-flow code).
Graph lowering - all PyTorch operations are decomposed into their constituent kernels specific to the chosen backend.
Graph compilation - the kernels call their corresponding low-level device-specific operations.
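The graph acquisition step can be pictured with a toy sketch. This is not TorchDynamo's implementation: it only illustrates how a sequence of operations might be split into compiled subgraphs and eager fallbacks. The operation names, the `SUPPORTED_OPS` set, and the `partition_ops` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Segment:
    mode: str        # "compiled" or "eager"
    ops: List[str]   # operations that belong to this segment


def partition_ops(ops: Iterable[str], supported: Set[str]) -> List[Segment]:
    """Group consecutive ops into compiled subgraphs and eager fallbacks."""
    segments: List[Segment] = []
    for op in ops:
        mode = "compiled" if op in supported else "eager"
        if segments and segments[-1].mode == mode:
            # Same mode as the previous op: extend the current segment.
            segments[-1].ops.append(op)
        else:
            # Mode changed: this is where a "graph break" would occur.
            segments.append(Segment(mode, [op]))
    return segments


# Hypothetical set of ops the backend can compile.
SUPPORTED_OPS = {"conv", "add", "relu", "matmul"}

# Hypothetical trace; "python_if" stands in for unsupported control flow.
trace = ["conv", "relu", "python_if", "matmul", "add"]

segments = partition_ops(trace, SUPPORTED_OPS)
for seg in segments:
    print(seg.mode, seg.ops)
```

In this toy trace, the unsupported construct splits the model into two compiled subgraphs with one eager segment between them. Fewer such breaks generally leave more work for the backend to optimize.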
Select the device for inference and choose whether to save the optimized model files to disk after the first application run. Saving makes them available for subsequent application runs, reducing first-inference latency. Read more about the available Environment Variables options.
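To see the effect of compilation on first-inference latency, it can help to time the first call separately from later ones. The sketch below is a minimal, hedged example: `time_calls` is a hypothetical helper, and `FakeCompiledModel` is a stand-in that only simulates a slow first call, so the block runs without downloading any model.

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call fn repeatedly and return the wall-clock latency of each call."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


class FakeCompiledModel:
    """Stand-in for a compiled model: slow on the first call only."""

    def __init__(self) -> None:
        self.compiled = False

    def __call__(self) -> None:
        if not self.compiled:
            time.sleep(0.1)    # simulated one-time compilation cost
            self.compiled = True
        time.sleep(0.005)      # simulated inference cost


latencies = time_calls(FakeCompiledModel(), repeats=3)
for i, seconds in enumerate(latencies):
    print(f"call {i}: {seconds * 1000:.1f} ms")
```

With the real pipeline, the same helper could wrap a call such as `lambda: pipe(prompt)`. Comparing the first-call latency across two application runs, with and without model caching, would show how much of the start-up cost the cache removes.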
import requests
r = requests.get(
url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py",
)
open("notebook_utils.py", "w").write(r.text)
from notebook_utils import device_widget
device = device_widget()
device
Dropdown(description='Device:', options=('CPU', 'GPU.0', 'GPU.1', 'AUTO'), value='CPU')
import ipywidgets as widgets
model_caching = widgets.Dropdown(
options=[True, False],
value=True,
description="Model caching:",
disabled=False,
)
model_caching
Dropdown(description='Model caching:', options=(True, False), value=True)
To use the torch.compile() method, you only need to add an import statement and specify the OpenVINO backend:
# this import is required to activate the openvino backend for torchdynamo
import openvino.torch # noqa: F401
pipe.unet = torch.compile(
pipe.unet,
backend="openvino",
options={"device": device.value, "model_caching": model_caching.value},
)
Note: Read more about the available OpenVINO backends: https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html#how-to-use
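When the same code runs outside Jupyter, the widgets are not available, so the `options` dictionary has to be built from plain values. The sketch below reads them from two environment variables. The names `SD_DEVICE` and `SD_MODEL_CACHING` and the helper `openvino_compile_options` are invented for this example and are not OpenVINO settings. Only the `device` and `model_caching` keys come from the code above.

```python
import os
from typing import Dict, Mapping, Optional, Union


def openvino_compile_options(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[str, bool]]:
    """Build the torch.compile options dict from plain string settings.

    SD_DEVICE and SD_MODEL_CACHING are hypothetical variable names
    chosen for this sketch.
    """
    env = os.environ if env is None else env
    device = env.get("SD_DEVICE", "CPU")
    caching_text = env.get("SD_MODEL_CACHING", "1").strip().lower()
    model_caching = caching_text in {"1", "true", "yes"}
    return {"device": device, "model_caching": model_caching}


# Example with explicit settings instead of the real environment.
options = openvino_compile_options(
    {"SD_DEVICE": "GPU.0", "SD_MODEL_CACHING": "false"}
)
print(options)
```

The resulting dictionary can then be passed as `options=` to `torch.compile(pipe.unet, backend="openvino", ...)` in place of the widget values.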
Run Image generation#
prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image
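The pipeline call accepts further generation parameters. The sketch below wraps the call in a hypothetical helper, `generate_image`, that forwards `negative_prompt`, `num_inference_steps`, and `guidance_scale`, which are arguments of the Diffusers `StableDiffusionPipeline` call. So that the block runs on its own, it uses `fake_pipe`, a stand-in that only mimics the shape of the pipeline's return value.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def generate_image(
    pipe: Callable[..., Any],
    prompt: str,
    negative_prompt: Optional[str] = None,
    num_inference_steps: int = 50,
    guidance_scale: float = 7.5,
) -> Any:
    """Run a text-to-image pipeline and return the first generated image."""
    result = pipe(
        prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]


def fake_pipe(prompt: str, **kwargs: Any) -> SimpleNamespace:
    """Stand-in for the real pipeline; returns a string instead of an image."""
    steps = kwargs["num_inference_steps"]
    return SimpleNamespace(images=[f"<image for '{prompt}' with {steps} steps>"])


preview = generate_image(
    fake_pipe,
    "a photo of an astronaut riding a horse on mars",
    num_inference_steps=20,
)
print(preview)
```

With the real pipeline, passing `pipe` instead of `fake_pipe` would return an image object. Fewer inference steps usually shorten generation time at some cost in image quality.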
Interactive demo#
Now you can start the demo, choose the inference mode, define prompts (and an input image for Image-to-Image generation), and run the inference pipeline. Optionally, you can also change some input parameters.
import requests
from pathlib import Path

if not Path("gradio_helper.py").exists():
    r = requests.get(
        url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/notebooks/stable-diffusion-torchdynamo-backend/gradio_helper.py"
    )
    open("gradio_helper.py", "w").write(r.text)

from gradio_helper import make_demo

demo = make_demo(model_id)
try:
    demo.queue().launch(debug=True)
except Exception:
    demo.queue().launch(share=True, debug=True)
# if you are launching remotely, specify server_name and server_port
# demo.launch(server_name='your server name', server_port='server port in int')
# Read more in the docs: https://gradio.app/docs/
Automatic1111 Stable Diffusion WebUI is an open-source repository that hosts a browser-based interface for Stable Diffusion-based image generation. It allows users to create realistic and creative images from text prompts. Stable Diffusion WebUI is supported on Intel CPUs, Intel integrated GPUs, and Intel discrete GPUs by leveraging the OpenVINO torch.compile capability. Detailed instructions are available in the Stable Diffusion WebUI repository.