Colorize grayscale images using DDColor and OpenVINO#
This Jupyter notebook can be launched after a local installation only.
Image colorization is the process of adding color to grayscale images. Initially captured in black and white, these images are transformed into vibrant, lifelike representations by estimating RGB colors. This technology enhances both aesthetic appeal and perceptual quality. Historically, artists manually applied colors to monochromatic photographs, a painstaking task that could take up to a month for a single image. However, with advancements in information technology and the rise of deep neural networks, automated image colorization has become increasingly important.
DDColor is one of the most advanced image colorization methods available
today. It uses dual decoders, a pixel decoder and a query-based color
decoder, and stands out for its ability to produce photo-realistic
colorization, particularly in complex scenes with multiple objects and
diverse contexts.
More details about this approach can be found in the original model repository and paper.
In this tutorial, we show how to convert and run DDColor using OpenVINO. Additionally, we demonstrate how to optimize this model using NNCF.
🪄 Let’s start exploring the magic of image colorization!
Installation Instructions#
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.
import platform

if platform.system() == "Darwin":
    %pip install -q "numpy<2.0.0"
%pip install -q "nncf>=2.11.0" "torch>=2.1" "torchvision" "timm" "opencv_python" "pillow" "PyYAML" "scipy" "scikit-image" "datasets" "gradio>=4.19" --extra-index-url
%pip install -Uq "openvino>=2024.3.0"
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.13.1 requires typing-extensions<4.6.0,>=3.6.6, but you have typing-extensions 4.12.2 which is incompatible.
Note: you may need to restart the kernel to use updated packages.
from pathlib import Path
import requests
if not Path("").exists():
r = requests.get(
open("", "w").write(r.text)
if not Path("").exists():
r = requests.get(
open("", "w").write(r.text)
# Read more about telemetry collection at
from notebook_utils import collect_telemetry
from cmd_helper import clone_repo
# The pipeline module location depends on the layout of the cloned DDColor repository
if Path("DDColor/inference").exists():
    try:
        from inference.colorization_pipeline_hf import DDColorHF, ImageColorizationPipelineHF
    except Exception:
        from inference.colorization_pipeline_hf import DDColorHF, ImageColorizationPipelineHF
else:
    try:
        from infer_hf import DDColorHF, ImageColorizationPipelineHF
    except Exception:
        from infer_hf import DDColorHF, ImageColorizationPipelineHF
Load PyTorch model#
The model repository provides several models from the DDColor family. We will use DDColor-T, the most lightweight version of the DDColor model, but the steps demonstrated in this tutorial also apply to other models from the DDColor family.
import torch
model_name = "ddcolor_paper_tiny"
ddcolor_model = DDColorHF.from_pretrained(f"piddnad/{model_name}")
colorizer = ImageColorizationPipelineHF(model=ddcolor_model, input_size=512)
colorizer.device = torch.device("cpu")
Run PyTorch model inference#
import cv2
import PIL
IMG_PATH = "DDColor/assets/test_images/Ansel Adams _ Moore Photography.jpeg"
img = cv2.imread(IMG_PATH)
PIL.Image.fromarray(img[:, :, ::-1])

image_out = colorizer.process(img)
PIL.Image.fromarray(image_out[:, :, ::-1])

Convert PyTorch model to OpenVINO Intermediate Representation#
OpenVINO supports PyTorch models via conversion to OpenVINO Intermediate
Representation (IR), using the OpenVINO model conversion API. The
ov.convert_model function accepts the original PyTorch model instance and
an example input for tracing, and returns an ov.Model representing this
model in the OpenVINO framework. The converted model can be saved to disk
with the ov.save_model function or loaded directly on a device with
core.compile_model. By default, ov.save_model compresses weights to FP16,
which is why the saved IR is referred to as the FP16 model later in this
tutorial.
import openvino as ov
import torch
OV_COLORIZER_PATH = Path("ddcolor.xml")
if not OV_COLORIZER_PATH.exists():
    ov_model = ov.convert_model(ddcolor_model, example_input=torch.ones((1, 3, 512, 512)), input=[1, 3, 512, 512])
    ov.save_model(ov_model, OV_COLORIZER_PATH)
Run OpenVINO model inference#
Select one of the supported devices for inference using the dropdown list.
from notebook_utils import device_widget
core = ov.Core()
device = device_widget()
Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')
compiled_model = core.compile_model(OV_COLORIZER_PATH, device.value)
import cv2
import numpy as np
import torch
import torch.nn.functional as F
def process(img, compiled_model):
    # Preprocess input image
    height, width = img.shape[:2]

    # Normalize to [0, 1] range
    img = (img / 255.0).astype(np.float32)
    orig_l = cv2.cvtColor(img, cv2.COLOR_BGR2Lab)[:, :, :1]  # (h, w, 1)

    # Resize rgb image -> lab -> get grey -> rgb
    img = cv2.resize(img, (512, 512))
    img_l = cv2.cvtColor(img, cv2.COLOR_BGR2Lab)[:, :, :1]
    img_gray_lab = np.concatenate((img_l, np.zeros_like(img_l), np.zeros_like(img_l)), axis=-1)
    img_gray_rgb = cv2.cvtColor(img_gray_lab, cv2.COLOR_LAB2RGB)

    # Transpose HWC -> CHW and add batch dimension
    tensor_gray_rgb = torch.from_numpy(img_gray_rgb.transpose((2, 0, 1))).float().unsqueeze(0)

    # Run model inference
    output_ab = compiled_model(tensor_gray_rgb)[0]

    # Postprocess result
    # resize ab -> concat original l -> rgb
    output_ab_resize = F.interpolate(torch.from_numpy(output_ab), size=(height, width))[0].float().numpy().transpose(1, 2, 0)
    output_lab = np.concatenate((orig_l, output_ab_resize), axis=-1)
    output_bgr = cv2.cvtColor(output_lab, cv2.COLOR_LAB2BGR)

    output_img = (output_bgr * 255.0).round().astype(np.uint8)

    return output_img
ov_processed_img = process(img, compiled_model)
PIL.Image.fromarray(ov_processed_img[:, :, ::-1])
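The process function keeps the full-resolution L (lightness) channel of the input and takes only the two chroma channels (a, b) from the model, which runs at 512x512. The sketch below illustrates that recombination step on plain Python lists. The helper names upsample_nearest and merge_l_ab are hypothetical, and the integer index rule is an assumption that only approximates what F.interpolate does in its default nearest mode.

```python
def upsample_nearest(channel, out_h, out_w):
    """Nearest-neighbor resize of a 2D list (rows of floats) to out_h x out_w."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def merge_l_ab(orig_l, pred_a, pred_b):
    """Combine full-resolution L with upsampled a/b into per-pixel (L, a, b) tuples."""
    out_h, out_w = len(orig_l), len(orig_l[0])
    a_up = upsample_nearest(pred_a, out_h, out_w)
    b_up = upsample_nearest(pred_b, out_h, out_w)
    return [
        [(orig_l[y][x], a_up[y][x], b_up[y][x]) for x in range(out_w)]
        for y in range(out_h)
    ]


# Toy data: a 4x4 lightness map and 2x2 chroma predictions (illustrative values).
orig_l = [[float(y * 4 + x) for x in range(4)] for y in range(4)]
pred_a = [[1.0, 2.0], [3.0, 4.0]]
pred_b = [[-1.0, -2.0], [-3.0, -4.0]]

lab = merge_l_ab(orig_l, pred_a, pred_b)
print(lab[0][0])  # lightness from the original, chroma from the low-res prediction
print(lab[3][3])
```

Because the lightness channel is never resized, fine detail from the original photograph is preserved even though the chroma prediction is made at a fixed lower resolution.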

Optimize OpenVINO model using NNCF#
NNCF enables post-training quantization by adding quantization layers into
the model graph and then using a subset of the training dataset to
initialize the parameters of these additional quantization layers.
Quantized operations are executed in INT8 instead of FP32, making model
inference faster.
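To make the idea of initializing quantization parameters from calibration data more concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is an illustration only: the function names and sample values are hypothetical, and NNCF's actual range estimation and quantization schemes are more elaborate than this min-max approach.

```python
def calibrate_scale(samples, num_bits=8):
    """Derive a symmetric scale from the largest absolute value in the calibration samples."""
    max_abs = max(abs(v) for sample in samples for v in sample)
    q_max = 2 ** (num_bits - 1) - 1  # 127 for INT8
    return max_abs / q_max if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to integers in the signed num_bits range, clipping outliers."""
    q_max = 2 ** (num_bits - 1) - 1
    q_min = -q_max - 1
    return [max(q_min, min(q_max, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map integers back to approximate float values."""
    return [q * scale for q in q_values]


# Hypothetical activation values observed while running calibration samples.
calibration_samples = [[-0.5, 0.25, 1.0], [0.1, -1.27, 0.64]]
scale = calibrate_scale(calibration_samples)

q = quantize([0.5, -1.27, 2.0], scale)  # 2.0 lies outside the calibrated range and is clipped
restored = dequantize(q, scale)
print(scale, q, restored)
```

The clipped value shows why the calibration subset should be representative: values outside the range seen during calibration cannot be represented after quantization.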
The optimization process contains the following steps:
1. Create a calibration dataset for quantization.
2. Run nncf.quantize() to obtain the quantized model.
3. Save the model using openvino.save_model().
Please select below whether you would like to run quantization to improve model inference speed.
from notebook_utils import quantization_widget
to_quantize = quantization_widget()
Checkbox(value=True, description='Quantization')
import requests
OV_INT8_COLORIZER_PATH = Path("ddcolor_int8.xml")
compiled_int8_model = None
if not Path("").exists():
r = requests.get(
open("", "w").write(r.text)
%load_ext skip_kernel_extension
Collect quantization dataset#
We use a portion of the ummagumm-a/colorization_dataset dataset from Hugging Face as calibration data.
%%skip not $to_quantize.value
from datasets import load_dataset

subset_size = 300
calibration_data = []

if not OV_INT8_COLORIZER_PATH.exists():
    dataset = load_dataset("ummagumm-a/colorization_dataset", split="train", streaming=True).shuffle(seed=42).take(subset_size)
    for idx, batch in enumerate(dataset):
        if idx >= subset_size:
            break
        # Apply the same preprocessing as in process(): grayscale -> Lab -> gray RGB
        img = np.array(batch["conditioning_image"])
        img = (img / 255.0).astype(np.float32)
        img = cv2.resize(img, (512, 512))
        img_l = cv2.cvtColor(np.stack([img, img, img], axis=2), cv2.COLOR_BGR2Lab)[:, :, :1]
        img_gray_lab = np.concatenate((img_l, np.zeros_like(img_l), np.zeros_like(img_l)), axis=-1)
        img_gray_rgb = cv2.cvtColor(img_gray_lab, cv2.COLOR_LAB2RGB)
        # Transpose HWC -> CHW, add batch dimension, and store the sample
        image = np.expand_dims(img_gray_rgb.transpose((2, 0, 1)).astype(np.float32), axis=0)
        calibration_data.append(image)
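The cell above streams the dataset, shuffles it, and takes the first subset_size items instead of downloading everything. Streaming datasets generally cannot be shuffled exactly, so shuffling is approximated with a fixed-size buffer. The following sketch shows that general idea using only the standard library. It is not the datasets library's implementation, and buffered_shuffle, take, and the buffer size are hypothetical choices for illustration.

```python
import random
from itertools import islice


def buffered_shuffle(stream, buffer_size, seed):
    """Yield items from an iterable in approximately random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its place.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Stream exhausted: drain the remaining items in random order.
    rng.shuffle(buffer)
    yield from buffer


def take(stream, n):
    """Lazily return at most n items from the stream."""
    return islice(stream, n)


records = range(10_000)  # stand-in for a dataset that is too large to download fully
subset = list(take(buffered_shuffle(records, buffer_size=100, seed=42), 300))
print(len(subset), len(set(subset)), max(subset))
```

Only the first few hundred records are ever read from the source, which is what makes this pattern attractive for collecting a small calibration set.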
Perform model quantization#
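%%skip not $to_quantize.value
import nncf

if not OV_INT8_COLORIZER_PATH.exists():
    ov_model = core.read_model(OV_COLORIZER_PATH)
    # Calibrate on the collected samples and insert INT8 quantization operations
    quantized_model = nncf.quantize(
        model=ov_model,
        calibration_dataset=nncf.Dataset(calibration_data),
        subset_size=subset_size,
    )
    ov.save_model(quantized_model, OV_INT8_COLORIZER_PATH)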
INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, tensorflow, onnx, openvino
Run INT8 model inference#
from IPython.display import display
compiled_int8_model = core.compile_model(OV_INT8_COLORIZER_PATH, device.value)
img = cv2.imread("DDColor/assets/test_images/Ansel Adams _ Moore Photography.jpeg")
img_out = process(img, compiled_int8_model)
display(PIL.Image.fromarray(img_out[:, :, ::-1]))

Compare FP16 and INT8 model size#
fp16_ir_model_size = OV_COLORIZER_PATH.with_suffix(".bin").stat().st_size / 2**20
print(f"FP16 model size: {fp16_ir_model_size:.2f} MB")
quantized_model_size = OV_INT8_COLORIZER_PATH.with_suffix(".bin").stat().st_size / 2**20
print(f"INT8 model size: {quantized_model_size:.2f} MB")
print(f"Model compression rate: {fp16_ir_model_size / quantized_model_size:.3f}")
FP16 model size: 104.89 MB
INT8 model size: 52.97 MB
Model compression rate: 1.980
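The reported compression rate is close to 2 because an FP16 weight occupies two bytes while an INT8 weight occupies one. The short calculation below reproduces the rate from the sizes printed above and, under a simplifying assumption stated in the comments, estimates what share of the weights ended up in INT8. The estimate is illustrative only and is not taken from the notebook.

```python
# Sizes reported above, in MB as printed by the notebook (bytes / 2**20).
fp16_size_mb = 104.89
int8_size_mb = 52.97

compression_rate = fp16_size_mb / int8_size_mb
print(f"Model compression rate: {compression_rate:.3f}")

# Idealized upper bound: FP16 stores 2 bytes per weight, INT8 stores 1.
ideal_rate = 2 / 1

# Assumption: a fraction p of the weights is stored at 1 byte and the rest stays
# at 2 bytes, ignoring scales, zero points, and other metadata.
# Then int8 = fp16 * (1 - p / 2), so p = 2 * (1 - int8 / fp16).
int8_fraction = 2 * (1 - int8_size_mb / fp16_size_mb)
print(f"Estimated share of weights stored as INT8: {int8_fraction:.1%}")
```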
Compare inference time of the FP16 and INT8 models#
To measure the inference performance of the OpenVINO FP16 and INT8 models, use the Benchmark Tool.
NOTE: For the most accurate performance estimation, it is recommended to run benchmark_app
in a terminal/command prompt after closing other applications.
!benchmark_app -m $OV_COLORIZER_PATH -d $device.value -api async -shape "[1,3,512,512]" -t 15
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ] Device info:
[ INFO ] AUTO
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(AUTO) performance hint will be set to PerformanceMode.THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 42.87 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ] x (node: x) : f32 / [...] / [1,3,512,512]
[ INFO ] Model outputs:
[ INFO ] *NO_NAME* (node: __module.refine_net.0.0/aten::_convolution/Add) : f32 / [...] / [1,2,512,512]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[ INFO ] Reshaping model: 'x': [1,3,512,512]
[ INFO ] Reshape model took 0.04 ms
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ] x (node: x) : u8 / [N,C,H,W] / [1,3,512,512]
[ INFO ] Model outputs:
[ INFO ] *NO_NAME* (node: __module.refine_net.0.0/aten::_convolution/Add) : f32 / [...] / [1,2,512,512]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1292.95 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: Model0
[ INFO ] EXECUTION_DEVICES: ['CPU']
[ INFO ] PERFORMANCE_HINT: PerformanceMode.THROUGHPUT
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 6
[ INFO ] MULTI_DEVICE_PRIORITIES: CPU
[ INFO ] CPU:
[ INFO ] AFFINITY: Affinity.CORE
[ INFO ] CPU_DENORMALS_OPTIMIZATION: False
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ] ENABLE_CPU_PINNING: True
[ INFO ] ENABLE_HYPER_THREADING: True
[ INFO ] EXECUTION_DEVICES: ['CPU']
[ INFO ] EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ] INFERENCE_NUM_THREADS: 24
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ] KV_CACHE_PRECISION: <Type: 'float16'>
[ INFO ] LOG_LEVEL: Level.NO
[ INFO ] MODEL_DISTRIBUTION_POLICY: set()
[ INFO ] NETWORK_NAME: Model0
[ INFO ] NUM_STREAMS: 6
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 6
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] PERF_COUNT: NO
[ INFO ] SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ] MODEL_PRIORITY: Priority.MEDIUM
[ INFO ] LOADED_FROM_CACHE: False
[ INFO ] PERF_COUNT: False
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'x'!. This input will be filled with random values!
[ INFO ] Fill input 'x' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 6 inference requests, limits: 15000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 546.86 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count: 72 iterations
[ INFO ] Duration: 16123.94 ms
[ INFO ] Latency:
[ INFO ] Median: 1337.94 ms
[ INFO ] Average: 1322.61 ms
[ INFO ] Min: 901.39 ms
[ INFO ] Max: 1725.13 ms
[ INFO ] Throughput: 4.47 FPS
!benchmark_app -m $OV_INT8_COLORIZER_PATH -d $device.value -api async -shape "[1,3,512,512]" -t 15
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ] Device info:
[ INFO ] AUTO
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(AUTO) performance hint will be set to PerformanceMode.THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 69.28 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ] x (node: x) : f32 / [...] / [1,3,512,512]
[ INFO ] Model outputs:
[ INFO ] *NO_NAME* (node: __module.refine_net.0.0/aten::_convolution/Add) : f32 / [...] / [1,2,512,512]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[ INFO ] Reshaping model: 'x': [1,3,512,512]
[ INFO ] Reshape model took 0.04 ms
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ] x (node: x) : u8 / [N,C,H,W] / [1,3,512,512]
[ INFO ] Model outputs:
[ INFO ] *NO_NAME* (node: __module.refine_net.0.0/aten::_convolution/Add) : f32 / [...] / [1,2,512,512]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 2213.17 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: Model0
[ INFO ] EXECUTION_DEVICES: ['CPU']
[ INFO ] PERFORMANCE_HINT: PerformanceMode.THROUGHPUT
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 6
[ INFO ] MULTI_DEVICE_PRIORITIES: CPU
[ INFO ] CPU:
[ INFO ] AFFINITY: Affinity.CORE
[ INFO ] CPU_DENORMALS_OPTIMIZATION: False
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ] ENABLE_CPU_PINNING: True
[ INFO ] ENABLE_HYPER_THREADING: True
[ INFO ] EXECUTION_DEVICES: ['CPU']
[ INFO ] EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ] INFERENCE_NUM_THREADS: 24
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ] KV_CACHE_PRECISION: <Type: 'float16'>
[ INFO ] LOG_LEVEL: Level.NO
[ INFO ] MODEL_DISTRIBUTION_POLICY: set()
[ INFO ] NETWORK_NAME: Model0
[ INFO ] NUM_STREAMS: 6
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 6
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] PERF_COUNT: NO
[ INFO ] SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ] MODEL_PRIORITY: Priority.MEDIUM
[ INFO ] LOADED_FROM_CACHE: False
[ INFO ] PERF_COUNT: False
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'x'!. This input will be filled with random values!
[ INFO ] Fill input 'x' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 6 inference requests, limits: 15000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 275.90 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count: 156 iterations
[ INFO ] Duration: 15504.26 ms
[ INFO ] Latency:
[ INFO ] Median: 592.20 ms
[ INFO ] Average: 590.64 ms
[ INFO ] Min: 357.21 ms
[ INFO ] Max: 685.31 ms
[ INFO ] Throughput: 10.06 FPS
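To put the two reports side by side, the sketch below parses the statistics lines from the benchmark_app output above and computes the relative gain. The parse_report helper and its regular expressions are hypothetical and written only for the log format shown here. The numbers were measured on one specific machine with random inputs, so treat the ratios as indicative rather than general.

```python
import re

# Statistics lines copied from the two benchmark_app reports above.
FP16_REPORT = """
[ INFO ] Count: 72 iterations
[ INFO ] Duration: 16123.94 ms
[ INFO ] Median: 1337.94 ms
[ INFO ] Throughput: 4.47 FPS
"""

INT8_REPORT = """
[ INFO ] Count: 156 iterations
[ INFO ] Duration: 15504.26 ms
[ INFO ] Median: 592.20 ms
[ INFO ] Throughput: 10.06 FPS
"""


def parse_report(text):
    """Extract iteration count, duration, median latency, and throughput."""

    def grab(pattern):
        return float(re.search(pattern, text).group(1))

    return {
        "count": grab(r"Count:\s+(\d+) iterations"),
        "duration_ms": grab(r"Duration:\s+([\d.]+) ms"),
        "median_ms": grab(r"Median:\s+([\d.]+) ms"),
        "fps": grab(r"Throughput:\s+([\d.]+) FPS"),
    }


fp16 = parse_report(FP16_REPORT)
int8 = parse_report(INT8_REPORT)

# Throughput is the number of completed iterations divided by wall-clock duration.
for name, stats in (("FP16", fp16), ("INT8", int8)):
    recomputed = stats["count"] / (stats["duration_ms"] / 1000)
    print(f"{name}: reported {stats['fps']:.2f} FPS, recomputed {recomputed:.2f} FPS")

throughput_gain = int8["fps"] / fp16["fps"]
latency_gain = fp16["median_ms"] / int8["median_ms"]
print(f"Throughput gain: {throughput_gain:.2f}x, median latency gain: {latency_gain:.2f}x")
```

In this run, the INT8 model delivers a bit more than twice the throughput of the FP16 model, with a similar reduction in median latency.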
Interactive inference#
def generate(image, use_int8=True):
    image_in = cv2.imread(image)
    image_out = process(image_in, compiled_model if not use_int8 else compiled_int8_model)
    image_out_pil = PIL.Image.fromarray(cv2.cvtColor(image_out, cv2.COLOR_BGR2RGB))
    return image_out_pil
if not Path("").exists():
r = requests.get(url="")
open("", "w").write(r.text)
from gradio_helper import make_demo
demo = make_demo(fn=generate, quantized=compiled_int8_model is not None)
try:
    demo.queue().launch(debug=False)
except Exception:
    # Fall back to a public share link if the local launch fails
    demo.queue().launch(share=True, debug=False)
# if you are launching remotely, specify server_name and server_port
# demo.launch(server_name='your server name', server_port='server port in int')
# Read more in the docs:
Running on local URL: To create a public link, set share=True in launch().