OpenVINO™ Integrations

OpenVINO has been adopted by multiple AI projects in various areas. For an extensive list of community-based projects involving OpenVINO, see the Awesome OpenVINO repository.

Hugging Face Optimum-Intel


Load and run models with OpenVINO acceleration directly through the Hugging Face API. The Hugging Face hub also hosts pre-optimized OpenVINO IR models, so you can use them in your projects without any additional conversion.
Benefits:
- Minimize complex coding for Generative AI.
Check example code
-from transformers import AutoModelForCausalLM
+from optimum.intel.openvino import OVModelForCausalLM

from transformers import AutoTokenizer, pipeline
model_id = "togethercomputer/RedPajama-INCITE-Chat-3B-v1"

-model = AutoModelForCausalLM.from_pretrained(model_id)
+model = OVModelForCausalLM.from_pretrained(model_id, export=True)
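The diff above is the only change needed to switch from stock transformers to Optimum-Intel. A self-contained sketch of the resulting script (the prompt and generation length are illustrative):

from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "togethercomputer/RedPajama-INCITE-Chat-3B-v1"

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The OpenVINO model drops into the standard transformers pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("What is OpenVINO?", max_new_tokens=50)[0]["generated_text"])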

OpenVINO Execution Provider for ONNX Runtime


Utilize OpenVINO as a backend with your existing ONNX Runtime code.
Benefits:
- Enhanced inference performance on Intel hardware with minimal code modifications.
Check example code
device = "CPU_FP32"
# Set OpenVINO as the Execution Provider to infer this model
sess.set_providers(["OpenVINOExecutionProvider"], [{"device_type": device}])
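A more complete sketch of the same idea, assuming an existing ONNX model (the file name and input shape below are placeholders):

import numpy as np
import onnxruntime as ort

# Select the OpenVINO Execution Provider when creating the session;
# device_type chooses the Intel device and precision OpenVINO targets.
sess = ort.InferenceSession(
    "model.onnx",
    providers=["OpenVINOExecutionProvider"],
    provider_options=[{"device_type": "CPU_FP32"}],
)

# Inference itself is unchanged ONNX Runtime code
input_name = sess.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
outputs = sess.run(None, {input_name: dummy_input})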

Torch.compile with OpenVINO


Use OpenVINO for Python-native applications by JIT-compiling code into optimized kernels.
Benefits:
- Enhanced inference performance on Intel hardware with minimal code modifications.
Check example code
import openvino.torch

...
model = torch.compile(model, backend='openvino')
...
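A self-contained sketch of the same pattern, using a torchvision model purely as an example:

import torch
import torchvision

import openvino.torch  # registers the "openvino" backend for torch.compile

model = torchvision.models.resnet18(weights=None).eval()

# The model is JIT-compiled into optimized OpenVINO kernels on the first call
compiled_model = torch.compile(model, backend="openvino")

with torch.no_grad():
    output = compiled_model(torch.randn(1, 3, 224, 224))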

OpenVINO LLMs with LlamaIndex


Build context-augmented GenAI applications with the LlamaIndex framework and enhance runtime performance with OpenVINO.
Benefits:
- Minimize complex coding for Generative AI.
Check example code
from llama_index.llms.openvino import OpenVINOLLM

ov_config = {
    "PERFORMANCE_HINT": "LATENCY",
    "NUM_STREAMS": "1",
    "CACHE_DIR": "",
}

ov_llm = OpenVINOLLM(
    model_id_or_path="HuggingFaceH4/zephyr-7b-beta",
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={"ov_config": ov_config},
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    device_map="cpu",
)
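Once constructed (messages_to_prompt and completion_to_prompt above are user-supplied formatting helpers), the model is used like any other LlamaIndex LLM; the prompt here is illustrative:

response = ov_llm.complete("What do llamas eat?")
print(response)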

OpenVINO Backend for ExecuTorch


Export and run AI models using OpenVINO with ExecuTorch to optimize performance on Intel hardware.
Benefits:
- Accelerate inference, reduce latency, and simplify deployment for efficient AI applications.
Check example code
python aot_optimize_and_infer.py --export --suite timm --model vgg16 --input_shape "[1, 3, 224, 224]" --device CPU

OpenVINO Integration for LangChain


Integrate OpenVINO with the LangChain framework to enhance runtime performance for GenAI applications.
Benefits:
- Streamline the integration and chaining of language models for efficient AI workflows.
Check example code
from langchain_huggingface import HuggingFacePipeline

ov_llm = HuggingFacePipeline.from_model_id(
    model_id="ov_model_dir",
    task="text-generation",
    backend="openvino",
    model_kwargs={"device": "CPU", "ov_config": ov_config},
    pipeline_kwargs={"max_new_tokens": 10},
)

chain = prompt | ov_llm

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))
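The prompt object in the chain above is a regular LangChain prompt template; a minimal sketch of one that fits the example question:

from langchain_core.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)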

Intel® Geti™


Build computer vision models faster with less data using Intel® Geti™. It streamlines labeling, training, and deployment, exporting models optimized for OpenVINO.
Benefits:
- Train with less data and deploy models faster.
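Models exported from Geti are standard OpenVINO IR files, so they load with the regular OpenVINO runtime API; in this minimal sketch the file name and input shape are placeholders:

import numpy as np
import openvino as ov

# "exported_model.xml" stands in for the IR file exported from Geti
core = ov.Core()
compiled_model = core.compile_model("exported_model.xml", device_name="CPU")

# Run inference on a preprocessed image (placeholder shape shown here)
dummy_image = np.zeros((1, 3, 224, 224), dtype=np.float32)
result = compiled_model(dummy_image)[compiled_model.output(0)]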

AI Playground™


Use Intel® OpenVINO™ in AI Playground to optimize and run AI models efficiently on Intel CPUs and Arc GPUs, enabling local image generation, editing, and video processing. It supports OpenVINO-optimized models such as TinyLlama, Mistral 7B, and Phi-3 mini, with no conversion needed.
Benefits:
- Easily set up pre-optimized models.
- Run faster, hardware-accelerated inference with OpenVINO.

Intel® AI Assistant Builder


Run local AI assistants with Intel® AI Assistant Builder using OpenVINO-optimized models like Phi-3 and Qwen2.5. Build secure, efficient assistants customized for your data and workflows.
Benefits:
- Build custom assistants with agentic workflows and knowledge bases.
- Keep data secure by running fully local.