Generative AI workflow
Generative AI is a class of deep learning models that produce new, “original” data from input such as images, sound, or natural language text. Because of their size and complexity, generative AI pipelines are more difficult to deploy and run efficiently than typical deep learning models. OpenVINO™ simplifies the process and provides high-performance integrations through the following options:
Install the OpenVINO GenAI package to run generative models out of the box. With its own API, tokenizers, and other components, it handles essential tasks such as tokenization, the text generation loop, and scheduling, offering both ease of use and high performance.
Use Optimum Intel to experiment with different models and scenarios through a simple interface to the popular Hugging Face API and infrastructure. It also supports weight compression with the Neural Network Compression Framework (NNCF), as well as on-the-fly model conversion. When integrated into a final product, however, it may offer lower performance.
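The two options above can be sketched side by side. This is a minimal illustration, not a reference implementation: the function names, the default device, and the token limit are hypothetical choices made for this example. It assumes the `openvino-genai` and `optimum[openvino]` packages are installed, and that the directory passed to the first function holds a model already exported to OpenVINO format together with its tokenizer (for example with `optimum-cli export openvino`).

```python
def run_with_genai(model_dir: str, prompt: str, device: str = "CPU",
                   max_new_tokens: int = 100) -> str:
    """Option 1 (sketch): OpenVINO GenAI on an already exported model."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    # Lazy import so this module loads even when the package is missing.
    import openvino_genai as ov_genai

    # LLMPipeline loads the model and its tokenizer from model_dir and runs
    # tokenization, the generation loop, and detokenization internally.
    pipe = ov_genai.LLMPipeline(model_dir, device)
    return str(pipe.generate(prompt, max_new_tokens=max_new_tokens))


def run_with_optimum(model_id: str, prompt: str, max_new_tokens: int = 100,
                     compress_to_8bit: bool = True) -> str:
    """Option 2 (sketch): Optimum Intel with on-the-fly conversion."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    # Lazy imports so this module loads even when the packages are missing.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the Hugging Face checkpoint to OpenVINO on load;
    # load_in_8bit=True requests 8-bit weight compression, which relies on NNCF.
    model = OVModelForCausalLM.from_pretrained(
        model_id, export=True, load_in_8bit=compress_to_8bit
    )
    # Here the caller handles tokenization and decoding explicitly,
    # following the usual Hugging Face generate() pattern.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `run_with_genai("my-model-ov", "The Sun is yellow because")` would use a local exported model directory, while `run_with_optimum` takes a Hugging Face model ID; both arguments here are placeholders. The first function leaves tokenization and the generation loop to the pipeline, whereas the second keeps them in user code, which is convenient for experimentation.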
The advantages of using OpenVINO for generative model deployment:
Proceed to guides on: