GenAI
Deploy Generative AI with ease
OpenVINO™ GenAI gives developers the tools they need to optimize and deploy Generative AI models.
Text Generation API
ov_pipe = ov_genai.LLMPipeline("TinyLlama")
print(ov_pipe.generate("The Sun is yellow because"))
Image Generation API
ov_pipe = ov_genai.Text2ImagePipeline("Flux")
image = ov_pipe.generate("Create beautiful Sun")
Speech to Text API
ov_pipe = ov_genai.WhisperPipeline("whisper-base")
print(ov_pipe.generate(read_wav("sample.wav")))
Install OpenVINO™ GenAI
Unlock the power of OpenVINO™ GenAI for your projects.
Get started with seamless installation now!
Text generation API
An easy-to-use text generation API works with large language models (LLMs) to build chatbots, AI assistants such as financial helpers, and AI tools such as legal contract creators.

Possibilities
- Use different generation parameters (sampling types, etc.)
- Optimize for chat scenarios by using chat mode
- Load LoRA adapters and dynamically switch between them without recompilation
- Use a draft model to accelerate generation via speculative decoding
- Python
- C++
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    // Path to the model directory, passed as the first command-line argument
    std::string models_path = argv[1];
    ov::genai::LLMPipeline pipe(models_path, "CPU");
    std::cout << pipe.generate("What is OpenVINO?", ov::genai::max_new_tokens(100)) << '\n';
}
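Chat mode and generation parameters from the list above can be combined in a small helper. The sketch below is illustrative rather than an official sample: `run_chat` and `SAMPLING` are hypothetical names, and the helper only assumes that the pipeline object exposes `start_chat()`, `generate()`, and `finish_chat()`. The sampling values are placeholders, not recommendations.

```python
def run_chat(pipe, prompts, **generation_kwargs):
    """Send several prompts through one chat session and collect the replies."""
    replies = []
    pipe.start_chat()  # keep conversation history between generate() calls
    try:
        for prompt in prompts:
            replies.append(str(pipe.generate(prompt, **generation_kwargs)))
    finally:
        pipe.finish_chat()  # always clear the chat state, even on errors
    return replies


# Assumed sampling settings; tune them for your model.
SAMPLING = dict(max_new_tokens=100, do_sample=True, temperature=0.7, top_p=0.9)

# Usage with a real pipeline (model_path is a placeholder):
#   import openvino_genai as ov_genai
#   pipe = ov_genai.LLMPipeline(model_path, "CPU")
#   run_chat(pipe, ["What is OpenVINO?", "Which devices does it support?"], **SAMPLING)
```

Wrapping the session in `try`/`finally` means a failed generation does not leave the pipeline stuck in chat mode.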
Explore code samples Go to Documentation
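To give an intuition for the speculative decoding feature listed above, here is a toy sketch of its greedy variant. It is not how OpenVINO GenAI implements it: `speculative_generate`, `target_next`, and `draft_next` are hypothetical stand-ins, with plain functions playing the role of the large and the draft model. The draft guesses a few tokens ahead, the target checks them, and the first mismatch discards the remaining guesses.

```python
def speculative_generate(target_next, draft_next, prompt, max_new_tokens, lookahead=4):
    """Greedy speculative decoding over token lists (toy version).

    target_next and draft_next map a token list to the next token.
    The result always equals plain greedy decoding with target_next.
    """
    if lookahead < 1:
        raise ValueError("lookahead must be at least 1")
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # 1. The cheap draft model guesses a few tokens ahead.
        guess = []
        for _ in range(min(lookahead, limit - len(tokens))):
            guess.append(draft_next(tokens + guess))
        # 2. The target model verifies the guesses in order.
        #    A real pipeline would score all of them in one batched pass.
        for guessed in guess:
            expected = target_next(tokens)
            tokens.append(expected)  # the target's token is always the one kept
            if expected != guessed:
                break  # first mismatch: drop the remaining guesses and re-draft
    return tokens[len(prompt):]
```

The speed-up in practice comes from the batched verification step: when the draft is usually right, several tokens are accepted for the cost of one pass of the large model.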
Image generation API
A user-friendly image generation API can be used with generative models to improve creative tools and increase productivity. For instance, it can be utilized in furniture design tools to create various design concepts.

Possibilities
- Adjust parameters (width, height, number of inference steps) and compile the model for a static size
- Load LoRA adapters (in safetensors format) and dynamically switch between them
- Generate multiple images per request
- Python
- C++
import argparse
from PIL import Image
import openvino_genai

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('model_dir')
    parser.add_argument('prompt')
    args = parser.parse_args()

    device = 'CPU'  # GPU, NPU can be used as well
    pipe = openvino_genai.Text2ImagePipeline(args.model_dir, device)
    image_tensor = pipe.generate(
        args.prompt,
        width=512,
        height=512,
        num_inference_steps=20
    )

    # The result is a batch of images; save the first one
    image = Image.fromarray(image_tensor.data[0])
    image.save("image.bmp")

if __name__ == '__main__':
    main()
#include "openvino/genai/image_generation/text2image_pipeline.hpp"
#include "imwrite.hpp"

int main(int argc, char* argv[]) {
    const std::string models_path = argv[1], prompt = argv[2];
    const std::string device = "CPU";  // GPU, NPU can be used as well

    ov::genai::Text2ImagePipeline pipe(models_path, device);
    ov::Tensor image = pipe.generate(prompt,
        ov::genai::width(512),
        ov::genai::height(512),
        ov::genai::num_inference_steps(20));

    imwrite("image.bmp", image, true);
}
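When several images are generated per request, the result is a batch and each image needs its own file name. The helper below is a minimal sketch for that bookkeeping: `name_images` is a hypothetical name, and the commented usage assumes the batch size is set with a `num_images_per_prompt` argument, which should be checked against the API reference for your version.

```python
def name_images(batch, stem="image", ext="bmp"):
    """Pair each image in a batch with a zero-padded, numbered file name."""
    # Pad indices so that names sort correctly (image_00, image_01, ...)
    digits = len(str(max(len(batch) - 1, 0)))
    return [(f"{stem}_{i:0{digits}d}.{ext}", frame) for i, frame in enumerate(batch)]


# Usage with a real pipeline (assumed parameter name, see the note above):
#   image_tensor = pipe.generate(prompt, num_images_per_prompt=4)
#   for name, frame in name_images(image_tensor.data):
#       Image.fromarray(frame).save(name)
```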
Speech to text API
An intuitive speech-to-text API works with models like Whisper to enable use cases such as video transcription and enhanced communication tools.

Possibilities
- Translate transcription to English
- Predict timestamps
- Process long-form (>30 seconds) audio
- Python
- C++
import openvino_genai
import librosa

def read_wav(filepath):
    # Whisper expects 16 kHz audio, so resample while loading
    raw_speech, samplerate = librosa.load(filepath, sr=16000)
    return raw_speech.tolist()

device = "CPU"  # GPU can be used as well
pipe = openvino_genai.WhisperPipeline("whisper-base", device)
raw_speech = read_wav("sample.wav")
print(pipe.generate(raw_speech))
#include <filesystem>
#include <iostream>
#include <string>

#include "audio_utils.hpp"
#include "openvino/genai/whisper_pipeline.hpp"

int main(int argc, char* argv[]) {
    std::filesystem::path models_path = argv[1];
    std::string wav_file_path = argv[2];
    std::string device = "CPU";  // GPU can be used as well

    ov::genai::WhisperPipeline pipeline(models_path, device);
    ov::genai::RawSpeechInput raw_speech = utils::audio::read_wav(wav_file_path);
    std::cout << pipeline.generate(raw_speech, ov::genai::max_new_tokens(100)) << '\n';
}
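The 30-second threshold for long-form audio comes from Whisper's fixed input window. The sketch below shows, under simplifying assumptions, how a 16 kHz recording maps onto consecutive windows. `window_bounds` is a hypothetical helper, and the pipeline's own long-form handling may differ, for example by using predicted timestamps to decide where the next window starts.

```python
SAMPLE_RATE = 16_000   # matches sr=16000 used when loading the audio
WINDOW_SECONDS = 30    # Whisper's fixed input window


def window_bounds(num_samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Return (start_sec, end_sec) for each consecutive, non-overlapping window."""
    window = sample_rate * window_seconds
    bounds = []
    for start in range(0, num_samples, window):
        end = min(start + window, num_samples)  # the last window may be shorter
        bounds.append((start / sample_rate, end / sample_rate))
    return bounds


# A 70-second recording spans three windows: two full ones and a 10-second tail.
print(window_bounds(70 * SAMPLE_RATE))
```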
Image processing with Visual Language Models
An easy-to-use API for visual language models (VLMs) can power chatbots, AI assistants such as medical helpers, and AI tools such as legal contract creators.

Possibilities
- Use different generation parameters (sampling types, etc.)
- Optimize for chat scenarios by using chat mode
- Pass multiple images to a model
- Python
- C++
import numpy as np
import openvino as ov
import openvino_genai as ov_genai
from PIL import Image
# Choose GPU instead of CPU in the line below to run the model on Intel integrated or discrete GPU
pipe = ov_genai.VLMPipeline("./MiniCPM-V-2_6/", "CPU")
image = Image.open("dog.jpg").convert("RGB")  # force 3 channels so the reshape below is valid
image_data = np.array(image.getdata()).reshape(1, image.size[1], image.size[0], 3).astype(np.uint8)
image_data = ov.Tensor(image_data)
prompt = "Can you describe the image?"
print(pipe.generate(prompt, image=image_data, max_new_tokens=100))
#include "load_image.hpp"
#include <openvino/genai/visual_language/pipeline.hpp>
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    std::string models_path = argv[1];
    ov::genai::VLMPipeline pipe(models_path, "CPU");
    ov::Tensor rgb = utils::load_image(argv[2]);
    // Same prompt as in the Python example
    const std::string prompt = "Can you describe the image?";
    std::cout << pipe.generate(
        prompt,
        ov::genai::image(rgb),
        ov::genai::max_new_tokens(100)
    ) << '\n';
}
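The Python example reshapes the pixel data into a `(1, height, width, 3)` layout before wrapping it in a tensor. The pure-Python sketch below spells out what that reshape does to a flat, row-major pixel list such as the one returned by `image.getdata()`. `to_nhwc` is a hypothetical helper for illustration only; in practice the NumPy reshape in the example is the practical choice.

```python
def to_nhwc(pixels, width, height):
    """Arrange a flat, row-major list of (R, G, B) pixels as [1][height][width][3]."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    rows = [
        [list(pixels[row * width + col]) for col in range(width)]
        for row in range(height)
    ]
    return [rows]  # leading batch dimension of size 1


# A 2x2 image: the pixel at row 1, column 0 is the third entry of the flat list.
flat = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)]
batch = to_nhwc(flat, width=2, height=2)
print(batch[0][1][0])
```

Note that `image.size` is `(width, height)`, which is why the example passes `image.size[1]` before `image.size[0]`.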