OpenVINO GenAI

Run Generative AI with ease

OpenVINO™ GenAI provides optimized pipelines for running generative AI models with maximum performance and minimal dependencies

Features and Benefits

🚀

Optimized Performance

Built for speed with hardware-specific optimizations for Intel CPUs, GPUs, and NPUs. Advanced techniques such as speculative decoding and KV-cache optimization maximize inference performance.

👨‍💻

Developer-Friendly APIs

Simple, intuitive APIs in both Python and C++ that hide complexity while providing full control. Get started with just a few lines of code, then customize with advanced features as needed.

📦

Production-Ready Pipelines

Pre-built pipelines for text generation, image creation, speech recognition, speech generation, and visual language processing. No need to build inference loops or handle tokenization - everything works out of the box.

🎨

Extensive Model Support

Compatible with popular models such as Llama, Mistral, Phi, Qwen, Stable Diffusion, Flux, and Whisper. Models convert easily from Hugging Face and ModelScope.

Lightweight & Efficient

Minimal dependencies and smaller disk footprint compared to heavyweight frameworks. Perfect for edge deployment, containers, and resource-constrained environments.

🖥️

Cross-Platform Compatibility

Run the same code on Linux, Windows, and macOS. Deploy across different hardware configurations without code changes - from laptops to data center servers.

Use Cases

Text Generation Using LLMs

Create chatbots, text summarization, content generation, and question-answering applications with state-of-the-art Large Language Models (LLMs).

Capabilities:
  • Control output with different generation parameters (sampling, temperature, etc.)
  • Optimize for conversational scenarios by using chat mode
  • Apply LoRA adapters and dynamically switch between them without recompilation
  • Accelerate generation using draft models via Speculative Decoding

import openvino_genai as ov_genai

# model_path: directory containing a model converted to OpenVINO format
pipe = ov_genai.LLMPipeline(model_path, "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
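
The generation-parameter and chat-mode capabilities listed above can be combined in a few lines. The following is a minimal sketch rather than an official sample: `chat_with_sampling` is a hypothetical helper name, the sampling values are arbitrary, and it assumes the pipeline exposes `get_generation_config()`, `start_chat()`, and `finish_chat()` as in recent OpenVINO GenAI releases.

```python
def chat_with_sampling(pipe, prompts, temperature=0.7, top_p=0.9, max_new_tokens=100):
    """Answer each prompt in one chat session using multinomial sampling.

    `pipe` is assumed to be an openvino_genai.LLMPipeline (or any object
    with the same methods); the helper name and defaults are illustrative.
    """
    config = pipe.get_generation_config()
    config.do_sample = True            # sample instead of greedy decoding
    config.temperature = temperature
    config.top_p = top_p
    config.max_new_tokens = max_new_tokens

    replies = []
    pipe.start_chat()                  # keep conversation history between turns
    try:
        for prompt in prompts:
            replies.append(str(pipe.generate(prompt, config)))
    finally:
        pipe.finish_chat()             # always clear the chat state
    return replies
```

With a real model, build the pipeline with `ov_genai.LLMPipeline(model_path, "CPU")` and call `chat_with_sampling(pipe, ["What is OpenVINO?", "Which devices does it support?"])`.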

Image Generation Using Diffusers

Create and modify images with diffusion models for art generation, product design, and creative applications using Stable Diffusion and similar architectures.

Capabilities:
  • Support for text-to-image, image-to-image, and inpainting pipelines
  • Control image generation by adjusting parameters (dimensions, iterations, etc.)
  • Apply LoRA adapters and dynamically switch between them for artistic styles and modifications
  • Generate multiple images per request
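
import openvino_genai as ov_genai
from PIL import Image

# model_path: directory containing the converted model; prompt: text description of the image
pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt)

# The result is a batch of images; save the first one
image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")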
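
Adjusting image dimensions, iteration count, and the number of images per request might look like the sketch below. The helper names `generate_batch` and `save_batch` and the default values are illustrative assumptions; the keyword arguments (`width`, `height`, `num_inference_steps`, `num_images_per_prompt`) are assumed to match the `Text2ImagePipeline.generate` parameters in recent releases.

```python
def generate_batch(pipe, prompt, count=2, width=512, height=512, steps=20):
    """Return `count` images for one prompt as a sequence of HxWx3 arrays.

    `pipe` is assumed to be an openvino_genai.Text2ImagePipeline.
    """
    image_tensor = pipe.generate(
        prompt,
        width=width,
        height=height,
        num_inference_steps=steps,       # number of denoising iterations
        num_images_per_prompt=count,     # batch size for this request
    )
    return image_tensor.data


def save_batch(images, prefix="image"):
    """Write each image to `<prefix>_<index>.bmp` and return the file names."""
    from PIL import Image  # imported lazily so generate_batch works without Pillow

    names = []
    for index, pixels in enumerate(images):
        name = f"{prefix}_{index}.bmp"
        Image.fromarray(pixels).save(name)
        names.append(name)
    return names
```

For example, `save_batch(generate_batch(pipe, prompt, count=4))` would write `image_0.bmp` through `image_3.bmp`.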

Speech Recognition Using Whisper

Convert speech to text using Whisper models for video transcription, meeting notes, multilingual audio content processing, and accessibility applications.

Capabilities:
  • Translate foreign language speech directly to English text
  • Transcribe audio in multiple languages with automatic language detection
  • Generate precise timestamps for synchronized subtitles and captions
  • Process long-form audio content (>30 seconds) efficiently

import openvino_genai as ov_genai
import librosa

def read_wav(filepath):
    # Whisper expects 16 kHz audio, so resample while loading
    raw_speech, samplerate = librosa.load(filepath, sr=16000)
    return raw_speech.tolist()

raw_speech = read_wav('sample.wav')

# model_path: directory containing the converted Whisper model
pipe = ov_genai.WhisperPipeline(model_path, "CPU")
result = pipe.generate(raw_speech, max_new_tokens=100)
print(result)
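
The timestamp and translation capabilities can feed a subtitle file. This is a hedged sketch: `transcribe_chunks`, `srt_timestamp`, and `chunks_to_srt` are hypothetical helpers, and it assumes that `generate(..., return_timestamps=True)` returns a result whose `chunks` carry `start_ts`, `end_ts` (in seconds), and `text`, and that `task="translate"` requests English output from a multilingual model.

```python
def transcribe_chunks(pipe, raw_speech, translate=False):
    """Return timestamped chunks from an assumed openvino_genai.WhisperPipeline."""
    options = {"return_timestamps": True}
    if translate:
        options["task"] = "translate"   # translate speech to English text
    return pipe.generate(raw_speech, **options).chunks


def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used by SubRip files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks):
    """Render chunks as numbered SubRip entries separated by blank lines."""
    entries = []
    for index, chunk in enumerate(chunks, start=1):
        span = f"{srt_timestamp(chunk.start_ts)} --> {srt_timestamp(chunk.end_ts)}"
        entries.append(f"{index}\n{span}\n{chunk.text.strip()}\n")
    return "\n".join(entries)
```

Writing `chunks_to_srt(transcribe_chunks(pipe, raw_speech))` to a `.srt` file yields captions that most video players can load alongside the audio.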

Image Processing Using VLMs

Analyze and describe images with Vision Language Models (VLMs) to build AI assistants and tools for legal document review, medical analysis, document processing, and visual content understanding applications.

Capabilities:
  • Process single or multiple images in a single prompt with detailed text descriptions
  • Optimize for conversational scenarios by using chat mode
  • Control output with different generation parameters (sampling, temperature, etc.)

import openvino_genai as ov_genai
import openvino as ov
from PIL import Image
import numpy as np
from pathlib import Path

def read_image(path: str) -> ov.Tensor:
    # Load as RGB and add a leading batch dimension
    pic = Image.open(path).convert("RGB")
    image_data = np.array(pic)[None]
    return ov.Tensor(image_data)

def read_images(path: str) -> list[ov.Tensor]:
    # Accept either a single image file or a directory of images
    entry = Path(path)
    if entry.is_dir():
        return [read_image(str(file)) for file in sorted(entry.iterdir())]
    return [read_image(path)]

images = read_images("./images")

# model_path: directory containing the converted VLM; prompt: question about the images
pipe = ov_genai.VLMPipeline(model_path, "CPU")
result = pipe.generate(prompt, images=images, max_new_tokens=100)
print(result.texts[0])
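
Chat mode lets follow-up questions refer back to images supplied earlier in the conversation. The sketch below assumes `VLMPipeline` offers `start_chat()` and `finish_chat()` and that each result exposes `texts`; `describe_then_ask` is a hypothetical helper name.

```python
def describe_then_ask(pipe, images, first_prompt, follow_ups, max_new_tokens=100):
    """Send images with the first prompt, then ask text-only follow-ups.

    `pipe` is assumed to be an openvino_genai.VLMPipeline; `images` is a
    list of ov.Tensor objects such as those returned by read_images().
    """
    answers = []
    pipe.start_chat()                  # history (including images) is kept per session
    try:
        first = pipe.generate(first_prompt, images=images, max_new_tokens=max_new_tokens)
        answers.append(first.texts[0])
        for question in follow_ups:
            reply = pipe.generate(question, max_new_tokens=max_new_tokens)
            answers.append(reply.texts[0])
    finally:
        pipe.finish_chat()             # always clear the chat state
    return answers
```

A call such as `describe_then_ask(pipe, images, "Describe these images.", ["What colors dominate?"])` returns one answer per prompt, in order.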
Looking for more? See all supported use cases.

Install OpenVINO™ GenAI

Unlock the power of OpenVINO™ GenAI for your projects.
Get started with seamless installation now!

Quick Installation from PyPI

python -m pip install openvino-genai
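
To confirm that the package landed in the active environment, one option is to query its installed version using only the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, and the distribution name is taken from the install command above.

```python
from importlib import metadata


def installed_version(package="openvino-genai"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version() or "openvino-genai is not installed")
```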

Operating Systems

Linux
Windows
macOS
Need more details?
Refer to the Getting Started Guide to learn more about OpenVINO GenAI.