# Retrieval Augmented Generation Sample
This sample demonstrates inference of text embedding models. The application has limited configuration options to encourage the reader to explore and modify the source code; for example, change the inference device to GPU. The sample features `ov::genai::TextEmbeddingPipeline` and uses text as its input source.
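
As a minimal sketch of such a modification, the device argument of the pipeline constructor (shown in the usage section below) could be set to `GPU`, assuming a GPU plugin is available in your OpenVINO installation:

```c++
// Hypothetical one-line change: pass "GPU" instead of the default device
// string; models_path and config are as in the usage snippet below.
ov::genai::TextEmbeddingPipeline pipeline(models_path, "GPU", config);
```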
## Download and Convert the Model and Tokenizers
The `--upgrade-strategy eager` option is needed to ensure `optimum-intel` is upgraded to the latest version.

Install `../../export-requirements.txt` to convert a model.
```sh
pip install --upgrade-strategy eager -r ../../export-requirements.txt
optimum-cli export openvino --trust-remote-code --model BAAI/bge-small-en-v1.5 BAAI/bge-small-en-v1.5
```
## Run
Follow Get Started with Samples to run the sample.
```sh
text_embeddings BAAI/bge-small-en-v1.5 "Document 1" "Document 2"
```
See `SUPPORTED_MODELS.md` for the list of supported models.
## Text Embedding Pipeline Usage
#include "openvino/genai/rag/text_embedding_pipeline.hpp"
ov::genai::TextEmbeddingPipeline pipeline(models_path, device, config);
std::vector<ov::genai::EmbeddingResult> embeddings = pipeline.embed_documents(documents);