Whisper Automatic Speech Recognition C Sample

Download OpenVINO GenAI
Build Samples
Download and Convert the Model
Prepare Audio File
Sample Description
Troubleshooting
Support and Contribution

Download OpenVINO GenAI

Download and extract OpenVINO GenAI Archive Visit the OpenVINO Download Page.

Build Samples

Set up the environment and build the samples Linux and macOS:

source <INSTALL_DIR>/setupvars.sh
./<INSTALL_DIR>/samples/c/build_samples.sh

Windows Command Prompt:

<INSTALL_DIR>\setupvars.bat
<INSTALL_DIR>\samples\c\build_samples_msvc.bat

Windows PowerShell:

.<INSTALL_DIR>\setupvars.ps1
.<INSTALL_DIR>\samples\c\build_samples.ps1

Download and Convert the Model

The --upgrade-strategy eager option is needed to ensure optimum-intel is upgraded to the latest version.

Install ../../export-requirements.txt if model conversion is required.

pip install --upgrade-strategy eager -r ../../export-requirements.txt
optimum-cli export openvino --trust-remote-code --model openai/whisper-tiny whisper-tiny

If a converted model in OpenVINO IR format is available in the OpenVINO optimized models collection on Hugging Face, you can download it directly via huggingface-cli.

For example:

pip install huggingface-hub
huggingface-cli download OpenVINO/whisper-tiny-int8-ov --local-dir whisper-tiny-int8-ov

Prepare audio file

Prepare audio file in wav format with sampling rate 16k Hz.

You can download example audio file: https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/librispeech_s5/how_are_you_doing_today.wav

Sample Description

This example showcases inference of speech recognition Whisper Models using the OpenVINO GenAI C API. The sample features ov_genai_whisper_pipeline and uses audio files in WAV format as input.

Run Command

./whisper_speech_recognition_c <MODEL_DIR> "<WAV_FILE_PATH>" [DEVICE]

Parameters

MODEL_DIR: Path to the converted Whisper model directory
WAV_FILE_PATH: Path to the WAV audio file (use quotes if path contains spaces)
DEVICE: Optional - device to run inference on (default: "CPU")

Example Usage

./whisper_speech_recognition_c whisper-tiny how_are_you_doing_today.wav

Expected Output

 How are you doing today?
timestamps: [0.00, 2.00] text:  How are you doing today?

The sample will:

Load the WAV audio file and validate its format
Automatically resample to 16kHz if needed
Perform speech-to-text transcription
Output the full transcription
Display word-level timestamps for each text chunk

Troubleshooting

Empty or Incorrect Output

If you get empty or incorrect transcription results:

Ensure your audio file is in WAV format
Check that the audio contains clear speech

Model Loading Errors

If the model fails to load:

Verify the model path exists and contains valid Whisper model files
Ensure the model was properly converted to OpenVINO IR format
Check that the specified device (CPU, GPU, etc.) is available on your system

Audio File Errors

The sample provides detailed error messages for common audio file issues:

File not found
Permission denied
Invalid WAV format
Unsupported audio encoding (only PCM is supported)
Multi-channel audio (only mono is supported)

Support and Contribution

For troubleshooting, consult the OpenVINO documentation.
To report issues or contribute, visit the GitHub repository.

Table of Contents​

Download OpenVINO GenAI​

Build Samples​

Download and Convert the Model​

Prepare audio file​

Sample Description​

Run Command​

Parameters​

Example Usage​

Expected Output​

Troubleshooting​

Empty or Incorrect Output​

Model Loading Errors​

Audio File Errors​

Support and Contribution​

Table of Contents