Text-to-Image JavaScript Generation Pipeline
This example showcases inference of text-to-image diffusion models such as Stable Diffusion, FLUX, and LCM. The application deliberately has few configuration options, to encourage the reader to explore and modify the source code; for example, change the device for inference to GPU. The sample features `Text2ImagePipeline` from `openvino-genai-node` and uses a text prompt as the input source.
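For orientation, here is a minimal sketch of the pipeline's lifecycle. The async factory call is an assumption modeled on the package's `LLMPipeline` style; `text2image.js` is the authoritative reference:

```js
import { Text2ImagePipeline } from 'openvino-genai-node';

// Assumed factory signature (model directory + OpenVINO device name);
// check text2image.js for the actual construction code.
const pipeline = await Text2ImagePipeline('./dreamlike_anime_1_0_ov/FP16', 'CPU');

// The generate call mirrors the one shown later in this README.
const imageTensor = await pipeline.generate('a house on a hill at sunrise', {
  width: 512,
  height: 512,
});
```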
Sample files:
- `text2image.js` demonstrates basic usage of the text-to-image pipeline with a step callback and saves the result as a BMP file using `bmp-js`
Users can modify the sample code and experiment with the following generation parameters (a combined sketch follows the list):
- Change the width or height of the generated image
- Generate multiple images per prompt (`num_images_per_prompt`)
- Adjust the number of inference steps (`num_inference_steps`)
- (SD 1.x, 2.x; SD3, SDXL) Add a negative prompt when guidance scale > 1
- (SDXL, SD3, FLUX) Specify additional positive prompts such as `prompt_2`
- Add a per-step callback to monitor progress or stop generation early
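As a sketch of how several of these parameters combine in one call (only `width`, `height`, `num_inference_steps`, and the callback appear verbatim in this README; `num_images_per_prompt`, `guidance_scale`, and `negative_prompt` are assumed to use the same snake_case keys as the generation config):

```js
// Hedged sketch: keys marked "assumed" are not shown in the shipped sample
// and should be checked against the pipeline's documentation.
const result = await pipeline.generate(prompt, {
  width: 768,
  height: 512,
  num_inference_steps: 25,
  num_images_per_prompt: 2,                // assumed key: two images per prompt
  guidance_scale: 7.5,                     // assumed key: >1 enables negative_prompt
  negative_prompt: 'low quality, blurry',  // assumed key (SD 1.x/2.x, SD3, SDXL)
});
```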
Download and convert the model
The `--upgrade-strategy eager` option is needed to ensure `optimum-intel` is upgraded to the latest version.
It's not required to install `../../export-requirements.txt` for deployment if the model has already been exported.
```sh
pip install --upgrade-strategy eager -r ../../export-requirements.txt
```
Then, run the export with Optimum CLI:
```sh
optimum-cli export openvino --model dreamlike-art/dreamlike-anime-1.0 --task stable-diffusion --weight-format fp16 dreamlike_anime_1_0_ov/FP16
```
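The same export flow applies to the other model families mentioned above. For instance, an LCM checkpoint used in other OpenVINO GenAI examples could be exported like this (the model ID and output directory are illustrative; pass `--task` explicitly if auto-detection fails):

```sh
optimum-cli export openvino --model SimianLuo/LCM_Dreamshaper_v7 --weight-format fp16 lcm_dreamshaper_v7_ov/FP16
```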
Run
From the `samples/js` directory, install dependencies (if not already done):

```sh
npm install
```
If you use the master branch, you may need to build `openvino-genai-node` from source first.
Run the sample:

```sh
node image_generation/text2image.js dreamlike_anime_1_0_ov/FP16 "cyberpunk cityscape like Tokyo New York with tall buildings at dusk golden hour cinematic lighting"
```

The result is saved as `image.bmp` in the current directory.
Optional: change device
The device is hardcoded to CPU in the sample. To use GPU, edit `text2image.js` and change the device string:

```js
const device = "CPU"; // GPU can be used as well
```
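For example, to target the OpenVINO GPU plugin:

```js
const device = "GPU"; // any available OpenVINO device name works here
```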
Examples
Prompt: `cyberpunk cityscape like Tokyo New York with tall buildings at dusk golden hour cinematic lighting`

Refer to the Supported Models page for the full list of models that work with this pipeline.
Run with a step callback
The sample registers a callback that prints progress to stdout on each denoising step. The callback can also stop generation early by returning true:
```js
function callback(step, numSteps) {
  process.stdout.write(`Step ${step + 1}/${numSteps}\r`);
  return false; // return true to stop generation early
}

const imageTensor = await pipeline.generate(prompt, {
  width: 512,
  height: 512,
  num_inference_steps: 20,
  callback,
});
```
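The sample then encodes the result with `bmp-js`, as noted under Sample files. A hedged sketch of that final step, assuming the tensor exposes its pixels as a flat RGB `Uint8Array` through a `data` property (check `text2image.js` for the actual accessor):

```js
import fs from 'node:fs';
import bmp from 'bmp-js';

// Assumption: imageTensor.data is a flat RGB byte array of width*height*3.
// bmp-js expects ABGR byte order, so repack the channels before encoding.
const width = 512, height = 512;
const rgb = imageTensor.data;
const abgr = Buffer.alloc(width * height * 4);
for (let i = 0; i < width * height; i++) {
  abgr[i * 4 + 0] = 0xff;            // A
  abgr[i * 4 + 1] = rgb[i * 3 + 2];  // B
  abgr[i * 4 + 2] = rgb[i * 3 + 1];  // G
  abgr[i * 4 + 3] = rgb[i * 3 + 0];  // R
}
const encoded = bmp.encode({ data: abgr, width, height });
fs.writeFileSync('image.bmp', encoded.data);
```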