Supported Models

info

Other models with similar architectures may also work successfully even if not explicitly validated. Consider testing any unlisted models to verify compatibility with your specific use case.

Large Language Models (LLMs)

Architecture	Models	Example HuggingFace Models
`AquilaModel`	Aquila	BAAI/Aquila-7B BAAI/AquilaChat-7B BAAI/Aquila2-7B BAAI/AquilaChat2-7B
`ArcticForCausalLM`	Snowflake	Snowflake/snowflake-arctic-instruct Snowflake/snowflake-arctic-base
`BaichuanForCausalLM`	Baichuan2	baichuan-inc/Baichuan2-7B-Chat baichuan-inc/Baichuan2-13B-Chat
`BloomForCausalLM`	Bloom	bigscience/bloom-560m bigscience/bloom-1b1 bigscience/bloom-1b7 bigscience/bloom-3b bigscience/bloom-7b1
`BloomForCausalLM`	Bloomz	bigscience/bloomz-560m bigscience/bloomz-1b1 bigscience/bloomz-1b7 bigscience/bloomz-3b bigscience/bloomz-7b1
`ChatGLMModel`	ChatGLM	THUDM/chatglm2-6b THUDM/chatglm3-6b THUDM/glm-4-9b THUDM/glm-4-9b-chat
`CodeGenForCausalLM`	CodeGen	Salesforce/codegen-350m-multi Salesforce/codegen-2B-multi Salesforce/codegen-6B-multi Salesforce/codegen-16B-multi Salesforce/codegen-350m-mono Salesforce/codegen-2B-mono Salesforce/codegen-6B-mono Salesforce/codegen-16B-mono Salesforce/codegen2-1B_P Salesforce/codegen2-3_7B_P Salesforce/codegen2-7B_P Salesforce/codegen2-16B_P
`CohereForCausalLM`	Aya	CohereLabs/aya-23-8B CohereLabs/aya-expanse-8b CohereLabs/aya-23-35B
`CohereForCausalLM`	C4AI Command R	CohereLabs/c4ai-command-r7b-12-2024 CohereLabs/c4ai-command-r-v01
`DbrxForCausalLM`	DBRX	databricks/dbrx-instruct databricks/dbrx-base
`DeciLMForCausalLM`	DeciLM	Deci/DeciLM-7B Deci/DeciLM-7B-instruct
`DeepseekForCausalLM`	DeepSeek-MoE	deepseek-ai/deepseek-moe-16b-base deepseek-ai/deepseek-moe-16b-chat
`DeepseekV2ForCausalLM`	DeepSeekV2	deepseek-ai/DeepSeek-V2-Lite deepseek-ai/DeepSeek-V2-Lite-Chat deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
`DeepseekV3ForCausalLM`	DeepSeekV3	deepseek-ai/DeepSeek-V3 deepseek-ai/DeepSeek-V3-Base deepseek-ai/DeepSeek-R1 deepseek-ai/DeepSeek-R1-Zero
`ExaoneForCausalLM`	Exaone	LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct LGAI-EXAONE/EXAONE-3.5-32B-Instruct LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
`FalconForCausalLM`	Falcon	tiiuae/falcon-11B tiiuae/falcon-7b tiiuae/falcon-7b-instruct tiiuae/falcon-40b tiiuae/falcon-40b-instruct
`GemmaForCausalLM`	Gemma	google/gemma-2b google/gemma-2b-it google/gemma-1.1-2b-it google/codegemma-2b google/codegemma-1.1-2b google/gemma-7b google/gemma-7b-it google/gemma-1.1-7b-it google/codegemma-7b google/codegemma-7b-it google/codegemma-1.1-7b-it
`Gemma2ForCausalLM`	Gemma2	google/gemma-2-2b google/gemma-2-2b-it google/gemma-2-9b google/gemma-2-9b-it google/gemma-2-27b google/gemma-2-27b-it
`Gemma3ForCausalLM`	Gemma3	google/gemma-3-1b-it google/gemma-3-1b-pt
`GlmForCausalLM`	GLM	THUDM/glm-edge-1.5b-chat THUDM/glm-edge-4b-chat THUDM/glm-4-9b-hf THUDM/glm-4-9b-chat-hf THUDM/glm-4-9b-chat-1m-hf
`GPT2LMHeadModel`	GPT2	openai-community/gpt2 openai-community/gpt2-medium openai-community/gpt2-large openai-community/gpt2-xl distilbert/distilgpt2
`GPT2LMHeadModel`	CodeParrot	codeparrot/codeparrot-small codeparrot/codeparrot-small-code-to-text codeparrot/codeparrot-small-text-to-code codeparrot/codeparrot-small-multi codeparrot/codeparrot
`GPTBigCodeForCausalLM`	StarCoder	bigcode/starcoderbase-1b bigcode/starcoderbase-3b bigcode/starcoderbase-7b bigcode/starcoderbase bigcode/starcoder bigcode/octocoder HuggingFaceH4/starchat-alpha HuggingFaceH4/starchat-beta
`GPTJForCausalLM`	GPT-J	EleutherAI/gpt-j-6b crumb/Instruct-GPT-J
`GPTNeoForCausalLM`	GPT Neo	EleutherAI/gpt-neo-1.3B EleutherAI/gpt-neo-2.7B
`GPTNeoXForCausalLM`	GPT NeoX	EleutherAI/gpt-neox-20b
	Dolly	databricks/dolly-v2-3b databricks/dolly-v2-7b databricks/dolly-v2-12b
	RedPajama	ikala/redpajama-3b-chat togethercomputer/RedPajama-INCITE-Chat-3B-v1 togethercomputer/RedPajama-INCITE-Instruct-3B-v1 togethercomputer/RedPajama-INCITE-7B-Chat togethercomputer/RedPajama-INCITE-7B-Instruct
`GPTNeoXJapaneseForCausalLM`	GPT NeoX Japanese	abeja/gpt-neox-japanese-2.7b
`GraniteForCausalLM`	Granite	ibm-granite/granite-3.2-2b-instruct ibm-granite/granite-3.2-8b-instruct ibm-granite/granite-3.1-2b-instruct ibm-granite/granite-3.1-8b-instruct ibm-granite/granite-3.0-2b-instruct ibm-granite/granite-3.0-8b-instruct
`GraniteMoeForCausalLM`	GraniteMoE	ibm-granite/granite-3.1-1b-a400m-instruct ibm-granite/granite-3.1-3b-a800m-instruct ibm-granite/granite-3.0-1b-a400m-instruct ibm-granite/granite-3.0-3b-a800m-instruct
`InternLMForCausalLM`	InternLM	internlm/internlm-chat-7b internlm/internlm-7b
`InternLM2ForCausalLM`	InternLM2	internlm/internlm2-chat-1_8b internlm/internlm2-1_8b internlm/internlm2-chat-7b internlm/internlm2-7b internlm/internlm2-chat-20b internlm/internlm2-20b internlm/internlm2_5-1_8b-chat internlm/internlm2_5-1_8b internlm/internlm2_5-7b-chat internlm/internlm2_5-7b internlm/internlm2_5-20b-chat internlm/internlm2_5-20b
`JAISLMHeadModel`	Jais	inceptionai/jais-13b-chat inceptionai/jais-13b
`LlamaForCausalLM`	Llama 3	meta-llama/Llama-3.2-1B meta-llama/Llama-3.2-1B-Instruct meta-llama/Llama-3.2-3B meta-llama/Llama-3.2-3B-Instruct meta-llama/Llama-3.1-8B meta-llama/Llama-3.1-8B-Instruct meta-llama/Meta-Llama-3-8B meta-llama/Meta-Llama-3-8B-Instruct meta-llama/Llama-3.3-70B-Instruct meta-llama/Llama-3.1-70B meta-llama/Llama-3.1-70B-Instruct meta-llama/Meta-Llama-3-70B meta-llama/Meta-Llama-3-70B-Instruct deepseek-ai/DeepSeek-R1-Distill-Llama-8B deepseek-ai/DeepSeek-R1-Distill-Llama-70B
	Llama 2	meta-llama/Llama-2-13b-chat-hf meta-llama/Llama-2-13b-hf meta-llama/Llama-2-7b-chat-hf meta-llama/Llama-2-7b-hf meta-llama/Llama-2-70b-chat-hf meta-llama/Llama-2-70b-hf microsoft/Llama2-7b-WhoIsHarryPotter
	Falcon3	tiiuae/Falcon3-1B-Instruct tiiuae/Falcon3-1B-Base tiiuae/Falcon3-3B-Instruct tiiuae/Falcon3-3B-Base tiiuae/Falcon3-7B-Instruct tiiuae/Falcon3-7B-Base tiiuae/Falcon3-10B-Instruct tiiuae/Falcon3-10B-Base
	OpenLLaMA	openlm-research/open_llama_13b openlm-research/open_llama_3b openlm-research/open_llama_3b_v2 openlm-research/open_llama_7b openlm-research/open_llama_7b_v2
	TinyLlama	TinyLlama/TinyLlama-1.1B-Chat-v1.0
`MPTForCausalLM`	MPT	mosaicml/mpt-7b mosaicml/mpt-7b-instruct mosaicml/mpt-7b-chat mosaicml/mpt-30b mosaicml/mpt-30b-instruct mosaicml/mpt-30b-chat
`MiniCPMForCausalLM`	MiniCPM	openbmb/MiniCPM-1B-sft-bf16 openbmb/MiniCPM-2B-dpo-fp16 openbmb/MiniCPM-2B-sft-fp32 openbmb/MiniCPM-2B-dpo-fp32 openbmb/MiniCPM-2B-sft-bf16 openbmb/MiniCPM-2B-dpo-bf16
`MiniCPM3ForCausalLM`	MiniCPM3	openbmb/MiniCPM3-4B
`MistralForCausalLM`	Mistral	mistralai/Mistral-7B-Instruct-v0.1 mistralai/Mistral-7B-Instruct-v0.2 mistralai/Mistral-7B-Instruct-v0.3 mistralai/Mistral-Nemo-Instruct-2407 mistralai/Mistral-Nemo-Base-2407 mistralai/Mistral-7B-v0.1 mistralai/Mistral-7B-v0.3
	Notus	argilla/notus-7b-v1
	Zephyr	HuggingFaceH4/zephyr-7b-beta
	Neural Chat	Intel/neural-chat-7b-v3-3 Intel/neural-chat-7b-v3-2 Intel/neural-chat-7b-v3-1 Intel/neural-chat-7b-v3
`MixtralForCausalLM`	Mixtral	mistralai/Mixtral-8x7B-Instruct-v0.1 mistralai/Mixtral-8x7B-v0.1
`OlmoForCausalLM`	OLMo	allenai/OLMo-1B-hf allenai/OLMo-7B-hf allenai/OLMo-7B-Twin-2T-hf allenai/OLMo-7B-Instruct-hf allenai/OLMo-7B-0724-Instruct-hf allenai/OLMo-7B-0724-SFT-hf
`OPTForCausalLM`	OPT	facebook/opt-125m facebook/opt-350m facebook/opt-1.3b facebook/opt-2.7b facebook/opt-6.7b facebook/opt-13b
`OrionForCausalLM`	Orion	OrionStarAI/Orion-14B-Chat OrionStarAI/Orion-14B-LongChat OrionStarAI/Orion-14B-Base
`PhiForCausalLM`	Phi	microsoft/phi-2 microsoft/phi-1_5
`Phi3ForCausalLM`	Phi3	microsoft/Phi-3-mini-4k-instruct microsoft/Phi-3-mini-128k-instruct microsoft/Phi-3-medium-4k-instruct microsoft/Phi-3-medium-128k-instruct microsoft/Phi-3.5-mini-instruct microsoft/Phi-4-mini-instruct microsoft/phi-4 microsoft/Phi-4-reasoning
`PhimoeForCausalLM`	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct
`QWenLMHeadModel`	Qwen	Qwen/Qwen-1_8B-Chat Qwen/Qwen-1_8B-Chat-Int4 Qwen/Qwen-1_8B Qwen/Qwen-7B-Chat Qwen/Qwen-7B-Chat-Int4 Qwen/Qwen-7B Qwen/Qwen-14B-Chat Qwen/Qwen-14B-Chat-Int4 Qwen/Qwen-14B Qwen/Qwen-72B-Chat Qwen/Qwen-72B-Chat-Int4 Qwen/Qwen-72B
`Qwen2ForCausalLM`	Qwen2	Qwen/Qwen2.5-0.5B-Instruct Qwen/Qwen2.5-1.5B-Instruct Qwen/Qwen2.5-3B-Instruct Qwen/Qwen2.5-7B-Instruct Qwen/Qwen2.5-14B-Instruct Qwen/Qwen2.5-32B-Instruct Qwen/Qwen2.5-72B-Instruct Qwen/Qwen2-0.5B-Instruct Qwen/Qwen2-1.5B-Instruct Qwen/Qwen2-7B-Instruct Qwen/Qwen2-72B-Instruct Qwen/Qwen1.5-0.5B-Chat Qwen/Qwen1.5-1.8B-Chat Qwen/Qwen1.5-4B-Chat Qwen/Qwen1.5-7B-Chat Qwen/Qwen1.5-14B-Chat Qwen/Qwen1.5-32B-Chat Qwen/Qwen1.5-7B-Chat-GPTQ-Int4 Qwen/QwQ-32B deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B deepseek-ai/DeepSeek-R1-Distill-Qwen-7B deepseek-ai/DeepSeek-R1-Distill-Qwen-14B deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
`Qwen2MoeForCausalLM`	Qwen2MoE	Qwen/Qwen2-57B-A14B-Instruct Qwen/Qwen2-57B-A14B Qwen/Qwen1.5-MoE-A2.7B-Chat Qwen/Qwen1.5-MoE-A2.7B
`Qwen3ForCausalLM`	Qwen3	Qwen/Qwen3-0.6B Qwen/Qwen3-1.7B Qwen/Qwen3-4B Qwen/Qwen3-8B Qwen/Qwen3-14B Qwen/Qwen3-32B Qwen/Qwen3-0.6B-Base Qwen/Qwen3-1.7B-Base Qwen/Qwen3-4B-Base Qwen/Qwen3-8B-Base Qwen/Qwen3-14B-Base
`Qwen3MoeForCausalLM`	Qwen3MoE	Qwen/Qwen3-30B-A3B Qwen/Qwen3-30B-A3B-Base
`StableLmForCausalLM`	StableLM	stabilityai/stablelm-zephyr-3b stabilityai/stablelm-2-1_6b stabilityai/stablelm-2-12b stabilityai/stablelm-2-zephyr-1_6b stabilityai/stablelm-3b-4e1t
`Starcoder2ForCausalLM`	Startcoder2	bigcode/starcoder2-3b bigcode/starcoder2-7b bigcode/starcoder2-15b
`XGLMForCausalLM`	XGLM	facebook/xglm-564M facebook/xglm-1.7B facebook/xglm-2.9B facebook/xglm-4.5B facebook/xglm-7.5B
`XverseForCausalLM`	Xverse	xverse/XVERSE-7B xverse/XVERSE-7B-Chat xverse/XVERSE-13B xverse/XVERSE-13B-Chat xverse/XVERSE-65B xverse/XVERSE-65B-Chat

info

LoRA adapters are supported.

info

The pipeline can work with other similar topologies produced by optimum-intel with the same model signature. The model is required to have the following inputs after the conversion:

input_ids contains the tokens.
attention_mask is filled with 1.
beam_idx selects beams.
position_ids (optional) encodes a position of currently generating token in the sequence and a single logits output.

note

Models should belong to the same family and have the same tokenizers.

Image Generation Models

Architecture	Text to Image	Image to Image	Inpainting	LoRA Support	Example HuggingFace Models
`Latent Consistency Model`	✅	✅	✅	✅	SimianLuo/LCM_Dreamshaper_v7
`Stable Diffusion`	✅	✅	✅	✅	CompVis/stable-diffusion-v1-1 CompVis/stable-diffusion-v1-2 CompVis/stable-diffusion-v1-3 CompVis/stable-diffusion-v1-4 junnyu/stable-diffusion-v1-4-paddle jcplus/stable-diffusion-v1-5 stable-diffusion-v1-5/stable-diffusion-v1-5 botp/stable-diffusion-v1-5 dreamlike-art/dreamlike-anime-1.0 stabilityai/stable-diffusion-2 stabilityai/stable-diffusion-2-base stabilityai/stable-diffusion-2-1 bguisard/stable-diffusion-nano-2-1 justinpinkney/pokemon-stable-diffusion stablediffusionapi/architecture-tuned-model IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1 ZeroCool94/stable-diffusion-v1-5 pcuenq/stable-diffusion-v1-4 rinna/japanese-stable-diffusion benjamin-paine/stable-diffusion-v1-5 philschmid/stable-diffusion-v1-4-endpoints naclbit/trinart_stable_diffusion_v2 Fictiverse/Stable_Diffusion_PaperCut_Model
`Stable Diffusion Inpainting`	❌	❌	✅	✅	stabilityai/stable-diffusion-2-inpainting stable-diffusion-v1-5/stable-diffusion-inpainting botp/stable-diffusion-v1-5-inpainting parlance/dreamlike-diffusion-1.0-inpainting
`Stable Diffusion XL`	✅	✅	✅	✅	stabilityai/stable-diffusion-xl-base-0.9 stabilityai/stable-diffusion-xl-base-1.0 stabilityai/sdxl-turbo cagliostrolab/animagine-xl-4.0
`Stable Diffusion XL Inpainting`	❌	❌	✅	✅	diffusers/stable-diffusion-xl-1.0-inpainting-0.1
`Stable Diffusion 3`	✅	✅	✅	❌	stabilityai/stable-diffusion-3-medium-diffusers stabilityai/stable-diffusion-3.5-medium stabilityai/stable-diffusion-3.5-large stabilityai/stable-diffusion-3.5-large-turbo tensorart/stable-diffusion-3.5-medium-turbo tensorart/stable-diffusion-3.5-large-TurboX
`Flux`	✅	✅	✅	❌	black-forest-labs/FLUX.1-schnell Freepik/flux.1-lite-8B-alpha black-forest-labs/FLUX.1-dev shuttleai/shuttle-3-diffusion shuttleai/shuttle-3.1-aesthetic shuttleai/shuttle-jaguar Shakker-Labs/AWPortrait-FL black-forest-labs/FLUX.1-Fill-dev

Visual Language Models (VLMs)

Architecture	Models	LoRA Support	Example HuggingFace Models
`InternVLChat`	InternVLChatModel (Notes)	❌	OpenGVLab/InternVL2-1B OpenGVLab/InternVL2-2B OpenGVLab/InternVL2-4B OpenGVLab/InternVL2-8B OpenGVLab/InternVL2_5-1B OpenGVLab/InternVL2_5-2B OpenGVLab/InternVL2_5-4B OpenGVLab/InternVL2_5-8B OpenGVLab/InternVL3-1B OpenGVLab/InternVL3-2B OpenGVLab/InternVL3-8B OpenGVLab/InternVL3-9B OpenGVLab/InternVL3-14B
`LLaVA`	LLaVA-v1.5	❌	llava-hf/llava-1.5-7b-hf
`LLaVA-NeXT`	LLaVA-v1.6	❌	llava-hf/llava-v1.6-mistral-7b-hf llava-hf/llava-v1.6-vicuna-7b-hf llava-hf/llama3-llava-next-8b-hf
`MiniCPMV`	MiniCPM-V-2_6	❌	openbmb/MiniCPM-V-2_6
`Phi3VForCausalLM`	phi3_v (Notes)	❌	microsoft/Phi-3-vision-128k-instruct microsoft/Phi-3.5-vision-instruct
`Phi4MMForCausalLM`	phi4mm	❌	microsoft/Phi-4-multimodal-instruct
`Qwen2-VL`	Qwen2-VL	❌	Qwen/Qwen2-VL-2B-Instruct Qwen/Qwen2-VL-7B-Instruct Qwen/Qwen2-VL-2B Qwen/Qwen2-VL-7B
`Qwen2.5-VL`	Qwen2.5-VL	❌	Qwen/Qwen2.5-VL-3B-Instruct Qwen/Qwen2.5-VL-7B-Instruct

VLM Models Notes

InternVL2

To convert InternVL2 models, timm and einops are required:

pip install timm einops

phi3_v

Example models' configs aren't consistent. It's required to override the default eos_token_id with the one from a tokenizer:
```
generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())
```

Speech Recognition Models (Whisper-based)

Architecture	Models	LoRA Support	Example HuggingFace Models
`WhisperForConditionalGeneration`	Whisper	❌	openai/whisper-tiny openai/whisper-tiny.en openai/whisper-base openai/whisper-base.en openai/whisper-small openai/whisper-small.en openai/whisper-medium openai/whisper-medium.en openai/whisper-large-v3
`WhisperForConditionalGeneration`	Distil-Whisper	❌	distil-whisper/distil-small.en distil-whisper/distil-medium.en distil-whisper/distil-large-v3

Text Embeddings Models

Architecture	LoRA Support	Example HuggingFace Models
`BertModel`	❌	BAAI/bge-small-en-v1.5 BAAI/bge-base-en-v1.5 BAAI/bge-large-en-v1.5 sentence-transformers/all-MiniLM-L12-v2 mixedbread-ai/mxbai-embed-large-v1 mixedbread-ai/mxbai-embed-xsmall-v1 WhereIsAI/UAE-Large-V1
`MPNetForMaskedLM`	❌	sentence-transformers/all-mpnet-base-v2 sentence-transformers/multi-qa-mpnet-base-dot-v1
`RobertaForMaskedLM`	❌	sentence-transformers/all-distilroberta-v1
`XLMRobertaModel`	❌	mixedbread-ai/deepset-mxbai-embed-de-large-v1 intfloat/multilingual-e5-large-instruct intfloat/multilingual-e5-large

Speech Generation Models

Architecture	Models	LoRA Support	Example HuggingFace Models
`SpeechT5ForTextToSpeech`	SpeechT5 TTS	❌	microsoft/speecht5_tts

info

Some models may require access request submission on the Hugging Face page to be downloaded.

If https://huggingface.co/ is down, the conversion step won't be able to download the models.

Large Language Models (LLMs)​

Image Generation Models​

Visual Language Models (VLMs)​

InternVL2​

phi3_v​

Speech Recognition Models (Whisper-based)​

Text Embeddings Models​

Speech Generation Models​

Large Language Models (LLMs)

Image Generation Models

Visual Language Models (VLMs)

InternVL2

phi3_v

Speech Recognition Models (Whisper-based)

Text Embeddings Models

Speech Generation Models