microsoft/Phi-4-multimodal-instruct
Tags: Automatic Speech Recognition, Transformers, Safetensors, 24 languages, phi4mm, text-generation, nlp, code, audio, speech-summarization, speech-translation, visual-question-answering, phi-4-multimodal, phi, phi-4-mini, custom_code
arxiv: 2503.01743, 2407.13833
License: mit
refs/pr/53
11 contributors
History: 29 commits
cyrilvallez (HF staff) | Upload folder using huggingface_hub | 8bf5ecc (verified) | 8 days ago
Name | Size | Flags | Last commit | Updated
examples | - | - | Add examples | about 1 month ago
figures | - | - | Added model files | about 1 month ago
.gitattributes | 1.61 kB | Safe | added technical report | about 1 month ago
CODE_OF_CONDUCT.md | 444 Bytes | Safe | Added model files | about 1 month ago
LICENSE | 1.14 kB | Safe | Added model files | about 1 month ago
README.md | 65.1 kB | - | Update readme | 12 days ago
SECURITY.md | 2.66 kB | Safe | Added model files | about 1 month ago
SUPPORT.md | 1.24 kB | Safe | Added model files | about 1 month ago
adapter_config.json | 742 Bytes | - | Upload folder using huggingface_hub | 8 days ago
adapter_model.safetensors | 923 MB | LFS | Upload folder using huggingface_hub | 8 days ago
configuration_phi4mm.py | 11 kB | - | Added model files | about 1 month ago
merges.txt | 2.42 MB | Safe | Added model files | about 1 month ago
modeling_phi4mm.py | 116 kB | - | fixes the assertion error when num_beams > 1 (#42) | 20 days ago
phi_4_mm.tech_report.02252025.pdf | 5.3 MB | LFS | added technical report | about 1 month ago
processing_phi4mm.py | 32.8 kB | - | Added model files | about 1 month ago
sample_finetune_speech.py | 16.7 kB | Safe | Fix bug with safe suffix removal (#34) | 25 days ago
sample_finetune_vision.py | 19.6 kB | - | Added model files | about 1 month ago
sample_inference_phi4mm.py | 10.5 kB | - | Added model files | about 1 month ago
speech_conformer_encoder.py | 111 kB | - | Added model files | about 1 month ago
vision_siglip_navit.py | 78.2 kB | - | Added model files | about 1 month ago