{"slug":"phi-4","id":"phi-4","type":"model","title":"Phi-4","description":"Microsoft's small-but-capable model using state-of-the-art training techniques and high-quality data. Punches well above its weight class despite small parameter count.","last_updated":"2026-04-10","last_verified":null,"verification_status":"unverified","markdown_url":"/content/models/phi-4.md","html_url":"/models/phi-4","api_url":"/api/v1/models/phi-4.json","content_hash":"61b985ed0b24781209146be778fbb08bdbbf287cf1bfe01c24d0cb7e81d8123c","sha256":"61b985ed0b24781209146be778fbb08bdbbf287cf1bfe01c24d0cb7e81d8123c","provider":"Microsoft","pricing":{"input":"Free (open weights)","output":"Free (open weights)","free":true,"note":"MIT license"},"benchmarks":{"reasoning":78,"coding":80,"math":79,"writing":77,"multilingual":72,"speed":92},"tags":["microsoft","open-source","text"],"website":"https://azure.microsoft.com/en-us/products/phi","release_date":"2025","relationships":{"links":[],"related":[{"id":"cohere-tiny-aya","title":"Cohere Tiny Aya 3.35B","type":"model","html_url":"/models/cohere-tiny-aya","markdown_url":"/content/models/cohere-tiny-aya.md","shared_tags":["open-source","text"],"score":4},{"id":"command-r-plus","title":"Command R+","type":"model","html_url":"/models/command-r-plus","markdown_url":"/content/models/command-r-plus.md","shared_tags":["open-source","text"],"score":4},{"id":"deepseek-r1","title":"DeepSeek R1","type":"model","html_url":"/models/deepseek-r1","markdown_url":"/content/models/deepseek-r1.md","shared_tags":["open-source","text"],"score":4},{"id":"deepseek-v3.2","title":"DeepSeek V3.2","type":"model","html_url":"/models/deepseek-v3.2","markdown_url":"/content/models/deepseek-v3.2.md","shared_tags":["open-source","text"],"score":4},{"id":"falcon-3","title":"Falcon 3","type":"model","html_url":"/models/falcon-3","markdown_url":"/content/models/falcon-3.md","shared_tags":["open-source","text"],"score":4},{"id":"gemma-3","title":"Gemma 
3","type":"model","html_url":"/models/gemma-3","markdown_url":"/content/models/gemma-3.md","shared_tags":["open-source","text"],"score":4}],"explicit":{}},"metadata":{"title":"Phi-4","type":"model","id":"phi-4","provider":"Microsoft","model_type":"open-source","release_date":"2025","description":"Microsoft's small-but-capable model using state-of-the-art training techniques and high-quality data. Punches well above its weight class despite small parameter count.","last_updated":"2026-04-10","context_window":"16K tokens","website":"https://azure.microsoft.com/en-us/products/phi","license":"MIT","modality":["text"],"tags":["microsoft","open-source","text"],"pricing":{"input":"Free (open weights)","output":"Free (open weights)","free":true,"note":"MIT license"},"benchmarks":{"reasoning":78,"coding":80,"math":79,"writing":77,"multilingual":72,"speed":92},"parameters":"14B","hardware_requirements":"8GB VRAM (Q4); ~28GB VRAM (FP16)","best_for":["Resource-constrained environments","Learning","Prototyping","Edge deployment"]},"content_text":"# Phi-4\n\nMicrosoft's proof that training data quality can beat parameter count. At just 14B parameters, Phi-4 scores 80 on coding and 79 on math -- numbers that models three times its size struggled to reach a generation ago. It runs on 8GB of VRAM with Q4 quantization, meaning virtually any modern GPU can handle it.\n\nThe speed score of 92/100 is the practical payoff. Phi-4 is fast enough for real-time applications where latency matters more than peak intelligence. Reasoning at 78 and writing at 77 are respectable for the size class. The weak point is multilingual at 72 -- Microsoft clearly optimized for English-first workloads.\n\nThe 16K context window is the hard constraint. In a landscape where 128K is common and 256K is appearing, 16K limits Phi-4 to shorter documents and conversations. 
This is fine for code completion, chat prototyping, and educational use, but rules it out for document-heavy enterprise workflows.\n\nMIT license and Microsoft backing give it strong institutional credibility. The model is a favorite for learning and experimentation -- small enough to iterate quickly, capable enough to produce useful results. Azure integration is seamless if you are in that ecosystem.\n\n**When to pick something else:** Gemma 4 E4B offers multimodal capability at a similar size with a much larger context window. Mistral Small 3 at 24B gives substantially better benchmarks while still fitting on a single RTX 4090. Phi-4 is best as a prototyping tool or when 8GB VRAM is genuinely all you have.","content_length":2385,"generated_at":"2026-04-24"}