{"slug":"kimi-k2.5","id":"kimi-k2.5","type":"model","title":"Kimi K2.5","description":"Chinese AI model achieving 96% on AIME 2025, outperforming most proprietary models on math. Strong reasoning and mathematical capabilities.","last_updated":"2026-04-10","last_verified":null,"verification_status":"unverified","markdown_url":"/content/models/kimi-k2.5.md","html_url":"/models/kimi-k2.5","api_url":"/api/v1/models/kimi-k2.5.json","content_hash":"ab881ce9ca6e96be96c32c086c7b32453d9d521fa73913418d93fa7acd5e1520","sha256":"ab881ce9ca6e96be96c32c086c7b32453d9d521fa73913418d93fa7acd5e1520","provider":"Moonshot AI","pricing":{"input":"Free (self-hosted)","output":"Free (self-hosted)","free":true,"note":"Kimi API available"},"benchmarks":{"reasoning":93,"coding":85,"math":97,"writing":78,"multilingual":80,"speed":72},"tags":["moonshot ai","open-source","text"],"website":"https://www.moonshot.cn","release_date":"2025","relationships":{"links":[],"related":[{"id":"cohere-tiny-aya","title":"Cohere Tiny Aya 3.35B","type":"model","html_url":"/models/cohere-tiny-aya","markdown_url":"/content/models/cohere-tiny-aya.md","shared_tags":["open-source","text"],"score":4},{"id":"command-r-plus","title":"Command R+","type":"model","html_url":"/models/command-r-plus","markdown_url":"/content/models/command-r-plus.md","shared_tags":["open-source","text"],"score":4},{"id":"deepseek-r1","title":"DeepSeek R1","type":"model","html_url":"/models/deepseek-r1","markdown_url":"/content/models/deepseek-r1.md","shared_tags":["open-source","text"],"score":4},{"id":"deepseek-v3.2","title":"DeepSeek V3.2","type":"model","html_url":"/models/deepseek-v3.2","markdown_url":"/content/models/deepseek-v3.2.md","shared_tags":["open-source","text"],"score":4},{"id":"falcon-3","title":"Falcon 3","type":"model","html_url":"/models/falcon-3","markdown_url":"/content/models/falcon-3.md","shared_tags":["open-source","text"],"score":4},{"id":"gemma-3","title":"Gemma 3","type":"model","html_url":"/models/gemma-3","markdown_url":"/content/models/gemma-3.md","shared_tags":["open-source","text"],"score":4}],"explicit":{}},"metadata":{"title":"Kimi K2.5","type":"model","id":"kimi-k2.5","provider":"Moonshot AI","model_type":"open-source","release_date":"2025","description":"Chinese AI model achieving 96% on AIME 2025, outperforming most proprietary models on math. Strong reasoning and mathematical capabilities.","last_updated":"2026-04-10","context_window":"128K tokens","website":"https://www.moonshot.cn","license":"MIT","modality":["text"],"tags":["moonshot ai","open-source","text"],"pricing":{"input":"Free (self-hosted)","output":"Free (self-hosted)","free":true,"note":"Kimi API available"},"benchmarks":{"reasoning":93,"coding":85,"math":97,"writing":78,"multilingual":80,"speed":72},"parameters":"MoE (undisclosed)","hardware_requirements":"Multi-GPU setup required","best_for":["Mathematical reasoning","STEM applications","Scientific computing","Education"]},"content_text":"# Kimi K2.5\n\nThe math benchmark destroyer. Kimi K2.5 scores 97/100 on math and 96% on AIME 2025, numbers that beat most proprietary models including several that cost orders of magnitude more to run. Moonshot AI built a model that is absurdly specialized and absurdly good at that specialty.\n\nReasoning at 93/100 is genuinely impressive, and coding at 85 is solid enough for STEM workflows. But the drop-off is telling: writing sits at 78, multilingual at 80, and speed at 72. This is not a general-purpose model. It is a reasoning engine that happens to generate text.\n\nThe MoE architecture with undisclosed parameter counts and a \"multi-GPU setup required\" hardware spec make self-hosting opaque. Moonshot keeps the exact architecture close to the chest, which is unusual for an MIT-licensed model. The Kimi API is available if you prefer not to guess at infrastructure requirements.\n\nFor STEM education, scientific computing, and any pipeline where mathematical accuracy is the bottleneck, K2.5 is the open-source answer. It outperforms DeepSeek R1 on pure math while matching it on reasoning, and it does so under a permissive MIT license.\n\n**When to pick something else:** For anything involving prose, creative output, or general assistant tasks, K2.5 is the wrong tool. Qwen 3.5 or GLM-5 give you stronger all-around capability. If you need math reasoning on consumer hardware, Nemotron-Cascade 2 gets surprisingly close at a fraction of the compute.","content_length":2302,"generated_at":"2026-04-24"}