{"slug":"gpt-oss-120b","id":"gpt-oss-120b","type":"model","title":"GPT-OSS-120B","description":"OpenAI's first fully open-weight LLMs since GPT-2. Matches or surpasses o4-mini on core benchmarks. Can run on a single 80GB GPU. Optimized for vLLM, llama.cpp, and Ollama.","last_updated":"2026-04-10","last_verified":null,"verification_status":"unverified","markdown_url":"/content/models/gpt-oss-120b.md","html_url":"/models/gpt-oss-120b","api_url":"/api/v1/models/gpt-oss-120b.json","content_hash":"392ce26b17261a4f104b5aaff63173cd5bb4a68d71efe808331632dfbc3b7cc6","sha256":"392ce26b17261a4f104b5aaff63173cd5bb4a68d71efe808331632dfbc3b7cc6","provider":"OpenAI","pricing":{"input":"Free (open weights)","output":"Free (open weights)","free":true},"benchmarks":{"reasoning":85,"coding":86,"math":85,"writing":87,"multilingual":86,"speed":75},"tags":["openai","open-source","text"],"website":"https://openai.com","release_date":"2026","relationships":{"links":[],"related":[{"id":"gpt-5.4","title":"GPT-5.4","type":"model","html_url":"/models/gpt-5.4","markdown_url":"/content/models/gpt-5.4.md","shared_tags":["openai","text"],"score":6},{"id":"gpt-5.4-thinking","title":"GPT-5.4 Thinking","type":"model","html_url":"/models/gpt-5.4-thinking","markdown_url":"/content/models/gpt-5.4-thinking.md","shared_tags":["openai","text"],"score":6},{"id":"cohere-tiny-aya","title":"Cohere Tiny Aya 3.35B","type":"model","html_url":"/models/cohere-tiny-aya","markdown_url":"/content/models/cohere-tiny-aya.md","shared_tags":["open-source","text"],"score":4},{"id":"command-r-plus","title":"Command R+","type":"model","html_url":"/models/command-r-plus","markdown_url":"/content/models/command-r-plus.md","shared_tags":["open-source","text"],"score":4},{"id":"deepseek-r1","title":"DeepSeek R1","type":"model","html_url":"/models/deepseek-r1","markdown_url":"/content/models/deepseek-r1.md","shared_tags":["open-source","text"],"score":4},{"id":"deepseek-v3.2","title":"DeepSeek V3.2","type":"model","html_url":"/models/deepseek-v3.2","markdown_url":"/content/models/deepseek-v3.2.md","shared_tags":["open-source","text"],"score":4}],"explicit":{}},"metadata":{"title":"GPT-OSS-120B","type":"model","id":"gpt-oss-120b","provider":"OpenAI","model_type":"open-source","release_date":"2026","description":"OpenAI's first fully open-weight LLMs since GPT-2. Matches or surpasses o4-mini on core benchmarks. Can run on a single 80GB GPU. Optimized for vLLM, llama.cpp, and Ollama.","last_updated":"2026-04-10","context_window":"128K tokens","website":"https://openai.com","license":"OpenAI Open Weight License","modality":["text"],"tags":["openai","open-source","text"],"pricing":{"input":"Free (open weights)","output":"Free (open weights)","free":true},"benchmarks":{"reasoning":85,"coding":86,"math":85,"writing":87,"multilingual":86,"speed":75},"parameters":"120B","hardware_requirements":"1x H100 80GB (FP16); 1x RTX 4090 with Q4 quantization","best_for":["Enterprise self-hosting","OpenAI ecosystem compatibility","Production deployment","Fine-tuning"]},"content_text":"# GPT-OSS-120B\n\nThe model nobody thought OpenAI would release. After years of closed-source dominance, GPT-OSS-120B is OpenAI's first open-weight release since GPT-2 in 2019, and it matches or beats their own o4-mini across the board. The benchmarks are remarkably flat -- 85-87 across reasoning, coding, math, writing, and multilingual -- making it one of the most balanced models at any size.\n\nWhat makes this interesting is not the raw numbers (Qwen 3.5 and GLM-5 beat it on most benchmarks) but the ecosystem play. First-class optimization for vLLM, llama.cpp, and Ollama means deployment is trivially easy. If your team already knows OpenAI's API patterns, the mental model translates directly. A single H100 runs it at FP16, or an RTX 4090 handles Q4 quantization.\n\nThe writing score of 87 is quietly the best in its class among open models at this size, reflecting OpenAI's years of RLHF expertise. For teams that need polished, human-sounding output from a self-hosted model, this is hard to beat. The 120B dense architecture is less efficient than MoE alternatives, which explains the speed score of 75.\n\nThe OpenAI Open Weight License is more permissive than expected but still not Apache 2.0 -- read the fine print before building commercial products.\n\n**When to pick something else:** If raw performance matters more than ecosystem, Qwen 3.5 and GLM-5 are stronger on every technical benchmark. If you need maximum efficiency on consumer hardware, MoE models like Mistral Small 4 or Nemotron-Cascade 2 run circles around a 120B dense model.","content_length":2450,"generated_at":"2026-04-24"}