{"slug":"choose-a-cheap-model","id":"choose-a-cheap-model","type":"guide","title":"Choose a Cheap Model","description":"A decision playbook for minimizing AI model cost without creating unacceptable quality, latency, privacy, or maintenance risk.","last_updated":"2026-04-24","last_verified":null,"verification_status":"unverified","markdown_url":"/content/guides/choose-a-cheap-model.md","html_url":"/guides/choose-a-cheap-model","api_url":"/api/v1/guides/choose-a-cheap-model.json","content_hash":"85df0403a05c4d8f6b334534aa2e8af50dc45ca108b64d1e62d2877fb1ec83fe","sha256":"85df0403a05c4d8f6b334534aa2e8af50dc45ca108b64d1e62d2877fb1ec83fe","tags":["cheap","pricing","models","playbook","cost"],"relationships":{"links":[],"related":[{"id":"best-for-task-matrix","title":"Best-For Task Matrix","type":"guide","html_url":"/guides/best-for-task-matrix","markdown_url":"/content/guides/best-for-task-matrix.md","shared_tags":["models","playbook"],"score":4},{"id":"choose-a-local-model","title":"Choose a Local Model","type":"guide","html_url":"/guides/choose-a-local-model","markdown_url":"/content/guides/choose-a-local-model.md","shared_tags":["models","playbook"],"score":4},{"id":"choose-a-private-model","title":"Choose a Private Model","type":"guide","html_url":"/guides/choose-a-private-model","markdown_url":"/content/guides/choose-a-private-model.md","shared_tags":["models","playbook"],"score":4},{"id":"choose-a-research-model","title":"Choose a Research Model","type":"guide","html_url":"/guides/choose-a-research-model","markdown_url":"/content/guides/choose-a-research-model.md","shared_tags":["models","playbook"],"score":4},{"id":"choose-a-coding-model","title":"Choose the Best AI Coding Model","type":"guide","html_url":"/guides/choose-a-coding-model","markdown_url":"/content/guides/choose-a-coding-model.md","shared_tags":["models","playbook"],"score":4},{"id":"ai-failure-modes","title":"AI Failure Modes","type":"guide","html_url":"/guides/failure-modes","markdown_url":"/content/guides/failure-modes.md","shared_tags":["models"],"score":3}],"explicit":{}},"metadata":{"title":"Choose a Cheap Model","type":"guide","id":"choose-a-cheap-model","description":"A decision playbook for minimizing AI model cost without creating unacceptable quality, latency, privacy, or maintenance risk.","last_updated":"2026-04-24","tags":["cheap","pricing","models","playbook","cost"]},"content_text":"# Choose a Cheap Model\n\n\"Cheap\" means lowest total cost for an acceptable result, not lowest listed token price.\n\n## Short Answer\n\nUse fast proprietary models for simple high-volume tasks, open-source models for self-hosted control, and frontier models only for steps where mistakes are expensive.\n\n## Decision Rules\n\n| Situation | Pick |\n|-----------|------|\n| Simple extraction | Cheap fast model |\n| Bulk summarization | Cheap fast model with sampling QA |\n| Code edits with tests | Mid-tier coding model |\n| Hard reasoning | Strong model for the reasoning step only |\n| Private batch work | Local/open model if infrastructure exists |\n| Customer-facing answers | Do not optimize only for cost |\n\n## Cost Control Pattern\n\nUse a cascade:\n\n1. Cheap model attempts the task.\n2. Validator checks confidence, schema, or tests.\n3. Stronger model handles failures.\n4. Human review handles high-risk cases.\n\n## Agent Workflow\n\n1. Fetch `/api/v1/recommend/cheap.json`.\n2. Compare with the task-specific endpoint, such as `/api/v1/recommend/coding.json`.\n3. Exclude models below the minimum task score.\n4. Recommend a cascade when volume is high.\n5. Include monitoring for quality drift.\n\n## Failure Mode\n\nThe common mistake is choosing a cheap model that creates expensive cleanup. Cost per correct answer is the metric that matters.","content_length":1636,"generated_at":"2026-04-24"}