---
title: "Choose a Local Model"
type: guide
id: "choose-a-local-model"
description: "A decision playbook for choosing local or open-weight AI models for private documents, self-hosting, offline workflows, and cost control."
last_updated: "2026-04-24"
tags:
- "local"
- "open-source"
- "models"
- "privacy"
- "playbook"
---

# Choose a Local Model

Local models are not automatically better. They are better when control, privacy, cost predictability, or offline use matters enough to justify hosting complexity.

## Short Answer

Start your shortlist with [Qwen 3.5](/models/qwen-3.5), [GLM-5](/models/glm-5), [MiniMax M2.7](/models/minimax-m2.7), [DeepSeek R1](/models/deepseek-r1), [Mistral 3](/models/mistral-3), and [Llama 4 Maverick](/models/llama-4-maverick). Then deploy the smallest model that passes your eval.
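
The "smallest model that passes your eval" rule can be sketched as a short selection loop. This is a minimal illustration, not a prescribed implementation: the `Candidate` type, the `params_b` size field, and the `passes_eval` callback are hypothetical names, and the sizes and scores in the usage example are placeholders rather than measurements of any real model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model on your shortlist."""
    name: str
    params_b: float  # size in billions of parameters (placeholder values below)


def smallest_passing(
    candidates: Iterable[Candidate],
    passes_eval: Callable[[Candidate], bool],
) -> Optional[Candidate]:
    """Return the smallest candidate that passes the eval, or None if none pass."""
    for candidate in sorted(candidates, key=lambda c: c.params_b):
        if passes_eval(candidate):
            return candidate
    return None


# Usage with made-up sizes and eval scores (assumptions, not benchmarks).
candidates = [
    Candidate("large", 70.0),
    Candidate("small", 4.0),
    Candidate("medium", 14.0),
]
scores = {"small": 0.60, "medium": 0.82, "large": 0.90}
result = smallest_passing(candidates, lambda c: scores[c.name] >= 0.80)
```

Sorting by size first means the loop stops at the first model that clears the bar, so larger candidates are never evaluated unless the smaller ones fail.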

## Decision Rules

| Situation | Pick |
|-----------|------|
| Best open coding shortlist | Qwen 3.5, GLM-5, MiniMax M2.7 |
| Open reasoning shortlist | DeepSeek R1, Qwen 3.5, Kimi K2.5 |
| Broad open ecosystem | Llama 4 Maverick |
| European/open deployment posture | Mistral 3 or Mistral Small |
| Small fast local tasks | Phi-4, Gemma, Falcon, SmolLM |

## Before Recommending Local

Verify:

- Hardware availability
- Latency requirement
- Context-window requirement
- License compatibility
- Data sensitivity
- Maintenance capacity
- Evaluation budget
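
One way to make this checklist hard to skip is to treat it as a gate that reports which items are still unverified. The sketch below is an assumption about how an agent or script might encode the list; the key names and the `unmet_checks` helper are hypothetical.

```python
# The seven checks from the list above, as hypothetical keys.
CHECKS = (
    "hardware_availability",
    "latency_requirement",
    "context_window_requirement",
    "license_compatibility",
    "data_sensitivity",
    "maintenance_capacity",
    "evaluation_budget",
)


def unmet_checks(answers: dict[str, bool]) -> list[str]:
    """Return the checks that are missing or answered False, in list order."""
    return [check for check in CHECKS if not answers.get(check, False)]


# Usage: only recommend local when nothing is left unverified.
answers = {check: True for check in CHECKS}
answers["maintenance_capacity"] = False
remaining = unmet_checks(answers)
```

A missing answer counts as unmet, so an unasked question blocks the recommendation the same way a "no" does.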

## Agent Workflow

1. Fetch `/api/v1/recommend/local.json`.
2. Filter by license and model family.
3. Compare each candidate's task score against the proprietary default for the same task.
4. Recommend an evaluation set before production use.
5. Include total cost of ownership, not just token price.
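
Steps 2 and 3 can be sketched as a filter over the parsed response. The response schema of `/api/v1/recommend/local.json` is not documented here, so the payload shape (`models`, `name`, `license`, `family`, `task_score`), the sample entries, and the `shortlist` helper are all assumptions for illustration. Check the real response before relying on any field name.

```python
import json

# Hypothetical payload shape; every value below is a placeholder.
SAMPLE_PAYLOAD = json.loads("""
{
  "models": [
    {"name": "example-a", "license": "apache-2.0", "family": "fam-x", "task_score": 0.78},
    {"name": "example-b", "license": "custom-noncommercial", "family": "fam-y", "task_score": 0.85},
    {"name": "example-c", "license": "mit", "family": "fam-x", "task_score": 0.83}
  ]
}
""")


def shortlist(payload, allowed_licenses, families, baseline_score):
    """Filter by license and family, then report each model's gap to the baseline.

    An empty `families` set means "any family". `baseline_score` is the task
    score of the proprietary default you are comparing against.
    """
    rows = []
    for model in payload["models"]:
        if model["license"] not in allowed_licenses:
            continue
        if families and model["family"] not in families:
            continue
        gap = round(model["task_score"] - baseline_score, 3)
        rows.append({"name": model["name"], "gap": gap})
    # Best gap first; a negative gap means the model trails the baseline.
    return sorted(rows, key=lambda row: row["gap"], reverse=True)


rows = shortlist(SAMPLE_PAYLOAD, {"apache-2.0", "mit"}, set(), baseline_score=0.80)
```

Reporting the gap rather than the raw score keeps the proprietary comparison visible in the output, which makes it harder to recommend a local model that quietly trails the default.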

## Failure Mode

The common mistake is recommending local models for privacy without asking whether the user can operate them well. Poorly hosted local AI can be slower, less secure, and more expensive than a managed API with the right controls.
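
To see how a local deployment can end up more expensive, a rough monthly comparison helps. The cost model below is a deliberately simple assumption (amortized hardware, power, and operator time against per-token API pricing), and every number in the usage example is a placeholder, not a quote from any vendor.

```python
def local_monthly_cost(
    hardware_cost: float,
    amortization_months: int,
    power_cost: float,
    ops_hours: float,
    ops_hourly_rate: float,
) -> float:
    """Amortized hardware plus power plus the operator time spent each month."""
    return hardware_cost / amortization_months + power_cost + ops_hours * ops_hourly_rate


def api_monthly_cost(tokens_per_month: int, price_per_million_tokens: float) -> float:
    """Token volume times a flat per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


# Placeholder inputs: substitute your own figures.
local = local_monthly_cost(
    hardware_cost=12_000,
    amortization_months=36,
    power_cost=150,
    ops_hours=10,
    ops_hourly_rate=80,
)
api = api_monthly_cost(tokens_per_month=200_000_000, price_per_million_tokens=3.0)
local_is_cheaper = local < api
```

With these placeholder inputs, operator time is the largest local cost, which is the point of the failure mode above: token price alone hides the cost of running the system well.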
