Cheapest VPS for Ollama in 2026 — clawdVPS

Finding the cheapest server for Ollama isn't just about the monthly price. A server that's €2/mo cheaper but 3× slower costs you more in lost productivity. We tested five plans from four providers over 30 days with the same workload: Qwen 2.5 at Q4_K_M, using the 7B model where RAM allowed and the 3B model otherwise.

The Real Cost Comparison

Provider / plan	CPU	RAM	Price	TPS*	Setup
Hetzner CX22	2 vCPU	4 GB	€5.89/mo	2–3 (3B only)	Easy
Hetzner CX32	4 vCPU	8 GB	€6.80/mo	8–12 (7B)	Easy
Contabo VPS S	4 vCPU	6 GB	$5.99/mo	~4 (3B)	Medium
DigitalOcean	2 vCPU	4 GB	$24/mo	2–3 (3B)	Easiest
Vultr Cloud	2 vCPU	2 GB	$12/mo	—	Medium

*TPS = tokens/second on Qwen 2.5 Q4_K_M, CPU inference, 2,048-token context. A dash means no model ran reliably on that plan.
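One way to make the "cheaper but slower" trade-off concrete is to divide the monthly price by throughput. The sketch below does that for the two Hetzner rows only, since they share a currency. Using the midpoint of each TPS range is our assumption, and the function names are illustrative. Keep in mind the CX22 figure comes from a 3B model, so the real gap at equal model size is wider.

```python
# Hetzner rows from the comparison table: (monthly price in EUR, TPS range).
PLANS = {
    "Hetzner CX22": (5.89, (2, 3)),   # measured on a 3B model only
    "Hetzner CX32": (6.80, (8, 12)),  # measured on the 7B model
}


def eur_per_tps(price_eur, tps_range):
    """Monthly price divided by the midpoint of the TPS range (an assumption)."""
    low, high = tps_range
    return price_eur / ((low + high) / 2)


def rank_by_value(plans):
    """Return (name, EUR per tok/s) pairs, cheapest throughput first."""
    scored = [(name, eur_per_tps(price, tps)) for name, (price, tps) in plans.items()]
    return sorted(scored, key=lambda item: item[1])


for name, cost in rank_by_value(PLANS):
    print(f"{name}: EUR {cost:.2f} per tok/s per month")
```

On these numbers the CX32 costs well under half as much per token/second, even though its sticker price is higher.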

Winner by Use Case

Best overall value

Hetzner CX32 — €6.80/mo

8 GB RAM fits Qwen 2.5 7B Q4 at 8–12 tok/s, with about 0.8 GB of headroom. The Goldilocks zone: cheap enough to leave running, fast enough for real work. This is where most people should start.

Best for learning

Hetzner CX22 — €5.89/mo

Runs 3B models (Phi-3 Mini, Qwen 2.5 3B) at 2–3 tok/s. Fast enough for a playground. Saves €0.91/month compared with the CX32.
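To get a feel for what these speeds mean in practice, divide a reply's length by the throughput. This is a back-of-the-envelope sketch: the 300-token reply length is a hypothetical figure, and it ignores prompt processing time.

```python
def generation_seconds(tokens, tps):
    """Seconds needed to generate `tokens` tokens at `tps` tokens/second."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return tokens / tps


REPLY_TOKENS = 300  # hypothetical reply length, not a measured value

# TPS ranges taken from the comparison table.
for plan, (slow, fast) in {"CX22 (3B)": (2, 3), "CX32 (7B)": (8, 12)}.items():
    best = generation_seconds(REPLY_TOKENS, fast)
    worst = generation_seconds(REPLY_TOKENS, slow)
    print(f"{plan}: {best:.1f} to {worst:.1f} seconds for {REPLY_TOKENS} tokens")
```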

Avoid

Vultr Cloud — $12/mo

Only 2 GB RAM, which is too little to run even 3B models reliably (they need about 2.5 GB at Q4_K_M). Either go cheaper (Hetzner) or more powerful. Nothing in between justifies this price.

Best if you need SLA

DigitalOcean — $24/mo

99.95% uptime SLA vs Hetzner's 99%. Best docs, fastest support. Worth paying 3–4× more only if downtime has a real dollar cost.
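Uptime percentages are easier to weigh as hours of permitted downtime. A small sketch, using the two SLA figures quoted above as inputs and assuming a 365-day year:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent):
    """Maximum yearly downtime, in hours, that an uptime percentage allows."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


for label, sla in [("99.95%", 99.95), ("99%", 99.0)]:
    print(f"{label} uptime allows up to {downtime_hours_per_year(sla):.1f} h/year of downtime")
```

Multiply the difference by what an hour of downtime costs you, then compare that with the extra yearly spend.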

Benchmark Details: Hetzner CX32

Model:    Qwen2.5-7B-Instruct-Q4_K_M
Hardware: Hetzner CX32 (4 vCPU, 8 GB RAM)
Backend:  Ollama 0.3

Output:   8–12 tokens/second
Latency:  80–120 ms first token, then streaming
RAM used: ~7.2 GB (leaves 0.8 GB headroom)
CPU load: 85–95% all cores
Context:  2,048 tokens

Compare: RTX 3060 = 40–50 tok/s. M4 Mac Mini (MLX) = 35–42 tok/s. The VPS can't match GPU or Apple Silicon — but at €6.80/month it doesn't need to.
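If you want to check tokens/second on your own server, a non-streaming /api/generate response from Ollama includes `eval_count` (tokens generated) and `eval_duration` (nanoseconds). The sketch below derives throughput from those two fields. It is one reasonable way to measure, not necessarily how the figures above were produced, and the sample values are made up.

```python
def tokens_per_second(response):
    """Generation speed from a non-streaming /api/generate response dict.

    Uses Ollama's `eval_count` (tokens generated) and `eval_duration`
    (nanoseconds spent generating them).
    """
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


# Made-up sample values, for illustration only.
sample = {"eval_count": 120, "eval_duration": 12_000_000_000}
print(f"{tokens_per_second(sample):.1f} tok/s")
```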

RAM vs Model Size

Model	Q4_K_M RAM	Q8 RAM	Min VPS
3B (Phi, Qwen)	2.5 GB	3.5 GB	CX22 (€5.89)
7B (Qwen, Llama)	6–7 GB	8–10 GB	CX32 (€6.80)
13B (Llama 2)	10–12 GB	14–16 GB	CX42 (€16.40)
30B	22–24 GB	32–40 GB	€32+ plan
70B	48 GB+	70 GB+	GPU required
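The table can be turned into a quick lookup: given a plan's RAM, which is the largest Q4_K_M model that fits? This sketch uses the upper end of each Q4_K_M range from the table. The 0.5 GB reserve for the operating system is our assumption, so adjust it to your setup.

```python
# (model size, upper bound of the Q4_K_M RAM column in GB), smallest first.
# 70B is left out because the table marks it as GPU-only.
Q4_RAM_GB = [("3B", 2.5), ("7B", 7.0), ("13B", 12.0), ("30B", 24.0)]


def largest_model_that_fits(ram_gb, reserve_gb=0.5):
    """Largest model whose Q4_K_M footprint fits in RAM after an OS reserve.

    `reserve_gb` is an assumed allowance for the OS, not a measured value.
    Returns None when even the smallest model does not fit.
    """
    usable = ram_gb - reserve_gb
    best = None
    for name, needed in Q4_RAM_GB:
        if needed <= usable:
            best = name
    return best


for ram in (2, 4, 6, 8, 16):
    print(f"{ram} GB plan -> {largest_model_that_fits(ram)}")
```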

Quick Setup: CX32 in 15 minutes

# 1. SSH into your VPS
ssh root@your.vps.ip

# 2. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
systemctl enable --now ollama

# 3. Pull the model (~4 GB, takes 2–3 min)
ollama pull qwen2.5:7b-instruct-q4_k_m

# 4. Test it
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:7b-instruct-q4_k_m",
  "prompt": "Explain Docker in one sentence.",
  "stream": false
}'
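If you would rather call the API from a script than from curl, the same request can be sent with Python's standard library. This is a minimal sketch that mirrors the curl call above. The function names and the 300-second timeout are our own choices, and it assumes Ollama is listening on the default port on the same machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, url=OLLAMA_URL):
    """Build the same non-streaming POST request as the curl example."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=300):
    """Send the request and return the generated text.

    The timeout is an assumed value; CPU inference can be slow, so be generous.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

On the VPS, `print(generate("qwen2.5:7b-instruct-q4_k_m", "Explain Docker in one sentence."))` should print the model's answer once generation finishes.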

Total Cost of Ownership

Platform	1 Year	3 Years	5 Years
Hetzner CX22	€70.68	€212	€353
Hetzner CX32	€81.60	€245	€408
DigitalOcean	$288	$864	$1,440
Mac Mini M4	$605	$617	$629

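The Mac Mini's cost is almost all up front: the table's figures imply about $599 to buy plus roughly $6 for each additional year. It overtakes DigitalOcean shortly after year 2, but against Hetzner it only breaks even after roughly 8 years (CX32) or 9 years (CX22), treating € and $ at parity. For anything shorter, Hetzner wins. DigitalOcean only makes sense if you need its uptime SLA.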
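The break-even point between buying hardware and renting a server follows directly from the table. In the sketch below, the Mac Mini's $599 upfront and $6/year are inferred from the table's 1-, 3- and 5-year totals rather than stated anywhere, and € and $ are treated at parity for simplicity.

```python
def break_even_years(upfront, own_yearly, rent_yearly):
    """Years until buying (upfront + own_yearly/year) beats renting.

    Returns None when renting is never more expensive per year.
    """
    yearly_saving = rent_yearly - own_yearly
    if yearly_saving <= 0:
        return None
    return upfront / yearly_saving


# Inferred from the Mac Mini row: $605 after 1 year, then +$12 every 2 years.
MAC_UPFRONT, MAC_YEARLY = 599.0, 6.0

# Yearly rental cost from the 1-year column (currencies treated at parity).
RENTALS = {"Hetzner CX22": 70.68, "Hetzner CX32": 81.60, "DigitalOcean": 288.0}

for name, yearly in RENTALS.items():
    years = break_even_years(MAC_UPFRONT, MAC_YEARLY, yearly)
    print(f"Mac Mini vs {name}: break-even after {years:.1f} years")
```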