GitHub Copilot costs $10–19/month, sends your code to Microsoft, and has rate limits. For €5.89/month on a Hetzner CX22, you get an always-on Qwen 2.5 Coder 7B running in VS Code via Continue.dev: private, with no rate limits, and with a first token in roughly 200–300ms once the model is warm.
What You'll Build
Your VS Code (Continue.dev)
↓ SSH tunnel (port 11434)
Hetzner CX22 — €5.89/mo
↓
Ollama + Qwen 2.5 Coder 7B Q4
↓
8–12 tok/s, fully local inference
Monthly cost: €5.89 (~$6.40) instead of $10 for Copilot, a saving of about $3.60 per month from the first month.
Step 1: Create Hetzner Account (5 min)
- Go to hetzner.com/cloud — sign up with email + add payment
- Create a project named "AI Coder"
- Create a server: Ubuntu 24.04, type CX22 (€5.89/mo), choose nearest datacenter
- Generate an SSH key in the Hetzner dashboard, download hetzner-key.pem, save to ~/.ssh/
- Note your server IP address
Step 2: SSH In and Update (2 min)
chmod 600 ~/.ssh/hetzner-key.pem
ssh -i ~/.ssh/hetzner-key.pem root@YOUR_IP
apt update && apt upgrade -y
Step 3: Install Ollama (2 min)
# -L follows redirects, -f fails on HTTP errors instead of piping an error page to sh
curl -fsSL https://ollama.com/install.sh | sh
systemctl enable --now ollama
# Verify
curl http://localhost:11434/api/tags
# → {"models":[]}
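Right after `systemctl enable --now ollama`, the API may need a moment before it answers. A minimal sketch of a readiness check, assuming `curl` is installed; `wait_for_ollama` is a hypothetical helper name, not something Ollama ships:

```shell
# Poll the Ollama HTTP API until it answers, or give up.
# wait_for_ollama is a hypothetical helper, not part of Ollama.
wait_for_ollama() {
  url="${1:-http://localhost:11434}"   # base URL of the Ollama API
  tries="${2:-10}"                     # number of attempts
  delay="${3:-1}"                      # seconds to sleep between attempts
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: fail on HTTP errors, -s: silent, --max-time: do not hang forever
    if curl -fs --max-time 2 "$url/api/tags" >/dev/null 2>&1; then
      echo "ollama is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "ollama did not answer at $url after $tries tries" >&2
  return 1
}

# Usage on the VPS:
#   wait_for_ollama http://localhost:11434 10 1
```

The same function can be run on your local machine once the tunnel from Step 6 is up, since it only talks to localhost:11434.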
Step 4: Pull Qwen 2.5 Coder 7B (3 min)
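# Note: the qwen2.5 tag below is the general instruct model. The Coder variant is
# published on Ollama as qwen2.5-coder (for example qwen2.5-coder:7b); if you pull
# that instead, use the same tag in Step 8.
ollama pull qwen2.5:7b-instruct-q4_k_m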
# Verify
ollama list
# NAME SIZE
# qwen2.5:7b-instruct-q4_k_m 4.9 GB
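The size that `ollama list` reports has to fit in the server's RAM alongside the OS, otherwise you hit the out-of-memory errors covered in Troubleshooting. A rough sketch of that comparison, reading `MemTotal` from `/proc/meminfo`; the `fits_in_ram` name and the 1 GB headroom are assumptions for illustration, not measured requirements:

```shell
# fits_in_ram MODEL_GB [MEMINFO_FILE]
# Compares a model size (in GB) against MemTotal from /proc/meminfo.
# Hypothetical helper; the 1 GB headroom for the OS is an assumption.
fits_in_ram() {
  model_gb="$1"
  meminfo="${2:-/proc/meminfo}"
  awk -v need="$model_gb" '
    /^MemTotal:/ {
      total_gb = $2 / 1024 / 1024          # MemTotal is reported in kB
      printf "RAM %.1f GB, model %.1f GB: ", total_gb, need
      if (total_gb > need + 1) { print "ok"; exit 0 }
      print "too small"
      exit 1
    }' "$meminfo"
}

# Usage on the VPS, with the size shown by `ollama list`:
#   fits_in_ram 4.9
```

If it prints "too small", the two options from the Troubleshooting section apply: a larger server type or the 3B model.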
Step 5: Lock Down the Firewall (2 min)
Never expose Ollama directly to the internet — it has no auth. Use SSH tunnelling instead.
apt install ufw -y
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp # SSH only — don't skip this!
# Do NOT open 11434 publicly
ufw enable
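As a second check besides the firewall, you can confirm that Ollama itself only listens on loopback (its default, unless `OLLAMA_HOST` has been changed). The sketch below parses `ss -ltn` output from stdin; `ollama_bind_check` is a hypothetical helper name, and it assumes the usual `ss` column layout where the fourth column is the local address:

```shell
# Reads `ss -ltn` output on stdin and reports how port 11434 is bound.
# ollama_bind_check is a hypothetical helper, not part of Ollama or ufw.
ollama_bind_check() {
  awk '
    $1 == "LISTEN" && $4 ~ /:11434$/ {
      seen = 1
      # Anything other than 127.0.0.1 or [::1] means other hosts could reach it
      if ($4 !~ /^(127\.0\.0\.1|\[::1\]):11434$/) exposed = 1
    }
    END {
      if (!seen)        { print "not listening"; exit 2 }
      else if (exposed) { print "exposed";       exit 1 }
      else              { print "loopback only" }
    }'
}

# Usage on the VPS:
#   ss -ltn | ollama_bind_check
```

"loopback only" is the result you want here; the SSH tunnel in the next step still works because it connects to localhost on the server side.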
Step 6: SSH Tunnel from Your Machine (1 min)
On your local machine (not the VPS):
# Forward localhost:11434 → VPS:11434
ssh -i ~/.ssh/hetzner-key.pem -N -L 11434:localhost:11434 root@YOUR_IP &
# Test it works locally
curl http://localhost:11434/api/tags # Should return model list
To avoid retyping these options and to keep the connection alive while idle, add this to ~/.ssh/config:
Host hetzner-ollama
HostName YOUR_IP
User root
IdentityFile ~/.ssh/hetzner-key.pem
LocalForward 11434 localhost:11434
ServerAliveInterval 60
Then just run: ssh -N hetzner-ollama &
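An `~/.ssh/config` entry shortens the command, but it does not restart the tunnel if the connection drops. One possible approach is a small reconnect loop, sketched below. `keep_tunnel` is a hypothetical helper name, and the `SSH_BIN` override exists only so the loop can be exercised without a real server; the `-o` options are standard OpenSSH client options:

```shell
# keep_tunnel HOST [MAX_ATTEMPTS] [DELAY_SECONDS]
# Re-runs the SSH tunnel whenever it exits. MAX_ATTEMPTS=0 means retry forever.
# Hypothetical helper; SSH_BIN is an illustrative override for testing.
keep_tunnel() {
  host="$1"
  max="${2:-0}"
  delay="${3:-5}"
  n=0
  while :; do
    # ExitOnForwardFailure: exit if port 11434 cannot be forwarded
    # ServerAlive*: detect a dead connection instead of hanging
    "${SSH_BIN:-ssh}" -N \
      -o ExitOnForwardFailure=yes \
      -o ServerAliveInterval=60 \
      -o ServerAliveCountMax=3 \
      "$host"
    n=$((n + 1))
    echo "tunnel to $host exited (attempt $n)" >&2
    if [ "$max" -gt 0 ] && [ "$n" -ge "$max" ]; then
      return 1
    fi
    sleep "$delay"
  done
}

# Usage on your local machine:
#   keep_tunnel hetzner-ollama &
```

Tools such as autossh or a systemd user service cover the same need more robustly; the loop is only meant to show the idea.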
Step 7: Install Continue.dev in VS Code
- Extensions panel → search "Continue"
- Install Continue: Code Autocompletion & Chat
- Reload VS Code
Step 8: Configure Continue.dev
Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) → "Continue: Open Config", then replace the contents with:
{
  "models": [
    {
      "title": "Qwen 2.5 Coder 7B (Local)",
      "provider": "ollama",
      "model": "qwen2.5:7b-instruct-q4_k_m",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen 2.5 Coder 7B",
    "provider": "ollama",
    "model": "qwen2.5:7b-instruct-q4_k_m",
    "apiBase": "http://localhost:11434",
    "completionOptions": {
      "maxTokens": 32,
      "temperature": 0.1
    }
  }
}
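The `"model"` value in this config has to match a tag that Ollama actually has, or Continue will fail to get completions. A rough way to check that through the tunnel is sketched below. `check_model` is a hypothetical helper name; it uses `grep` on the `/api/tags` response rather than a real JSON parser, so treat it as a quick sanity check only:

```shell
# check_model TAG
# Reads an Ollama /api/tags JSON response on stdin and looks for TAG
# in a "name" field. Hypothetical helper; grep-based, not a JSON parser.
check_model() {
  # Strip whitespace so both compact and pretty-printed JSON match,
  # then do a case-insensitive fixed-string search.
  if tr -d ' \n' | grep -qiF "\"name\":\"$1\""; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Usage on your local machine, with the tunnel running:
#   curl -s http://localhost:11434/api/tags | check_model qwen2.5:7b-instruct-q4_k_m
```

If it prints "missing", compare the tag in the config with the output of `ollama list` on the VPS.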
Step 9: Test It
- Open any code file in VS Code
- Type def fibonacci( and press Tab or wait 1 second
- Continue will stream a completion from your Hetzner VPS
- Expected: first token in 200–300ms, then 8–12 tok/s streaming
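To compare your own numbers with the expected throughput, one option is to call Ollama's `/api/generate` endpoint with `"stream": false` and divide `eval_count` by `eval_duration` (reported in nanoseconds) from the response. The sketch below does that arithmetic; `json_int` and `tok_per_sec` are hypothetical helper names, and the `sed`-based field extraction assumes a single-line JSON response:

```shell
# json_int FIELD
# Extracts an integer field from single-line JSON on stdin.
# Rough sed-based parsing, not a real JSON parser (hypothetical helper).
json_int() {
  sed -n "s/.*\"$1\": *\([0-9][0-9]*\).*/\1/p"
}

# tok_per_sec EVAL_COUNT EVAL_DURATION_NS
# tokens per second = tokens generated * 1e9 / generation time in ns
tok_per_sec() {
  awk -v n="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", n * 1e9 / ns }'
}

# Usage on your local machine, with the tunnel running:
#   resp=$(curl -s http://localhost:11434/api/generate \
#     -d '{"model":"qwen2.5:7b-instruct-q4_k_m","prompt":"def fibonacci(","stream":false}')
#   tok_per_sec "$(echo "$resp" | json_int eval_count)" \
#               "$(echo "$resp" | json_int eval_duration)"
```

This measures generation speed on the server only; the first-token latency you see in VS Code also includes the network round trip through the tunnel.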
Cost Comparison
| Tool | Monthly | Annual | Privacy |
|---|---|---|---|
| GitHub Copilot | $10 | $120 | Code sent to Microsoft |
| Cursor Pro | $20 | $240 | Code sent to Cursor |
| Hetzner + Qwen | €5.89 | €70.68 | Stays on your server |
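The annual figures in the table are just the monthly price times twelve. The arithmetic can be reproduced with the short sketch below; `annual` and `annual_saving` are hypothetical helper names, and the USD comparison relies on the article's approximate €5.89 ≈ $6.40 conversion, which will drift with the exchange rate:

```shell
# annual MONTHLY_PRICE: yearly cost, two decimals
annual() {
  awk -v m="$1" 'BEGIN { printf "%.2f\n", m * 12 }'
}

# annual_saving PRICE_A PRICE_B: yearly difference, same currency for both
annual_saving() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (a - b) * 12 }'
}

annual 5.89             # Hetzner in EUR        -> 70.68
annual 10               # Copilot in USD        -> 120.00
annual_saving 10 6.40   # Copilot vs Hetzner in USD, at ~$6.40/mo -> 43.20
```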
Troubleshooting
"Connection refused on port 11434" — Check the tunnel is running: ps aux | grep ssh. Restart with: ssh -N hetzner-ollama &
"Out of memory" errors — Upgrade to CX32 (€6.80, 8 GB RAM) or switch to the 3B model: ollama pull qwen2.5:3b-instruct-q4_k_m
"Completions very slow (>5s)" — Run top on the VPS. If CPU is below 80%, check network latency: ping YOUR_IP. Should be under 50ms.