Setup guide 10 min read · Updated April 2026

I Replaced GitHub Copilot with a €6/mo VPS. Here's How.

Qwen 2.5 Coder 7B on Hetzner + Continue.dev in VS Code. Private code completion, no rate limits, zero Microsoft servers. Full step-by-step walkthrough.

GitHub Copilot costs $10–19/month, sends your code to Microsoft, and has rate limits. For €5.89/month, a Hetzner CX22 gives you an always-on Qwen 2.5 Coder 7B that you use from VS Code via Continue.dev: private, with no rate limits, and with the first token arriving in roughly 200–300 ms.

What You'll Build

Your VS Code (Continue.dev)
    ↓ SSH tunnel (port 11434)
Hetzner CX22 — €5.89/mo
    ↓
Ollama + Qwen 2.5 Coder 7B Q4
    ↓
8–12 tok/s, inference stays on your own server

Monthly cost: €5.89 (~$6.40) instead of $10 for Copilot. There is no upfront cost to recover, so you save about $3.60 per month starting with the first bill.
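The comparison is plain arithmetic, so it is easy to redo with your own prices. A minimal sketch, assuming the article's figures ($10/month for Copilot, about $6.40/month for the VPS at the author's exchange rate); the function name savings is made up for illustration:

```shell
# Print the money saved over a number of months, given two monthly prices
# in the same currency. The prices you pass in are your own assumptions.
savings() {
  awk -v a="$1" -v b="$2" -v m="$3" 'BEGIN { printf "%.2f\n", (a - b) * m }'
}

# savings 10.00 6.40 1     # one month
# savings 10.00 6.40 12    # one year
```

With the figures above, one month comes out at 3.60 and one year at 43.20.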

Step 1: Create Hetzner Account (5 min)

  1. Go to hetzner.com/cloud — sign up with email + add payment
  2. Create a project named "AI Coder"
  3. Create a server: Ubuntu 24.04, type CX22 (€5.89/mo), choose nearest datacenter
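  4. Create an SSH key pair on your own machine (for example: ssh-keygen -t ed25519 -f ~/.ssh/hetzner-key.pem) and paste the public key, ~/.ssh/hetzner-key.pem.pub, into the SSH key field when creating the server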
  5. Note your server IP address

Step 2: SSH In and Update (2 min)

chmod 600 ~/.ssh/hetzner-key.pem
ssh -i ~/.ssh/hetzner-key.pem root@YOUR_IP
apt update && apt upgrade -y

Step 3: Install Ollama (2 min)

curl -fsSL https://ollama.com/install.sh | sh
systemctl enable --now ollama

# Verify
curl http://localhost:11434/api/tags
# → {"models":[]}
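If you script the install, the API may not answer the instant the service starts. A minimal sketch of a readiness check, assuming Ollama's default address; the function name wait_for_ollama and its retry defaults are illustrative, not part of Ollama:

```shell
# Poll the Ollama API until it answers, or give up after a number of tries.
# Arguments (all optional): URL, number of tries, delay in seconds between tries.
wait_for_ollama() {
  local url="${1:-http://localhost:11434/api/tags}" tries="${2:-30}" delay="${3:-1}" i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# wait_for_ollama && echo "Ollama is up"
```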

Step 4: Pull Qwen 2.5 Coder 7B (3 min)

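# Note: this tag is the general-purpose Qwen 2.5 instruct build. The code-tuned
# variant is published on Ollama as qwen2.5-coder:7b; if you pull that instead,
# use the same name in the Continue config in Step 8.
ollama pull qwen2.5:7b-instruct-q4_k_m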

# Verify
ollama list
# NAME                              SIZE
# qwen2.5:7b-instruct-q4_k_m       4.9 GB
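The model file is about 4.9 GB and has to fit in RAM, so it is worth comparing that against the server's memory before relying on it. A rough sketch, assuming a Linux /proc/meminfo and a 1 GB headroom for the OS (both the function name model_fits and the headroom figure are assumptions, and swap is ignored):

```shell
# Read /proc/meminfo-style text on stdin and print "fits" or "too small".
# $1 = model size in MB, $2 = assumed headroom in MB (default 1024).
model_fits() {
  local model_mb="$1" headroom_mb="${2:-1024}" total_kb
  total_kb=$(awk '/^MemTotal:/ { print $2 }')
  if [ $(( ${total_kb:-0} / 1024 )) -ge $(( model_mb + headroom_mb )) ]; then
    echo "fits"
  else
    echo "too small"
  fi
}

# On the VPS:
# model_fits 4900 < /proc/meminfo
```

If this prints "too small", the out-of-memory advice in Troubleshooting applies.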

Step 5: Lock Down the Firewall (2 min)

Never expose Ollama directly to the internet — it has no auth. Use SSH tunnelling instead.

apt install ufw -y
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp    # SSH only — don't skip this!
# Do NOT open 11434 publicly
ufw enable
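The firewall is one layer; it also helps to confirm that Ollama itself only listens on the loopback interface, which is its default. A small sketch that reads ss -ltn output and checks the bind address; the function name loopback_only is made up, and the parsing assumes the usual ss column layout (local address in column 4):

```shell
# Read `ss -ltn` output on stdin. Succeed only if something listens on the
# given port and every such listener is bound to 127.0.0.1 or [::1].
loopback_only() {
  local port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" {
      n = split($4, parts, ":")            # column 4 is address:port
      if (parts[n] != port) next           # some other service
      addr = substr($4, 1, length($4) - length(parts[n]) - 1)
      seen = 1
      if (addr != "127.0.0.1" && addr != "[::1]") bad = 1
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# On the VPS:
# ss -ltn | loopback_only 11434 && echo "Ollama is loopback-only"
```

A failure means either nothing is listening on the port or it is bound to a public address such as 0.0.0.0.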

Step 6: SSH Tunnel from Your Machine (1 min)

On your local machine (not the VPS):

# Forward localhost:11434 → VPS:11434
ssh -i ~/.ssh/hetzner-key.pem -N -L 11434:localhost:11434 root@YOUR_IP &

# Test it works locally
curl http://localhost:11434/api/tags   # Should return model list

To avoid retyping the tunnel options every time, add a host entry to ~/.ssh/config:

Host hetzner-ollama
    HostName YOUR_IP
    User root
    IdentityFile ~/.ssh/hetzner-key.pem
    LocalForward 11434 localhost:11434
    ServerAliveInterval 60

Then just run: ssh -N hetzner-ollama &
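A backgrounded ssh still dies when the connection drops or your laptop sleeps. One way to keep it alive is a small restart loop. This is only a sketch: keep_tunnel and the SSH_BIN override are hypothetical names, and it assumes the hetzner-ollama host entry from the config above.

```shell
# Re-run the tunnel whenever ssh exits.
# $1 = host alias, $2 = max attempts (0 = retry forever), $3 = delay in seconds.
# SSH_BIN is an illustrative override so the loop can be tried without a real server.
keep_tunnel() {
  local host="$1" max="${2:-0}" delay="${3:-5}" attempts=0
  while :; do
    # ExitOnForwardFailure makes ssh exit if port 11434 cannot be forwarded
    "${SSH_BIN:-ssh}" -N -o ExitOnForwardFailure=yes "$host"
    attempts=$((attempts + 1))
    if [ "$max" -gt 0 ] && [ "$attempts" -ge "$max" ]; then
      return 1
    fi
    sleep "$delay"
  done
}

# keep_tunnel hetzner-ollama &
```

Tools such as autossh or a systemd user unit do the same job more robustly; the loop is just the smallest thing that works.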

Step 7: Install Continue.dev in VS Code

  1. Extensions panel → search "Continue"
  2. Install Continue: Code Autocompletion & Chat
  3. Reload VS Code

Step 8: Configure Continue.dev

Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) → "Continue: Open Config", then replace the contents with:

{
  "models": [
    {
      "title": "Qwen 2.5 Coder 7B (Local)",
      "provider": "ollama",
      "model": "qwen2.5:7b-instruct-q4_k_m",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen 2.5 Coder 7B",
    "provider": "ollama",
    "model": "qwen2.5:7b-instruct-q4_k_m",
    "apiBase": "http://localhost:11434",
    "completionOptions": {
      "maxTokens": 32,
      "temperature": 0.1
    }
  }
}
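The "model" value in the config has to match a model Ollama actually serves, and a typo there tends to fail silently. A quick sketch for checking it through the tunnel, using the /api/tags endpoint from Step 3; the function name model_listed is made up, and the case-insensitive match is an assumption to tolerate tag-case differences:

```shell
# Read the /api/tags JSON on stdin and succeed if the given model name is listed.
model_listed() {
  local model="$1"
  # Strip whitespace so the match does not depend on JSON formatting
  tr -d ' \n' | grep -qiF "\"name\":\"${model}\""
}

# With the tunnel running:
# curl -s http://localhost:11434/api/tags | model_listed qwen2.5:7b-instruct-q4_k_m \
#   && echo "model is available"
```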

Step 9: Test It

  1. Open any code file in VS Code
  2. Type def fibonacci( and pause for about a second; an inline suggestion should appear, and Tab accepts it
  3. Continue will stream a completion from your Hetzner VPS
  4. Expected: first token in 200–300ms, then 8–12 tok/s streaming
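To check the throughput figure yourself, you can use the timing fields Ollama returns from a non-streaming /api/generate call: eval_count (tokens generated) and eval_duration (nanoseconds). A rough sketch without jq; the helper names json_number and tokens_per_second are made up, and the grep-based parsing assumes compact JSON:

```shell
# Print the first integer value of the named field from JSON on stdin.
json_number() {
  grep -o "\"$1\":[0-9]*" | head -n 1 | cut -d: -f2
}

# tokens/s = eval_count / eval_duration * 1e9 (eval_duration is in nanoseconds)
tokens_per_second() {
  awk -v n="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", n / ns * 1e9 }'
}

# With the tunnel running:
# resp=$(curl -s http://localhost:11434/api/generate \
#   -d '{"model":"qwen2.5:7b-instruct-q4_k_m","prompt":"def fibonacci(","stream":false}')
# tokens_per_second "$(printf '%s' "$resp" | json_number eval_count)" \
#                   "$(printf '%s' "$resp" | json_number eval_duration)"
```

For example, 100 tokens generated in 10 seconds (10000000000 ns) gives 10.0 tok/s.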

Cost Comparison

Tool              Monthly   Annual    Privacy
GitHub Copilot    $10       $120      Code sent to Microsoft
Cursor Pro        $20       $240      Code sent to Cursor
Hetzner + Qwen    €5.89     €70.68    Stays on your server

Troubleshooting

"Connection refused on port 11434": The tunnel is probably down. Check that it is running with ps aux | grep ssh, and restart it with: ssh -N hetzner-ollama &

"Out of memory" errors: The Q4 model file alone is about 4.9 GB, and it has to fit in RAM alongside the OS. Upgrade to CX32 (€6.80/mo, 8 GB RAM) or switch to the 3B model: ollama pull qwen2.5:3b-instruct-q4_k_m

"Completions very slow (>5s)": Run top on the VPS. If CPU usage is below 80%, inference is probably not the bottleneck, so check network latency with ping YOUR_IP. Round-trip time should be under 50 ms.