Affiliate links on this page — rates explained
Best global coverage

Vultr

32 global locations. $100 trial credit. GPU options built in.

Starting at: $12/mo
Locations: 32 locations across 6 continents
SLA: 100% network SLA
🎁 $100 credit

We earn $35 for each new signup through our link; the new user gets $100 in credit.

Prices shown are community-verified as of April 2026. Click the provider link above to confirm current rates — pricing changes without notice.

Plan                               RAM         CPU / GPU   Price     Ollama use
Cloud Compute 2GB                  2GB         1 vCPU      $12/mo    3B models only
Cloud Compute 4GB (Recommended)    4GB         2 vCPU      $24/mo    7B models (Q4_K_M)
Cloud Compute 8GB                  8GB         4 vCPU      $48/mo    14B models (Q4)
High Performance 16GB              16GB        6 vCPU      $80/mo    34B models (Q4)
Cloud GPU A16 (16GB VRAM)          16GB VRAM   8 vCPU      ~$90/mo   7B FP16, GPU-accelerated

Community benchmarks

Measured by community members running real workloads. Numbers are tokens/second on the listed model.

Speed      Model                     Plan                   Note
20 tok/s   Qwen 2.5 7B (Q4_K_M)      Cloud Compute 4GB      NVMe storage, fast model loading.
26 tok/s   Llama 3.2 8B (Q4_K_M)     Cloud Compute 8GB      Good CPU throughput.
45 tok/s   Llama 3.3 70B (Q4_K_M)    Cloud GPU A16 (16GB)   GPU changes everything at 70B.
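
To compare your own instance against the tokens-per-second figures above, you can derive the same number from Ollama's timing data. This is a minimal sketch, assuming the non-streaming /api/generate response fields eval_count (tokens generated) and eval_duration (nanoseconds); the helper name tok_per_sec is hypothetical and not part of Ollama.

```shell
# Hypothetical helper: convert Ollama timing fields to tokens/second.
# $1 = eval_count (tokens generated), $2 = eval_duration in nanoseconds.
tok_per_sec() {
  awk -v n="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", n / (ns / 1000000000) }'
}

# Example: 200 tokens generated in 10 seconds.
tok_per_sec 200 10000000000   # prints 20.0
```

For a quick interactive check, running a model with ollama run MODEL --verbose prints an eval rate after each response, so the helper is mainly useful when scripting against the API.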

Detailed pros & cons

What's good

✓ 32 locations, the most of the providers compared here

Need servers in Tokyo, São Paulo, Seoul, or Johannesburg? Vultr covers all four. Hetzner and DigitalOcean don't.

✓ $35 clean cash affiliate payout

Hetzner pays credits; Vultr pays $35 real cash per referral. New users get $100 in trial credit.

✓ GPU instances in the same dashboard

NVIDIA A16 and A40 GPUs available without switching providers. Not the cheapest GPU option, but the simplest to manage.

✓ Bare metal options

Full dedicated servers from ~$120/mo for workloads that need consistent CPU performance without hypervisor overhead.

Watch out for

✗ Pricier than Hetzner for EU/US users

The 4GB plan is $24/mo versus €5.89 on Hetzner. If you don't need a specific region, Hetzner is significantly cheaper.

✗ GPU instances more expensive than RunPod

Vultr's A16 is a fixed ~$90/mo, while RunPod bills on demand at ~$0.34/hr. For intermittent GPU use, RunPod is cheaper.
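
How intermittent is "intermittent"? A rough break-even can be worked out from the two approximate prices quoted above. This is a sketch only: it treats ~$90/mo and ~$0.34/hr as exact and ignores storage or idle charges either provider may add. The helper name break_even_hours is hypothetical.

```shell
# Hours per month at which hourly billing costs as much as a fixed monthly plan.
# $1 = fixed monthly price in USD, $2 = on-demand price in USD per hour.
break_even_hours() {
  awk -v m="$1" -v h="$2" 'BEGIN { printf "%.0f\n", m / h }'
}

break_even_hours 90 0.34   # prints 265
```

At these rates the fixed plan only wins past about 265 GPU-hours a month, which is just under 9 hours a day over a 30-day month.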

Best for

Users in regions not served by Hetzner (Asia-Pacific, South America, Africa). Also good for light GPU workloads without switching platforms.

Not for

Pure budget builds (Hetzner) or heavy/on-demand GPU inference (RunPod).

Ollama setup guide for Vultr

Step-by-step from account creation to first model response. Takes 15–20 minutes.

1. Create your account

Sign up via our link for $100 in free credit. Vultr credits don't expire, unlike DigitalOcean's, which lapse after 60 days.

2. Choose your location

Vultr has 32 locations including Tokyo, São Paulo, Seoul, Johannesburg, and Melbourne. Pick the one closest to your primary users.

3. Deploy a Cloud Compute instance

Cloud Compute → Ubuntu 22.04 LTS → 4GB plan. Enable auto-backups if running production workloads (+20% cost, worth it).

4. Connect via SSH

Vultr emails you the root password. SSH in:
ssh root@YOUR_IP

Change the password immediately:
passwd

5. Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

Verify: systemctl status ollama
Should show 'active (running)'.

6. Configure the Vultr firewall

Vultr Console → Firewall → Add Firewall Group → Allow TCP 11434 from your IP → Assign to instance.
This is separate from OS-level iptables.
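
Opening port 11434 in the Vultr firewall is only half of remote access: by default Ollama listens on 127.0.0.1, so the service also has to bind to a reachable interface. The sketch below assumes the systemd service created by the install script and Ollama's OLLAMA_HOST environment variable; write_ollama_override is a hypothetical helper name, and the default path is the drop-in location that systemctl edit ollama.service would use.

```shell
# Hypothetical helper: write a systemd drop-in that makes Ollama listen on all interfaces.
# $1 = drop-in directory (defaults to the standard location for ollama.service overrides).
write_ollama_override() {
  dir="${1:-/etc/systemd/system/ollama.service.d}"
  mkdir -p "$dir" &&
    printf '%s\n' '[Service]' 'Environment="OLLAMA_HOST=0.0.0.0:11434"' > "$dir/override.conf"
}

# On the server, as root:
#   write_ollama_override && systemctl daemon-reload && systemctl restart ollama
```

Ollama's API has no built-in authentication, so keep the firewall rule limited to your own IP once the service listens on 0.0.0.0.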

7. Pull models and test

ollama pull qwen2.5:7b-instruct-q4_K_M
ollama run qwen2.5:7b-instruct-q4_K_M

Type a message at the prompt to confirm the model responds. On GPU instances, Ollama detects the GPU automatically; no manual driver setup is needed.
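
Beyond the interactive prompt, the same check can be scripted against the HTTP API. This is a minimal sketch, assuming Ollama's default port 11434 and its /api/generate endpoint with "stream": false; generate_payload and ollama_generate are hypothetical helper names, and the prompt is inserted without JSON escaping, so keep it free of quotes and backslashes.

```shell
# Hypothetical helper: build a non-streaming /api/generate request body.
# $1 = model tag, $2 = prompt (plain text; no JSON escaping is performed).
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}\n' "$1" "$2"
}

# Hypothetical helper: send the prompt to an Ollama server and print the raw JSON reply.
# $1 = host, $2 = model tag, $3 = prompt.
ollama_generate() {
  curl -fsS "http://${1:-127.0.0.1}:11434/api/generate" \
    -d "$(generate_payload "$2" "$3")"
}

# On the server:
#   ollama_generate 127.0.0.1 qwen2.5:7b-instruct-q4_K_M "Say hello in five words"
```

Run it on the server first with 127.0.0.1, then from your own machine with the instance's IP to confirm the firewall rule works.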

8. Optional: GPU instance

For GPU acceleration: deploy Cloud GPU → select NVIDIA A16 → use the Ollama One-Click App template. Model pulls work identically; inference is 5–10x faster.

Community setups

Tokyo 4GB + Qwen 2.5 7B
20 tok/s
"Only provider with good Tokyo latency for my Japan-based users. 30ms from Tokyo. Worth the premium."
@asia_dev

Ready to get started?

New accounts get $100 credit. Enough to test your full setup risk-free.
