Vultr
32 global locations. $100 trial credit. GPU options built in.
Disclosure: we earn $35 for each new signup made through our link, and the new user gets $100 in credit.
Plans & pricing
Verify current prices on Vultr
Prices shown are community-verified as of April 2026. Use the provider link above to confirm current rates; pricing changes without notice.
| Plan | RAM | CPU / GPU | Price (per month) | Ollama use |
|---|---|---|---|---|
| Cloud Compute 2GB | 2GB | 1 vCPU | $12 | 3B models only |
| Cloud Compute 4GB (Recommended) | 4GB | 2 vCPU | $24 | 7B models (Q4_K_M) |
| Cloud Compute 8GB | 8GB | 4 vCPU | $48 | 14B models (Q4) |
| High Performance 16GB | 16GB | 6 vCPU | $80 | 34B models (Q4) |
| Cloud GPU A16 (16GB VRAM) | 16GB VRAM | 8 vCPU | ~$90 | 7B FP16 GPU-accelerated |
Community benchmarks
Measured by community members running real workloads. Numbers are tokens/second on the listed model.
Detailed pros & cons
What's good
Need servers in Tokyo, São Paulo, Seoul, or Johannesburg? Of the providers compared here, only Vultr covers these regions. Hetzner and DigitalOcean don't.
Hetzner pays credits; Vultr pays $35 real cash per referral. New users get $100 in trial credit.
NVIDIA A16 and A40 GPUs available without switching providers. Not the cheapest GPU option, but the simplest to manage.
Full dedicated servers from ~$120/mo for workloads that need consistent CPU performance without hypervisor overhead.
Watch out for
The 4GB plan costs $24/mo versus €5.89/mo at Hetzner. If you don't need a specific region, Hetzner is significantly cheaper.
Vultr's A16 is a fixed cost of ~$90/mo, while RunPod charges ~$0.34/hr on demand. For intermittent GPU use, RunPod is cheaper.
Best for: users in regions not served by Hetzner (Asia-Pacific, South America, Africa). Also good for light GPU workloads without switching platforms.
Not ideal for: pure budget builds (use Hetzner) or heavy/on-demand GPU inference (use RunPod).
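To make the fixed-versus-on-demand GPU comparison concrete, here is a minimal sketch of the break-even arithmetic. It uses only the two approximate prices quoted above; the function names are illustrative, and it compares price only, not GPU performance.

```python
# Approximate figures quoted in the comparison above.
VULTR_A16_MONTHLY_USD = 90.0  # fixed monthly price
RUNPOD_HOURLY_USD = 0.34      # on-demand hourly price


def break_even_hours(monthly_fixed: float, hourly_rate: float) -> float:
    """Hours per month at which on-demand spend equals the fixed price."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return monthly_fixed / hourly_rate


def cheaper_option(hours_per_month: float) -> str:
    """Return which pricing model costs less for the given monthly usage."""
    on_demand_cost = hours_per_month * RUNPOD_HOURLY_USD
    return "on-demand" if on_demand_cost < VULTR_A16_MONTHLY_USD else "fixed"


hours = break_even_hours(VULTR_A16_MONTHLY_USD, RUNPOD_HOURLY_USD)
print(f"Break-even: about {hours:.0f} GPU-hours per month")
```

With these numbers the break-even lands at roughly 265 GPU-hours per month, a little under 9 hours a day. Below that, on-demand pricing wins; above it, the fixed monthly instance does.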
Ollama setup guide for Vultr
Step-by-step from account creation to first model response. Takes 15–20 minutes.
Create your account
Sign up via our link for $100 free credit. Vultr credits don't expire — unlike DigitalOcean's 60-day limit.
Choose your location
Vultr has 32 locations including Tokyo, São Paulo, Seoul, Johannesburg, and Melbourne. Pick the one closest to your primary users.
Deploy a Cloud Compute instance
Cloud Compute → Ubuntu 22.04 LTS → 4GB plan. Enable auto-backups if running production workloads (+20% cost, worth it).
Connect via SSH
Vultr emails you the root password. SSH in:
ssh root@YOUR_IP
Change the password immediately:
passwd
Install Ollama
Run the install script:
curl -fsSL https://ollama.com/install.sh | sh
Verify the service:
systemctl status ollama
It should show 'active (running)'.
Configure the Vultr firewall
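Vultr Console → Firewall → Add Firewall Group → Allow TCP 11434 from your IP → Assign to instance. This firewall is separate from OS-level iptables rules on the instance. Ollama listens only on 127.0.0.1 by default, so remote clients also need the OLLAMA_HOST environment variable set for the service (for example, to 0.0.0.0).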
Pull models and test
Pull the model, then start an interactive session:
ollama pull qwen2.5:7b-instruct-q4_K_M
ollama run qwen2.5:7b-instruct-q4_K_M
Test with a message. For GPU instances: GPU is auto-detected, no drivers needed.
Optional: GPU instance
For GPU acceleration: deploy Cloud GPU → select NVIDIA A16 → use the Ollama One-Click App template. Model pull works identically, inference is 5–10x faster.
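Once a model is pulled, the same test can be run over Ollama's HTTP API instead of the interactive prompt. The sketch below uses only the Python standard library and Ollama's `/api/generate` endpoint on port 11434. The helper names `build_generate_request` and `ask` are illustrative, and the host value is whatever address you actually reach the server on.

```python
import json
import urllib.request

OLLAMA_PORT = 11434  # the port opened in the firewall step


def build_generate_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"http://{host}:{OLLAMA_PORT}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

Run on the server itself, `ask("127.0.0.1", "qwen2.5:7b-instruct-q4_K_M", "Say hello")` should return a short reply. From another machine, replace the host with the instance IP; this only works if the firewall rule allows your address and Ollama has been configured to listen on a non-loopback interface.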
Community setups
"Only provider with good Tokyo latency for my Japan-based users. 30ms from Tokyo. Worth the premium."
Ready to get started?
New accounts get $100 credit. Enough to test your full setup risk-free.
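As a rough check on how far that credit goes, the sketch below divides the $100 credit by the monthly price of the recommended 4GB plan, with and without the +20% auto-backup surcharge mentioned in the setup guide. It assumes the credit is spent only on that one instance and, as the guide states, does not expire. The function name is illustrative.

```python
TRIAL_CREDIT_USD = 100.0
PLAN_4GB_MONTHLY_USD = 24.0  # recommended plan from the pricing table
BACKUP_SURCHARGE = 0.20      # auto-backups add 20% to the plan price


def months_of_runway(credit: float, monthly_price: float, backups: bool = False) -> float:
    """Months the credit covers for a single instance at the given price."""
    cost = monthly_price * (1 + BACKUP_SURCHARGE) if backups else monthly_price
    return credit / cost


without_backups = months_of_runway(TRIAL_CREDIT_USD, PLAN_4GB_MONTHLY_USD)
with_backups = months_of_runway(TRIAL_CREDIT_USD, PLAN_4GB_MONTHLY_USD, backups=True)
print(f"Without backups: {without_backups:.1f} months")
print(f"With backups:    {with_backups:.1f} months")
```

On these figures the credit covers about 4.2 months without backups and about 3.5 months with them.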