LLMs on a home gaming PC

This page gives concrete recommendations for local LLMs per common RTX GPU. The goal is to help you run practical models on consumer hardware with predictable performance.

Recommended local stack (simple and reliable)

  1. Runtime: Ollama for quick setup, or LM Studio for GUI workflows.
  2. Inference backend: llama.cpp / GGUF for easiest compatibility.
  3. Quantization: start with Q4_K_M, move to Q5_K_M if VRAM allows.
  4. Context: begin at 8k, then increase only when needed.
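Once Ollama is installed and serving a model, the steps above boil down to a single request against its local HTTP API. The sketch below uses only the Python standard library and assumes a default Ollama install listening on port 11434; the model name is illustrative and must match something you have pulled:

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def make_generate_payload(model: str, prompt: str, num_ctx: int = 8192) -> dict:
    """Build a request body for Ollama's /api/generate endpoint.

    stream=False asks for the full answer in one JSON response;
    options.num_ctx sets the context window (8k, per step 4 above).
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }

def generate(model: str, prompt: str) -> str:
    payload = json.dumps(make_generate_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires Ollama running and a pulled model, e.g. `ollama pull llama3.1:8b`):
# print(generate("llama3.1:8b", "Explain GGUF quantization in one sentence."))
```

Keeping `stream=False` is the simplest starting point; switch to streaming once you want tokens to appear as they are generated.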

GPU-to-model recommendations

| Typical GPU | VRAM | Best local model size | Concrete models to start with | Expected experience |
|---|---|---|---|---|
| RTX 3050 / 3060 Laptop | 4-6 GB | 3B-7B (Q4) | Llama 3.2 3B Instruct, Phi-3 Mini, Qwen2.5 3B/7B | Good for chat, summaries, coding hints. Limited long-context quality. |
| RTX 3060 12GB / RTX 4060 8GB | 8-12 GB | 7B-8B (Q4/Q5) | Llama 3.1 8B Instruct, Qwen2.5 7B Instruct, Mistral 7B Instruct | Strong daily local assistant performance for text and code. |
| RTX 4060 Ti 16GB / RTX 4070 12GB | 12-16 GB | 8B-14B (Q4) | Qwen2.5 14B Instruct (Q4), DeepSeek Coder V2 Instruct, Llama 3.1 8B (Q5) | Noticeably better reasoning/coding while still responsive. |
| RTX 4070 Ti Super / 4080 | 16 GB | 14B-32B (Q4, selective) | Qwen2.5 14B/32B, DeepSeek R1 Distill (32B), Mixtral 8x7B (heavier) | High-quality local output for serious coding and analysis tasks. |
| RTX 5070 / 5070 Ti | 12-16 GB (varies by model) | 14B-32B (Q4) | Qwen2.5 14B/32B, DeepSeek R1 Distill (32B), Mixtral 8x7B | Excellent modern sweet spot for gaming plus serious local coding/reasoning. |
| RTX 5080 | 16 GB (typical) | 32B class (Q4/Q5) | Qwen2.5 32B Instruct, DeepSeek R1 Distill (32B), DeepSeek Coder V2 Instruct | Very strong single-GPU local quality while remaining gaming focused. |
| RTX 4090 | 24 GB | 32B class (Q4/Q5), some 70B via aggressive quantization | Qwen2.5 32B Instruct, DeepSeek R1 Distill (32B), Llama 3.1 70B (very quantized) | Best single-GPU consumer setup for local quality today. |
| RTX 5090 | 32 GB | 32B class comfortably; quantized 70B is practical | Qwen2.5 32B Instruct, Llama 3.1 70B Instruct, DeepSeek R1 Distill (32B) | Top consumer option for local LLM throughput and headroom. |

All of the models above are hosted on Hugging Face, where you can download the weights (typically as GGUF files) or find compatible runtimes.
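Before downloading a model, you can sanity-check whether it will fit in VRAM from its parameter count and quantization level. The figures below are approximations I am assuming, not exact numbers: roughly 4.5 bits/weight for Q4_K_M (about 5.5 for Q5_K_M) plus a fixed allowance for KV cache and runtime buffers at 8k context:

```python
def estimate_vram_gb(params_billions: float,
                     bits_per_weight: float = 4.5,
                     overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate for a quantized model.

    bits_per_weight ~4.5 approximates Q4_K_M; use ~5.5 for Q5_K_M.
    overhead_gb is a ballpark for KV cache and buffers at 8k context.
    """
    weights_gb = params_billions * bits_per_weight / 8  # 1B params ≈ 0.56 GB at Q4
    return weights_gb + overhead_gb

# A 7B model at Q4_K_M lands around 5.4 GB, matching the 8 GB rows above;
# a 32B model lands around 19.5 GB, which is why it wants a 24 GB card.
for n in (3, 7, 14, 32, 70):
    print(f"{n}B @ Q4_K_M: ~{estimate_vram_gb(n):.1f} GB")
```

The estimates line up with the table: 7B-8B fits 8 GB cards, 32B class needs 24 GB, and 70B only becomes realistic with more aggressive quantization or partial CPU offload.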

Concrete rig recommendations

Balanced standard rig

- GPU: RTX 4070 Super / 4070 Ti Super
- RAM: 64 GB DDR5
- Storage: 2 TB NVMe SSD
- CPU: Ryzen 7 7800X3D or Core i7 class
- Why: great gaming plus a strong 8B/14B local LLM experience.

Powerful local LLM rig

- GPU: RTX 4090 24GB
- RAM: 96-128 GB
- Storage: 2-4 TB NVMe SSD
- CPU: Ryzen 9 / Core i9 class
- Why: best single-GPU path to 32B local models at usable speed.

Quick buying rule

If your goal is mostly gaming and some local LLM work, target at least 12 GB VRAM. If your goal is serious local reasoning/coding quality, aim for 16-24 GB VRAM.
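The buying rule and the table above can be condensed into a small lookup. The thresholds mirror the VRAM tiers in the table; the tier strings are illustrative labels, not official classes:

```python
def model_class_for(vram_gb: float) -> str:
    """Map available VRAM to the model-size tier from the table above."""
    if vram_gb >= 24:
        return "32B (Q4/Q5), 70B heavily quantized"
    if vram_gb >= 16:
        return "14B-32B (Q4)"
    if vram_gb >= 12:
        return "8B-14B (Q4)"
    if vram_gb >= 8:
        return "7B-8B (Q4/Q5)"
    return "3B-7B (Q4)"

print(model_class_for(12))  # RTX 3060 12GB tier → "8B-14B (Q4)"
```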