Deployment
Deploy Ollama with Docker
Run Ollama in a GPU-enabled container with persistent model storage and REST access.
OllamaDockerself-hostinglocal LLMGPU
Deploy Ollama with Docker
# 1. Create a model cache directory
mkdir -p ~/.ollama
# 2. Run the container with GPU support
docker run -d \
--name ollama \
--gpus all \
-p 11434:11434 \
-v ~/.ollama:/root/.ollama \
ollama/ollama
# 3. Pull a model
export OLLAMA_HOST=http://localhost:11434
ollama pull llama3.1:8b
# 4. Test
ollama run llama3.1:8b
For CPU-only hosts, drop --gpus all. Add restart policy and health checks for production.