A native Windows chat app for running local models. Multimodal, streaming, agentic, and entirely offline.

Two backends
LlamaBoss rides on top of Ollama. LlamaBoss Pro talks to llama.cpp directly.
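The source doesn't show LlamaBoss's wire protocol, but a chat app sitting on top of Ollama would speak Ollama's documented HTTP API, which streams one JSON object per line from /api/chat on the default local port. A minimal sketch (the model name and helper functions here are illustrative, not from the source):

```python
import json

def build_chat_request(model, user_text):
    """Build the JSON body for a streaming Ollama /api/chat call
    (default server endpoint is http://localhost:11434/api/chat)."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": True,  # Ollama streams one JSON object per line
    })

def collect_stream(lines):
    """Join the content chunks from a streamed Ollama chat response."""
    parts = []
    for line in lines:
        obj = json.loads(line)
        parts.append(obj.get("message", {}).get("content", ""))
        if obj.get("done"):  # final object carries "done": true
            break
    return "".join(parts)

body = build_chat_request("llama3", "Hello")
sample = [
    '{"message": {"content": "Hi"}, "done": false}',
    '{"message": {"content": " there"}, "done": true}',
]
print(collect_stream(sample))  # → Hi there
```

Streaming line-by-line like this is what lets a desktop UI render tokens as they arrive instead of waiting for the full reply.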
Drop in images and text files. Chat about screenshots or paste right from the clipboard.
Filesystem, shell, and workspace access — with confirmation gates so nothing runs by surprise.
Runs entirely on your machine. No cloud, no telemetry, no account.
Windows installer, MIT licensed. Open the installer, click through, and start chatting with your local models.
Coming soon: LlamaBoss Pro, a direct llama.cpp backend with CUDA auto-detection and no Ollama dependency. Currently in active development.