oMNI Chat — Local-first AI chat for open models

Why oMNI Chat

Open models are ready
Local and open-weight models are already usable for drafting, coding, analysis, and everyday conversation.
Streamlined, not stripped down
oMNI Chat removes the friction of self-hosting — one interface for models scattered across your machines.
Any OpenAI-compatible backend
Works with Ollama, LM Studio, oMLX, llama.cpp, vLLM, LiteLLM, and any provider that speaks the OpenAI API.

Built for how you actually run AI

Local-first, privacy-aware

Your data stays under your control. No cloud-first lock-in, no unnecessary exposure.

Your AI, your endpoint

Connect OpenAI-compatible endpoints and choose the model backend that fits your setup.
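
Before pointing oMNI Chat at a backend, it can help to confirm that the backend really speaks the OpenAI API by listing the models it exposes. The sketch below assumes curl and jq are installed. The helper names models_url and list_models are hypothetical and not part of oMNI Chat.

```shell
# Hypothetical helpers (not part of oMNI Chat) for probing an
# OpenAI-compatible backend.

# Build the model-listing URL from a base URL such as http://host:port/v1.
models_url() {
  base="${1%/}"                 # drop one trailing slash, if present
  printf '%s/models\n' "$base"
}

# Print one model ID per line. Needs curl and jq, and a reachable backend.
list_models() {
  curl -fsS "$(models_url "$1")" | jq -r '.data[].id'
}
```

For example, `list_models http://localhost:11434/v1` should print the locally available models if Ollama is running with its default settings. LM Studio's local server typically listens on port 1234 instead.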

Many backends, one dropdown

Run different models on several computers and pick any of them from a single model picker.

Everything else you need

Organize, route, search, and share — without leaving your own infrastructure.

Routing

Auto model router

Heuristics — and optionally a lightweight classifier model — pick the best backend for each chat turn.
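
To make the idea of heuristic routing concrete, here is a minimal sketch of a rule-based router. The keyword rules, the 400-character threshold, and the labels coder, large, and small are illustrative assumptions, not oMNI Chat's actual routing logic.

```shell
# Hypothetical heuristic router (illustrative only, not oMNI Chat's rules).
# Prints a backend label for the given prompt text.
route_prompt() {
  prompt=$1
  case $prompt in
    # Prompts that look like source code or an error trace go to a coding model.
    *"def "*|*"func "*|*"#include"*|*"Traceback"*)
      echo "coder" ;;
    *)
      # Otherwise route by length: long prompts go to a larger model.
      if [ "${#prompt}" -gt 400 ]; then
        echo "large"
      else
        echo "small"
      fi ;;
  esac
}
```

A classifier model, where enabled, would presumably replace or refine rules like these. The sketch only shows the cheap first pass that heuristics can provide.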

Sharing

Multi-user support

Share local models with friends and family on a single instance you host.

Organization

Tags & projects

Organize chat history with tags and projects. Jump back to recent conversations through chronological groupings.

Workflow

Profiles

Four built-in profiles for common use cases — brainstorm, draft, analyze, code — plus custom profiles via configuration.

Workflow

Context management

Delete unwanted chat blocks, inline-edit a response to ask for changes, and keep prompts focused.

Capabilities

Built-in web search

Launch with the SearXNG sidecar container and give tool-capable models access to fresh web results.

Capabilities

Live code previews

Rich functional previews for web apps and SVG art generated in chat — sandboxed, interactive, and immediate.

Deploy

Native or Docker

Run from source with Go and Node, or spin up with Docker Compose in a few commands.

…and so much more. Read the full README

Works with your stack

Ollama
LM Studio
oMLX
llama.cpp
vLLM
LiteLLM
+ any OpenAI-compatible API

Get running in minutes

Docker Compose is the fastest path. Clone, configure, and visit localhost.

Standard setup

git clone https://gitlab.com/geekaholic/omni-chat.git
cd omni-chat
# create .env from the template and print it for review (-a appends if .env already exists)
tee -a .env < .env.example

# use --pull always to get latest upstream image
docker compose up --pull always -d

With web search

# one time, set a SEARXNG_SECRET
SECRET=$(openssl rand -hex 32)
sed -i.bak "s/# SEARXNG_SECRET=/SEARXNG_SECRET=$SECRET/" .env

docker compose -f compose.yaml -f compose.search.yaml up --pull always -d

sed -i.bak works with both GNU sed (Linux) and BSD sed (macOS) and leaves a backup in .env.bak. On Linux, plain sed -i skips the backup.
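
If you would rather not depend on how a given sed handles in-place editing, the same change can be made through a temporary file. This is a sketch under the assumption that .env contains the commented placeholder line `# SEARXNG_SECRET=`, as the sed command above implies. The function name set_searxng_secret is hypothetical.

```shell
# Hypothetical helper: uncomment and set SEARXNG_SECRET without sed -i.
# Assumes the secret contains no characters special to sed (hex output
# from `openssl rand -hex 32` is safe). Leaves the file unchanged if the
# placeholder line is absent.
set_searxng_secret() {
  env_file=$1
  secret=$2
  tmp=$(mktemp) || return 1
  if sed "s/^# SEARXNG_SECRET=.*/SEARXNG_SECRET=$secret/" "$env_file" > "$tmp"; then
    # Write back with cat so the original file keeps its permissions.
    cat "$tmp" > "$env_file"
  fi
  rm -f "$tmp"
}
```

Usage would look like `set_searxng_secret .env "$(openssl rand -hex 32)"`, run once before starting the search-enabled stack.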

Open source on GitLab

oMNI Chat is free to use, inspect, and extend. Star the repo, open issues, or contribute.

gitlab.com/geekaholic/omni-chat