ai/legion

No description

Go 95.5%
PLpgSQL 3.6%
Shell 0.9%

Find a file

Wesley 8625cd4dbe All checks were successful deploy / changes (push) Successful in 12s Details deploy / verify (push) Successful in 37s Details deploy / build / legion-gateway (legion-gateway) (push) Successful in 22s Details deploy / build / legion-interface-telegram (legion-interface-telegram) (push) Successful in 23s Details deploy / build / legion-log (legion-log) (push) Successful in 23s Details deploy / build / legion-orchestrator (legion-orchestrator) (push) Successful in 26s Details deploy / build / legion-node (legion-node) (push) Successful in 27s Details deploy / build / legion-providers (legion-providers) (push) Successful in 26s Details deploy / build / legion-seed-tools (legion-seed-tools) (push) Successful in 28s Details deploy / build / legion-tools (legion-tools) (push) Successful in 29s Details deploy / deploy (push) Successful in 2s Details kernel: tool_read + tool_edit + KindToolEdit diff events Closes the rewrite-from-scratch loop on tool iteration. Agents can now fetch a tool's source, edit a window of it, and have the diff show up in the TUI as a git-style change block. - tool_read: spec + recent_edits + source (cat -n style numbered), optional start_line/end_line for windowed reads. Returns total_lines + content_hash so the agent can drive iterative reads + the safety check on line-based edits. - tool_edit: three modes the agent picks via args — STRING ({old_string, new_string}) — surgical patch by unique substring; inserts use an anchor line in old_string. RANGE ({start_line, end_line, new_text, expected_content_hash}) — replace or (new_text="") delete a contiguous chunk. INSERT ({insert_at_line, new_text, expected_content_hash}) — splice before a line; total_lines+1 appends at EOF. expected_content_hash is required in line modes so a drifted source is rejected rather than spliced into the wrong lines. - tool_write now fetches prior source on update so the same edit event fires for full rewrites too. First writes (no prior) skip the diff event — the build_result is enough. - KindToolEdit (new event) carries name, before_hash, after_hash, unified_diff, added, removed, edit_summary. internal/diff wraps go-udiff for the unified output; promoted to direct dep. - System prompt teaches the read-before-edit discipline + when to reach for each mode + how to strip the line-number prefix from tool_read output before splicing.		2026-05-18 08:32:15 +00:00
.forgejo/workflows	interface: telegram — webhook-driven external transport for the fleet	2026-05-17 17:25:54 +00:00
cmd	starter: home_assistant tool — list/turn on/off/set brightness for HA lights	2026-05-17 19:47:41 +00:00
deploy	docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern	2026-05-17 20:37:32 +00:00
docs	docs: sync to the stateless-daemon reality	2026-05-16 18:31:27 +00:00
internal	kernel: tool_read + tool_edit + KindToolEdit diff events	2026-05-18 08:32:15 +00:00
pkg	repo: split legion-cli out (#45 )	2026-05-17 11:09:56 +00:00
.gitignore
CLAUDE.md	docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern	2026-05-17 20:37:32 +00:00
go.mod	kernel: tool_read + tool_edit + KindToolEdit diff events	2026-05-18 08:32:15 +00:00
go.sum	kernel: tool_read + tool_edit + KindToolEdit diff events	2026-05-18 08:32:15 +00:00
README.md	docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern	2026-05-17 20:37:32 +00:00
SECURITY.md	docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern	2026-05-17 20:37:32 +00:00

README.md

Legion

A Go daemon that hosts many autonomous AGI-style agents. Each agent can write its own prompts, memory, Go tools (compiled to per-tool binaries), schedules, watchers, task boards, and sub-agents.

What works today

Daemon (legion-node) runs as a systemd service. Boots agentless and stays healthy when no agents are configured. Restores persisted agents on startup.
CLI / TUI (legion) — Bubble Tea dashboard + per-agent attach view. No args opens the TUI. Subcommands: info, ps, spawn, tell, attach, pause, resume, stop. Dashboard flexes to terminal width with columns for state, live activity verb (thinking / reasoning / → tool_name / idle), resolved model id, heartbeat, live task counts (doing/open/blocked), cron count, last-activity, parent.
Attach-view reasoning panel — when the model is mid-thought, its chain-of-thought streams into a pinned panel above the task list (header ◆ reasoning… (length: 1.2k), body indented under the ⎿ tree connector). Panel grows with the reasoning, capped only by available terminal rows. On stream end the panel commits to the transcript as one markdown-rendered block and clears. Toggle visibility with ^r. Works for both inline model reasoning (GLM, gpt-oss, R1 distills) and the reflect kernel tool's internal LLM call.
Gateway pod (planned) — cmd/legion-gateway will front external traffic in k8s: WebSocket server at wss://legion.edge.capetown, bearer-token auth, brokers NATS pubs/subs on the client's behalf. See docs/nats-transport.md for the full design.
Reasoning loop — full LLM-driven session runner with streaming, parallel tool fan-out, multi-turn iteration, MaxTurns guard (configurable per agent), per-turn + cumulative token tracking. The loop also injects a session-close nudge: if the model is about to stop while owning a task in doing, one synthetic turn forces it to either task_complete or move the task back before the final reply is shown to the user.
Lean tool catalog + global registry — each agent's prompt sees only kernel tools (memory_, task_, cron_, watch_, agent_, reflect, llm_, heartbeat_*) plus two meta-tools (tool_search, tool_invoke). Everything else — http_get, web_fetch, web_search, now, and every tool any agent has authored — lives in a single canonical registry at ~/.legion/tools/<name>/ and stays out of the prompt until an agent explicitly searches for it and invokes it by name. Result: small prompt budget, zero version drift, agents discover capabilities on demand. Tool metadata records author_agent, maintained_by, content_hash, keywords[], recent_edits[] — full design in docs/tool-registry.md.
task_query analytics with multi-preset bundling: pass presets:["stuck_in_doing","overdue","upcoming"] to run all three in one call. Presets: blocked, stuck_in_doing, overdue, upcoming, with_notes_not_done, orphans, longest_chain, all.
Instructive errors throughout — wrong task id lists active candidates, missing required args enumerate valid values, unknown tool names point at tool_search, etc. Designed so small models self-correct on the next turn.
LLM providers — Anthropic (streaming + tool use with 408/425/429/5xx retry + backoff), Ollama (local, /api/chat), OpenAI-compatible (LM Studio / vLLM / OpenAI / Together / any endpoint with tool_calls streaming — incl. Huawei ModelArts MaaS via configurable chat_path for non-/v1/ deployments), mock (offline tests). Generic loop in the daemon picks up any [providers.<name>] block in config.toml so adding a new hosted endpoint is a config edit, no code change. Reasoning-model streams (reasoning_content deltas) are surfaced as separate events so the TUI's reasoning panel can render them live. Per-provider circuit breaker on the OpenAI-compat path: 5s dial timeout, opens after 3 consecutive dial failures with exponential backoff capped at 5 minutes — a dead LM Studio fails in microseconds instead of burning 30s per heartbeat. Aliases (smart, fast, cheap, local, etc.) configurable; the dashboard resolves and displays the concrete model id, not the alias.
State machine — running ↔ paused ↔ stopped → terminated. All transitions persist to config.json and survive daemon restarts. stopped agents fully suspend their cron scheduler + watcher goroutines so they stop polling external endpoints, not just stop firing LLM sessions. paused keeps watchers polling so external state accumulates in the inbox for replay on resume.
Inflight session cap — automatic triggers (heartbeat / cron / watcher) past max_concurrent_sessions (default 3) are dropped with a warn-level log event. Prevents inference overruns when a session's wall-clock exceeds the heartbeat interval and would otherwise stack inferences unboundedly. Message triggers bypass the cap by design — the user is the back-pressure for those.
Task board — per-agent durable kanban + burndown with cross-agent assignment via Valkey streams. Tasks have status (open / doing / blocked / done / cancelled), free-form lanes, recursive DAG dependencies (blocked_by, including <id>@<agent> cross-agent refs), notes timeline, due dates. Status validated server-side against the allowed set so a small model can't corrupt the board with invented values.
Cron scheduler — robfig/cron/v3 with second-precision; entries persist to crons/crons.json and reload on daemon boot. Suspended when the owning agent is stopped.
Watchers — passive monitors that wake the agent only on triggers, so an agent can observe external state without burning LLM tokens. Built-in kinds: http_poll, file_watch, valkey_watch, cmd_watch, process_watch, tcp_probe, plus custom Go watchers compiled via pkg/watchio. Trigger DSL evaluates inside the watcher (no LLM cost per poll); actions are wake / tool:<name> / message:<agent> / task:<lane> / silent. Suspended when the owning agent is stopped.
Self-reflection — reflect({scope, focus?}) runs a templated LLM critique over recent actions/tools/sessions and writes structured outputs back into memory (lessons), tool stats (verdicts), task board (followup tasks), and prompt-update proposals.
Memory — Postgres-backed (legion.agent_memory) keyed on (agent_id, path). Survives daemon restarts and follows the agent if the orchestrator reassigns it to a different pod. memory_search defaults to literal substring; semantic: true adds cosine-similarity over a pgvector(768) column and returns a hybrid result that includes literal hits for paths not yet embedded (so freshly-written notes aren't invisible during backfill).
Semantic memory — per-daemon embedworker polls agent_memory for rows with NULL embeddings and batches them against an OpenAI-compatible /v1/embeddings endpoint (LM Studio's text-embedding-nomic-embed-text-v1.5 is the default self-host target). LEGION_EMBED_URL / LEGION_EMBED_MODEL / LEGION_EMBED_API_KEY env-driven; nil client is a first-class state (semantic search degrades to literal with a note in the result).
Zero-downtime inbox (JetStream-backed) — both peer-agent agent_message and operator agents.tell land on legion.msg.<id>, captured by the LEGION_MSG JetStream stream (file-backed, MaxAge=2h). Each agent has a durable pull consumer (agent-<id>) the daemon attaches to on adopt and drains on revoke without deleting. Messages sent during a pod handoff, failover, deploy, or while the recipient is briefly paused stay queued in the stream until the new owning pod's first Fetch drains them in order. NakWithDelay (60s) + MaxDeliver (30) gives a ~30 minute grace window for a paused operator action before the broker terminates. System prompt teaches a REPLY-or-IDLE protocol so small models don't enter politeness loops; daemon-side guard drops IDLE-bodied messages even when the prompt fails.
Tool quality stats — per-tool rolling latency histograms (p50, p95, p99, max), failure-rate tracking, verdict (healthy / watch / degraded), tied into reflect so agents can audit their own tools.
Event bus — every action the daemon takes (thoughts, tool calls with args + results, memory ops, state changes, build outcomes, token usage, task changes, watcher fires, cron fires) flows through a fan-out bus. Persisted to agents/<id>/logs/events.jsonl for forensics + replayed to new attaches from an in-memory ring buffer so the TUI doesn't open blank. Events carry id + caused_by for the timeline view's causal-tree rendering.
TUI attach view — single chronological transcript: streamed prose interleaved with timestamped tool calls, memory ops, and lifecycle markers. Word-wrap, inline markdown rendering (**bold**, `code`, headers), syntax-highlighted Go for tool_write.source (chroma / monokai). Tab toggles focus between input and transcript; ^c interrupts the in-flight session; ^g toggles transcript ↔ timeline. Mouse capture is OFF so native click-drag-to-copy works.
Agent lifecycle — agent.spawn with ttl + terminate_on; agent.destroy archives to ~/.legion/graveyard/<id>-<ts>/ or removes outright; agent.self_destruct posts a child_terminated event back to the parent.
Starter tool library — http_get, web_fetch, web_search (Brave Search API backend), now, dag_analyze, and home_assistant live in the global tool registry from daemon start. They're NOT pre-loaded into any agent's prompt — agents find them via tool_search when needed. http_get + web_fetch send Chrome-like headers by default to avoid bot-block pages on auto/retail/Cloudflare-fronted sites. home_assistant controls Home Assistant lights (list / on / off / set brightness) over the REST API; reflexive operation names (list_entities, get_state, dim) alias to the canonical four so small models don't loop. Starter sources are content-hashed; an embedded-source change triggers an in-place rebuild of the registry binary on next daemon boot.
Security audit (partial) — agent-id charset validation, scrubbed env on tool subprocesses (no daemon credentials reachable from a compiled tool by default; explicit allowlist for BRAVE_API_KEY used by web_search and HA_URL + HA_TOKEN used by home_assistant), GOPROXY=off so agent tools can't pull remote modules, per-host max-agents cap, Unix socket perms, Anthropic key redaction on every error path, instructive error paths to teach small models the rules they keep breaking.

What's planned but not yet built

Probe-side deprecation extraction — legion.model_deprecations is operator-driven today (via providers.deprecations.set); a follow-up would parse upstream /v1/models responses to fill the table automatically where the API exposes deprecation metadata (OpenAI, Anthropic newer responses).
CLI repo split — done. cmd/legion, pkg/legionapi, and internal/tui moved to legion-cli. WS wire-format types (Inbound/Outbound/method names) are duplicated in both repos (internal/gateway/protocol.go here, pkg/legionapi/protocol.go there) — keep byte-compatible.
Inter-agent pub/sub — agent_subscribe(topic) / agent_publish(topic) for named-channel broadcast, complementing the direct agent_message(to, text) that ships today. Useful for fanout patterns (a coordinator announces, N workers subscribe).
Multi-replica legion-node — the orchestrator + stateless-daemon story is done end-to-end (stranded-pod reassign via #54 is the last gap that's now closed). Operationally still running 1 replica; the bump is just a kustomize edit + verification under real load.
Linux namespaces / seccomp sandboxing for compiled tool subprocesses (v1 is in-code path checks only).
Encrypted memory at rest (pgcrypto is already in the cluster; the agent-memory rows are not yet column-encrypted).

Full architecture and design rationale: /home/wesley/.claude/plans/i-d-like-to-build-mutable-pearl.md. Security model: SECURITY.md.

Build

go build -o bin/legion-node ./cmd/legion-node
# The `legion` CLI + TUI lives in its own repo now:
# https://git.edge.capetown/ai/legion-cli

Requires Go 1.25+.

Deploy

Target deployment is Kubernetes. See docs/nats-transport.md for the full architecture (NATS as IPC, gateway pod for external WebSocket traffic, fully-stateless legion-node Deployment with all agent state in Postgres + tool binaries in MinIO).

Run

legion                          # TUI dashboard
legion ps                       # list agents
legion spawn alice              # create + start an agent
legion tell alice "hello"       # send a message
legion attach alice             # tail the event stream
legion pause alice              # suspend heartbeat (state persists)
legion resume alice
legion stop alice               # stop crons + watchers too

Configuration

~/.legion/config.toml (gitignored, outside the repo):

[daemon]
socket   = "~/.legion/daemon.sock"
data_dir = "~/.legion"
max_agents_per_host = 64

[providers.anthropic]
api_key     = "${ANTHROPIC_API_KEY}"
concurrency = 16

[providers.ollama]
host = "${OLLAMA_HOST:-http://127.0.0.1:11434}"

[providers.lmstudio]
host        = "http://192.168.2.100:1234"
concurrency = 8

[aliases]
smart = "claude-opus-4-7"
fast  = "claude-haiku-4-5"
cheap = "llama3.1:latest"
local = "qwen/qwen2.5-coder-14b"

# Brave Search API key for the web_search starter tool.
# The daemon plumbs this through to compiled tools via BRAVE_API_KEY.
[search.brave]
api_key = "${BRAVE_API_KEY:-}"

For Home Assistant + cluster deployments, the operator wires credentials through K8s Secrets that legion-node references with envFrom:

legion-search-cfg — BRAVE_API_KEY for web_search.
legion-tool-ha-cfg — HA_URL (e.g. http://home-assistant.home-assistant:8123 for in-cluster HA) and HA_TOKEN (a long-lived access token) for home_assistant.

Both Secrets are marked optional on the Deployment; missing entries make the corresponding starter tool return an instructive error instead of silently misbehaving.

${VAR} and ${VAR:-default} expand from the environment at load time. A fresh install Just Works with defaults — aliases auto-remap to mock when their configured target is unreachable.

Runtime requirements

Local dev / single-binary run:

Go 1.25+ on $PATH (the toolhub still shells out to go build to compile agent-written tools).
Optional: Anthropic API key for the highest-quality model tier.
Optional: a local LM Studio / Ollama server for self-hosted models — any OpenAI-compatible /v1/chat/completions endpoint with tool_calls streaming works via the openai_compat provider.
Optional: Brave Search API key for the web_search starter tool (free tier is generous).
Optional: Home Assistant URL + long-lived access token for the home_assistant starter tool (control lights from agents).

Cluster deployment (see docs/nats-transport.md):

Postgres (CNPG) — agents, memory, tasks, crons, watchers, tool registry metadata, provider catalog.
NATS — transport plane (legion.evt.>, legion.cmd.>, legion.fleet.>, legion.tools.>).
MinIO (single-replica legion-storage) — content-addressed tool binaries; 5-minute presigned URLs for daemon fetches.

Repository layout

cmd/
  legion-node/   daemon entrypoint
  legion/    CLI + TUI client
internal/
  agent/        per-agent supervisor + state machine + inflight session cap
  agentrepo/    Postgres-backed agent registry (legion.agents)
  config/       TOML loader with env-var expansion
  crons/        per-agent cron scheduler with suspend/resume (Postgres-backed)
  events/       event bus + per-agent ring buffer + NATS bridge
  gateway/      WS server with bearer auth (legion-gateway pod)
  ipc/          NATS-routed control plane + WS shim
  kernel/       lean tool catalog + LLM session runner + reflection
  lifecycle/    spawn / destroy / graveyard
  llm/          provider abstraction (Anthropic, openai-compat, Ollama, mock)
  memory/       Postgres-backed agent memory (legion.agent_memory)
  modelhub/     in-cluster provider catalog client
  orchestrator/ singleton placement + failover authority (legion-orchestrator)
  pgconn/       pgxpool helper with retry semantics
  providers/    NATS-routed provider catalog (legion-providers pod)
  sandbox/      path-escape enforcement
  starter/      embedded starter library (legion-seed-tools one-shot job)
  tasks/        Postgres-backed task board (legion.tasks)
  toolhub/      NATS client for the legion-tools registry
  tools/        in-process builder + invoker + rolling Stats
  tui/          Bubble Tea views (dashboard, attach, timeline, reasoning)
  watchers/     Postgres-backed passive monitors (legion.watchers)
docs/
  tool-registry.md  design + lifecycle for the global tool store
pkg/
  toolio/     shared ABI for agent-written tools
  watchio/    shared ABI for agent-written watchers
deploy/       systemd unit + env template
scripts/      install + update helpers

Tests

go test ./...

Integration smoke tests gated on ANTHROPIC_API_KEY (e.g. TestAnthropicSmoke) skip when the env var is unset, so plain go test ./... stays offline-safe.

# One-shot dashboard render to stderr — useful for eyeballing
# layout after a TUI change.
go test ./internal/tui/ -run TestSnapshot -v