No description
  • Go 95.5%
  • PLpgSQL 3.6%
  • Shell 0.9%
Find a file
Wesley 8625cd4dbe
All checks were successful
deploy / changes (push) Successful in 12s
deploy / verify (push) Successful in 37s
deploy / build / legion-gateway (legion-gateway) (push) Successful in 22s
deploy / build / legion-interface-telegram (legion-interface-telegram) (push) Successful in 23s
deploy / build / legion-log (legion-log) (push) Successful in 23s
deploy / build / legion-orchestrator (legion-orchestrator) (push) Successful in 26s
deploy / build / legion-node (legion-node) (push) Successful in 27s
deploy / build / legion-providers (legion-providers) (push) Successful in 26s
deploy / build / legion-seed-tools (legion-seed-tools) (push) Successful in 28s
deploy / build / legion-tools (legion-tools) (push) Successful in 29s
deploy / deploy (push) Successful in 2s
kernel: tool_read + tool_edit + KindToolEdit diff events
Closes the rewrite-from-scratch loop on tool iteration. Agents can
now fetch a tool's source, edit a window of it, and have the diff
show up in the TUI as a git-style change block.

  - tool_read: spec + recent_edits + source (cat -n style numbered),
    optional start_line/end_line for windowed reads. Returns
    total_lines + content_hash so the agent can drive iterative
    reads + the safety check on line-based edits.

  - tool_edit: three modes the agent picks via args —
      STRING  ({old_string, new_string}) — surgical patch by unique
              substring; inserts use an anchor line in old_string.
      RANGE   ({start_line, end_line, new_text,
                expected_content_hash}) — replace or (new_text="")
              delete a contiguous chunk.
      INSERT  ({insert_at_line, new_text, expected_content_hash}) —
              splice before a line; total_lines+1 appends at EOF.
    expected_content_hash is required in line modes so a drifted
    source is rejected rather than spliced into the wrong lines.

  - tool_write now fetches prior source on update so the same edit
    event fires for full rewrites too. First writes (no prior) skip
    the diff event — the build_result is enough.

  - KindToolEdit (new event) carries name, before_hash, after_hash,
    unified_diff, added, removed, edit_summary. internal/diff wraps
    go-udiff for the unified output; promoted to direct dep.

  - System prompt teaches the read-before-edit discipline + when
    to reach for each mode + how to strip the line-number prefix
    from tool_read output before splicing.
2026-05-18 08:32:15 +00:00
.forgejo/workflows interface: telegram — webhook-driven external transport for the fleet 2026-05-17 17:25:54 +00:00
cmd starter: home_assistant tool — list/turn on/off/set brightness for HA lights 2026-05-17 19:47:41 +00:00
deploy docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern 2026-05-17 20:37:32 +00:00
docs docs: sync to the stateless-daemon reality 2026-05-16 18:31:27 +00:00
internal kernel: tool_read + tool_edit + KindToolEdit diff events 2026-05-18 08:32:15 +00:00
pkg repo: split legion-cli out (#45) 2026-05-17 11:09:56 +00:00
.gitignore
CLAUDE.md docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern 2026-05-17 20:37:32 +00:00
go.mod kernel: tool_read + tool_edit + KindToolEdit diff events 2026-05-18 08:32:15 +00:00
go.sum kernel: tool_read + tool_edit + KindToolEdit diff events 2026-05-18 08:32:15 +00:00
README.md docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern 2026-05-17 20:37:32 +00:00
SECURITY.md docs: home_assistant starter + the legion-tool-ha-cfg Secret pattern 2026-05-17 20:37:32 +00:00

Legion

A Go daemon that hosts many autonomous AGI-style agents. Each agent can write its own prompts, memory, Go tools (compiled to per-tool binaries), schedules, watchers, task boards, and sub-agents.

What works today

  • Daemon (legion-node) runs as a systemd service. Boots agentless and stays healthy when no agents are configured. Restores persisted agents on startup.
  • CLI / TUI (legion) — Bubble Tea dashboard + per-agent attach view. No args opens the TUI. Subcommands: info, ps, spawn, tell, attach, pause, resume, stop. Dashboard flexes to terminal width with columns for state, live activity verb (thinking / reasoning / → tool_name / idle), resolved model id, heartbeat, live task counts (doing/open/blocked), cron count, last-activity, parent.
  • Attach-view reasoning panel — when the model is mid-thought, its chain-of-thought streams into a pinned panel above the task list (header ◆ reasoning… (length: 1.2k), body indented under the tree connector). Panel grows with the reasoning, capped only by available terminal rows. On stream end the panel commits to the transcript as one markdown-rendered block and clears. Toggle visibility with ^r. Works for both inline model reasoning (GLM, gpt-oss, R1 distills) and the reflect kernel tool's internal LLM call.
  • Gateway pod (planned)cmd/legion-gateway will front external traffic in k8s: WebSocket server at wss://legion.edge.capetown, bearer-token auth, brokers NATS pubs/subs on the client's behalf. See docs/nats-transport.md for the full design.
  • Reasoning loop — full LLM-driven session runner with streaming, parallel tool fan-out, multi-turn iteration, MaxTurns guard (configurable per agent), per-turn + cumulative token tracking. The loop also injects a session-close nudge: if the model is about to stop while owning a task in doing, one synthetic turn forces it to either task_complete or move the task back before the final reply is shown to the user.
  • Lean tool catalog + global registry — each agent's prompt sees only kernel tools (memory_, task_, cron_, watch_, agent_, reflect, llm_, heartbeat_*) plus two meta-tools (tool_search, tool_invoke). Everything else — http_get, web_fetch, web_search, now, and every tool any agent has authored — lives in a single canonical registry at ~/.legion/tools/<name>/ and stays out of the prompt until an agent explicitly searches for it and invokes it by name. Result: small prompt budget, zero version drift, agents discover capabilities on demand. Tool metadata records author_agent, maintained_by, content_hash, keywords[], recent_edits[] — full design in docs/tool-registry.md.
  • task_query analytics with multi-preset bundling: pass presets:["stuck_in_doing","overdue","upcoming"] to run all three in one call. Presets: blocked, stuck_in_doing, overdue, upcoming, with_notes_not_done, orphans, longest_chain, all.
  • Instructive errors throughout — wrong task id lists active candidates, missing required args enumerate valid values, unknown tool names point at tool_search, etc. Designed so small models self-correct on the next turn.
  • LLM providers — Anthropic (streaming + tool use with 408/425/429/5xx retry + backoff), Ollama (local, /api/chat), OpenAI-compatible (LM Studio / vLLM / OpenAI / Together / any endpoint with tool_calls streaming — incl. Huawei ModelArts MaaS via configurable chat_path for non-/v1/ deployments), mock (offline tests). Generic loop in the daemon picks up any [providers.<name>] block in config.toml so adding a new hosted endpoint is a config edit, no code change. Reasoning-model streams (reasoning_content deltas) are surfaced as separate events so the TUI's reasoning panel can render them live. Per-provider circuit breaker on the OpenAI-compat path: 5s dial timeout, opens after 3 consecutive dial failures with exponential backoff capped at 5 minutes — a dead LM Studio fails in microseconds instead of burning 30s per heartbeat. Aliases (smart, fast, cheap, local, etc.) configurable; the dashboard resolves and displays the concrete model id, not the alias.
  • State machinerunningpausedstoppedterminated. All transitions persist to config.json and survive daemon restarts. stopped agents fully suspend their cron scheduler + watcher goroutines so they stop polling external endpoints, not just stop firing LLM sessions. paused keeps watchers polling so external state accumulates in the inbox for replay on resume.
  • Inflight session cap — automatic triggers (heartbeat / cron / watcher) past max_concurrent_sessions (default 3) are dropped with a warn-level log event. Prevents inference overruns when a session's wall-clock exceeds the heartbeat interval and would otherwise stack inferences unboundedly. Message triggers bypass the cap by design — the user is the back-pressure for those.
  • Task board — per-agent durable kanban + burndown with cross-agent assignment via Valkey streams. Tasks have status (open / doing / blocked / done / cancelled), free-form lanes, recursive DAG dependencies (blocked_by, including <id>@<agent> cross-agent refs), notes timeline, due dates. Status validated server-side against the allowed set so a small model can't corrupt the board with invented values.
  • Cron schedulerrobfig/cron/v3 with second-precision; entries persist to crons/crons.json and reload on daemon boot. Suspended when the owning agent is stopped.
  • Watchers — passive monitors that wake the agent only on triggers, so an agent can observe external state without burning LLM tokens. Built-in kinds: http_poll, file_watch, valkey_watch, cmd_watch, process_watch, tcp_probe, plus custom Go watchers compiled via pkg/watchio. Trigger DSL evaluates inside the watcher (no LLM cost per poll); actions are wake / tool:<name> / message:<agent> / task:<lane> / silent. Suspended when the owning agent is stopped.
  • Self-reflectionreflect({scope, focus?}) runs a templated LLM critique over recent actions/tools/sessions and writes structured outputs back into memory (lessons), tool stats (verdicts), task board (followup tasks), and prompt-update proposals.
  • Memory — Postgres-backed (legion.agent_memory) keyed on (agent_id, path). Survives daemon restarts and follows the agent if the orchestrator reassigns it to a different pod. memory_search defaults to literal substring; semantic: true adds cosine-similarity over a pgvector(768) column and returns a hybrid result that includes literal hits for paths not yet embedded (so freshly-written notes aren't invisible during backfill).
  • Semantic memory — per-daemon embedworker polls agent_memory for rows with NULL embeddings and batches them against an OpenAI-compatible /v1/embeddings endpoint (LM Studio's text-embedding-nomic-embed-text-v1.5 is the default self-host target). LEGION_EMBED_URL / LEGION_EMBED_MODEL / LEGION_EMBED_API_KEY env-driven; nil client is a first-class state (semantic search degrades to literal with a note in the result).
  • Zero-downtime inbox (JetStream-backed) — both peer-agent agent_message and operator agents.tell land on legion.msg.<id>, captured by the LEGION_MSG JetStream stream (file-backed, MaxAge=2h). Each agent has a durable pull consumer (agent-<id>) the daemon attaches to on adopt and drains on revoke without deleting. Messages sent during a pod handoff, failover, deploy, or while the recipient is briefly paused stay queued in the stream until the new owning pod's first Fetch drains them in order. NakWithDelay (60s) + MaxDeliver (30) gives a ~30 minute grace window for a paused operator action before the broker terminates. System prompt teaches a REPLY-or-IDLE protocol so small models don't enter politeness loops; daemon-side guard drops IDLE-bodied messages even when the prompt fails.
  • Tool quality stats — per-tool rolling latency histograms (p50, p95, p99, max), failure-rate tracking, verdict (healthy / watch / degraded), tied into reflect so agents can audit their own tools.
  • Event bus — every action the daemon takes (thoughts, tool calls with args + results, memory ops, state changes, build outcomes, token usage, task changes, watcher fires, cron fires) flows through a fan-out bus. Persisted to agents/<id>/logs/events.jsonl for forensics + replayed to new attaches from an in-memory ring buffer so the TUI doesn't open blank. Events carry id + caused_by for the timeline view's causal-tree rendering.
  • TUI attach view — single chronological transcript: streamed prose interleaved with timestamped tool calls, memory ops, and lifecycle markers. Word-wrap, inline markdown rendering (**bold**, `code`, headers), syntax-highlighted Go for tool_write.source (chroma / monokai). Tab toggles focus between input and transcript; ^c interrupts the in-flight session; ^g toggles transcript ↔ timeline. Mouse capture is OFF so native click-drag-to-copy works.
  • Agent lifecycleagent.spawn with ttl + terminate_on; agent.destroy archives to ~/.legion/graveyard/<id>-<ts>/ or removes outright; agent.self_destruct posts a child_terminated event back to the parent.
  • Starter tool libraryhttp_get, web_fetch, web_search (Brave Search API backend), now, dag_analyze, and home_assistant live in the global tool registry from daemon start. They're NOT pre-loaded into any agent's prompt — agents find them via tool_search when needed. http_get + web_fetch send Chrome-like headers by default to avoid bot-block pages on auto/retail/Cloudflare-fronted sites. home_assistant controls Home Assistant lights (list / on / off / set brightness) over the REST API; reflexive operation names (list_entities, get_state, dim) alias to the canonical four so small models don't loop. Starter sources are content-hashed; an embedded-source change triggers an in-place rebuild of the registry binary on next daemon boot.
  • Security audit (partial) — agent-id charset validation, scrubbed env on tool subprocesses (no daemon credentials reachable from a compiled tool by default; explicit allowlist for BRAVE_API_KEY used by web_search and HA_URL + HA_TOKEN used by home_assistant), GOPROXY=off so agent tools can't pull remote modules, per-host max-agents cap, Unix socket perms, Anthropic key redaction on every error path, instructive error paths to teach small models the rules they keep breaking.

What's planned but not yet built

  • Probe-side deprecation extractionlegion.model_deprecations is operator-driven today (via providers.deprecations.set); a follow-up would parse upstream /v1/models responses to fill the table automatically where the API exposes deprecation metadata (OpenAI, Anthropic newer responses).
  • CLI repo split — done. cmd/legion, pkg/legionapi, and internal/tui moved to legion-cli. WS wire-format types (Inbound/Outbound/method names) are duplicated in both repos (internal/gateway/protocol.go here, pkg/legionapi/protocol.go there) — keep byte-compatible.
  • Inter-agent pub/subagent_subscribe(topic) / agent_publish(topic) for named-channel broadcast, complementing the direct agent_message(to, text) that ships today. Useful for fanout patterns (a coordinator announces, N workers subscribe).
  • Multi-replica legion-node — the orchestrator + stateless-daemon story is done end-to-end (stranded-pod reassign via #54 is the last gap that's now closed). Operationally still running 1 replica; the bump is just a kustomize edit + verification under real load.
  • Linux namespaces / seccomp sandboxing for compiled tool subprocesses (v1 is in-code path checks only).
  • Encrypted memory at rest (pgcrypto is already in the cluster; the agent-memory rows are not yet column-encrypted).

Full architecture and design rationale: /home/wesley/.claude/plans/i-d-like-to-build-mutable-pearl.md. Security model: SECURITY.md.

Build

go build -o bin/legion-node ./cmd/legion-node
# The `legion` CLI + TUI lives in its own repo now:
# https://git.edge.capetown/ai/legion-cli

Requires Go 1.25+.

Deploy

Target deployment is Kubernetes. See docs/nats-transport.md for the full architecture (NATS as IPC, gateway pod for external WebSocket traffic, fully-stateless legion-node Deployment with all agent state in Postgres + tool binaries in MinIO).

Run

legion                          # TUI dashboard
legion ps                       # list agents
legion spawn alice              # create + start an agent
legion tell alice "hello"       # send a message
legion attach alice             # tail the event stream
legion pause alice              # suspend heartbeat (state persists)
legion resume alice
legion stop alice               # stop crons + watchers too

Configuration

~/.legion/config.toml (gitignored, outside the repo):

[daemon]
socket   = "~/.legion/daemon.sock"
data_dir = "~/.legion"
max_agents_per_host = 64

[providers.anthropic]
api_key     = "${ANTHROPIC_API_KEY}"
concurrency = 16

[providers.ollama]
host = "${OLLAMA_HOST:-http://127.0.0.1:11434}"

[providers.lmstudio]
host        = "http://192.168.2.100:1234"
concurrency = 8

[aliases]
smart = "claude-opus-4-7"
fast  = "claude-haiku-4-5"
cheap = "llama3.1:latest"
local = "qwen/qwen2.5-coder-14b"

# Brave Search API key for the web_search starter tool.
# The daemon plumbs this through to compiled tools via BRAVE_API_KEY.
[search.brave]
api_key = "${BRAVE_API_KEY:-}"

For Home Assistant + cluster deployments, the operator wires credentials through K8s Secrets that legion-node references with envFrom:

  • legion-search-cfgBRAVE_API_KEY for web_search.
  • legion-tool-ha-cfgHA_URL (e.g. http://home-assistant.home-assistant:8123 for in-cluster HA) and HA_TOKEN (a long-lived access token) for home_assistant.

Both Secrets are marked optional on the Deployment; missing entries make the corresponding starter tool return an instructive error instead of silently misbehaving.

${VAR} and ${VAR:-default} expand from the environment at load time. A fresh install Just Works with defaults — aliases auto-remap to mock when their configured target is unreachable.

Runtime requirements

Local dev / single-binary run:

  • Go 1.25+ on $PATH (the toolhub still shells out to go build to compile agent-written tools).
  • Optional: Anthropic API key for the highest-quality model tier.
  • Optional: a local LM Studio / Ollama server for self-hosted models — any OpenAI-compatible /v1/chat/completions endpoint with tool_calls streaming works via the openai_compat provider.
  • Optional: Brave Search API key for the web_search starter tool (free tier is generous).
  • Optional: Home Assistant URL + long-lived access token for the home_assistant starter tool (control lights from agents).

Cluster deployment (see docs/nats-transport.md):

  • Postgres (CNPG) — agents, memory, tasks, crons, watchers, tool registry metadata, provider catalog.
  • NATS — transport plane (legion.evt.>, legion.cmd.>, legion.fleet.>, legion.tools.>).
  • MinIO (single-replica legion-storage) — content-addressed tool binaries; 5-minute presigned URLs for daemon fetches.

Repository layout

cmd/
  legion-node/   daemon entrypoint
  legion/    CLI + TUI client
internal/
  agent/        per-agent supervisor + state machine + inflight session cap
  agentrepo/    Postgres-backed agent registry (legion.agents)
  config/       TOML loader with env-var expansion
  crons/        per-agent cron scheduler with suspend/resume (Postgres-backed)
  events/       event bus + per-agent ring buffer + NATS bridge
  gateway/      WS server with bearer auth (legion-gateway pod)
  ipc/          NATS-routed control plane + WS shim
  kernel/       lean tool catalog + LLM session runner + reflection
  lifecycle/    spawn / destroy / graveyard
  llm/          provider abstraction (Anthropic, openai-compat, Ollama, mock)
  memory/       Postgres-backed agent memory (legion.agent_memory)
  modelhub/     in-cluster provider catalog client
  orchestrator/ singleton placement + failover authority (legion-orchestrator)
  pgconn/       pgxpool helper with retry semantics
  providers/    NATS-routed provider catalog (legion-providers pod)
  sandbox/      path-escape enforcement
  starter/      embedded starter library (legion-seed-tools one-shot job)
  tasks/        Postgres-backed task board (legion.tasks)
  toolhub/      NATS client for the legion-tools registry
  tools/        in-process builder + invoker + rolling Stats
  tui/          Bubble Tea views (dashboard, attach, timeline, reasoning)
  watchers/     Postgres-backed passive monitors (legion.watchers)
docs/
  tool-registry.md  design + lifecycle for the global tool store
pkg/
  toolio/     shared ABI for agent-written tools
  watchio/    shared ABI for agent-written watchers
deploy/       systemd unit + env template
scripts/      install + update helpers

Tests

go test ./...

Integration smoke tests gated on ANTHROPIC_API_KEY (e.g. TestAnthropicSmoke) skip when the env var is unset, so plain go test ./... stays offline-safe.

# One-shot dashboard render to stderr — useful for eyeballing
# layout after a TUI change.
go test ./internal/tui/ -run TestSnapshot -v