Introduction

asyncat is a self-hosted, full-stack AI agent platform — 208 tools, 49 bundled skills, self-improving, and fully offline-capable.

What is asyncat

asyncat is an open-source AI agent OS that you run on your own hardware as a desktop application. It packages a backend API server (den) and a React web interface (neko) inside an Electron wrapper (electron).

The agent can read and write files, run shell commands and Python scripts, browse the web, click on your actual desktop, query databases, create artifacts, schedule recurring jobs, manage a memory store, and delegate work to sub-agents. It can use any model from a list of more than 20 providers, or a fully local model (such as a GGUF file) via llama.cpp, Ollama, or MLX.

Nothing leaves your machine unless you explicitly configure a cloud provider. The entire stack runs locally over two ports: 8716 (backend) and 8717 (frontend).

Agent runtime

asyncat's agent core implements a ReAct (Reasoning + Acting) loop: the agent reasons about the goal, selects a tool, executes it, observes the result, and repeats for up to 25 rounds.
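The loop described above can be sketched in a few lines of Python. This is only an illustration of the reason, act, observe cycle with a round cap, not asyncat's actual code: `Step`, `react_loop`, `fake_reason`, and the `add` tool are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ROUNDS = 25  # round cap stated in the text


@dataclass
class Step:
    """One decision from the model: call a tool, or finish with an answer."""
    tool: Optional[str] = None
    args: Optional[dict] = None
    answer: Optional[str] = None


def react_loop(goal, reason, tools, max_rounds=MAX_ROUNDS):
    """Reason, act, observe, and repeat until an answer or the round cap."""
    history = []  # (tool, args, observation) tuples fed back to the model
    for _ in range(max_rounds):
        step = reason(goal, history)
        if step.answer is not None:
            return step.answer, history
        observation = tools[step.tool](**(step.args or {}))
        history.append((step.tool, step.args, observation))
    return None, history  # cap reached without a final answer


# Toy stand-ins for the model and a single tool (both hypothetical).
def fake_reason(goal, history):
    if not history:
        return Step(tool="add", args={"a": 2, "b": 3})
    return Step(answer=f"result is {history[-1][2]}")


answer, trace = react_loop("add 2 and 3", fake_reason, {"add": lambda a, b: a + b})
print(answer)
```

In a real runtime the `reason` callable would wrap a model call and the `tools` mapping would hold the registered tools; the cap simply bounds how long a run can continue without converging.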

Key behaviors of the runtime:

Smart loop detection: a sliding window of 8 normalized tool signatures detects stuck loops and triggers a strategy switch.
Context compaction: when conversation history exceeds the token budget, the runtime compresses it (LLM-assisted if available, mechanical otherwise) without losing the goal context.
Session persistence: every run is saved to SQLite with a full tool audit trail and can be resumed after a restart.
Permission system: tools are tagged safe, moderate, or dangerous. Interactive prompts and auto-approve levels give you fine-grained control.
Unified model adapter: accepts native API tool calls, XML tags, and JSON blocks, so a model can be used regardless of which of these formats it emits for tool calls.
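As a rough illustration of the loop-detection idea, the sketch below keeps the last 8 normalized call signatures and flags a run as stuck when one signature dominates the window. The window size comes from the text; the normalization rule, the `REPEAT_LIMIT` threshold, and the names `signature` and `LoopDetector` are assumptions made for the example, and asyncat's real heuristics may differ.

```python
import json
from collections import Counter, deque

WINDOW = 8        # window size stated in the text
REPEAT_LIMIT = 3  # assumption: repeats within the window that count as "stuck"


def signature(tool, args):
    """Normalize a call so argument order and stray whitespace don't matter."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in args.items()}
    return f"{tool}:{json.dumps(cleaned, sort_keys=True)}"


class LoopDetector:
    def __init__(self, window=WINDOW, repeat_limit=REPEAT_LIMIT):
        self.recent = deque(maxlen=window)  # old signatures fall off automatically
        self.repeat_limit = repeat_limit

    def record(self, tool, args):
        """Add a call; return True if the window now looks like a stuck loop."""
        self.recent.append(signature(tool, args))
        most_common_count = Counter(self.recent).most_common(1)[0][1]
        return most_common_count >= self.repeat_limit


detector = LoopDetector()
# "read_file" is a hypothetical tool name; the three paths normalize to the same call.
flags = [detector.record("read_file", {"path": p}) for p in ["a.txt", " a.txt", "a.txt "]]
print(flags)
```

When `record` returns True, a runtime could respond by changing strategy, for example by prompting the model to try a different tool.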

Brain systems

asyncat is built around five interconnected systems, each named after the brain region it mirrors:

System	What it does
Prefrontal Cortex	Goal planning, task decomposition, agent reasoning
Hippocampus	Persistent memory — stores and retrieves context across sessions
Cerebellum	Skills — 49 bundled knowledge modules the agent can load
Amygdala	Soul — personality, tone, and behavioral constraints
Basal Ganglia	Self-improvement — auto-detects repeated patterns and creates new skills

The Basal Ganglia runs passively, tracking tool sequences across sessions. After a pattern succeeds three or more times within a 72-hour window, it auto-generates a named skill using an LLM (or a mechanical template if no LLM is configured), so the agent keeps improving without manual intervention.
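The promotion rule can be pictured with a small sketch. It assumes a pattern is identified by its ordered tool names and that only successes inside the trailing 72 hours are counted. `PatternTracker`, `record_success`, and the git tool names are hypothetical, and how asyncat actually keys or stores patterns is not specified here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MIN_SUCCESSES = 3             # "three or more times" from the text
WINDOW = timedelta(hours=72)  # "72-hour window" from the text


class PatternTracker:
    def __init__(self):
        self.successes = defaultdict(list)  # tool sequence -> success timestamps

    def record_success(self, tools, when):
        """Log a successful tool sequence; return True once it qualifies as a skill."""
        key = tuple(tools)
        cutoff = when - WINDOW
        recent = [t for t in self.successes[key] if t >= cutoff]  # drop stale entries
        recent.append(when)
        self.successes[key] = recent
        return len(recent) >= MIN_SUCCESSES


start = datetime(2024, 1, 1)
seq = ["git_status", "git_add", "git_commit"]  # hypothetical tool names

close = PatternTracker()
close_hits = [close.record_success(seq, start + timedelta(hours=h)) for h in (0, 30, 60)]

spread = PatternTracker()
spread_hits = [spread.record_success(seq, start + timedelta(hours=h)) for h in (0, 40, 80)]

print(close_hits, spread_hits)
```

Three successes within 60 hours qualify on the third run, while the same three spread over 80 hours never do, because the first has aged out of the window by the time the third arrives.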

Agent modes

asyncat supports three operating modes that change how the agent behaves:

Mode	Behavior	Good for
`chat`	Conversational, fewer tool nudges, quick convergence	Q&A, research, explanation
`plan`	Generates a structured plan only — no tool execution, no writes	Safe preview before action
`action`	Full ReAct loop — runs tools, edits files, executes code	Coding, automation, complex tasks
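One way to picture the difference between modes is as a gate in front of tool execution. The sketch below encodes only what the table states, namely that `plan` runs no tools. Treating `chat` as still allowed to call tools is an assumption drawn from "fewer tool nudges", and `Mode`, `may_execute_tools`, and `gate` are hypothetical names.

```python
from enum import Enum


class Mode(Enum):
    CHAT = "chat"
    PLAN = "plan"
    ACTION = "action"


def may_execute_tools(mode):
    """Plan mode is preview-only: the table says it runs no tools and writes nothing."""
    return mode is not Mode.PLAN


def gate(mode, tool_call):
    """Run the call if the mode permits it, otherwise report that it was skipped."""
    if may_execute_tools(mode):
        return tool_call()
    return "planned only: tool call skipped"


print(gate(Mode("plan"), lambda: "file written"))
print(gate(Mode("action"), lambda: "file written"))
```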

Why asyncat

Unlike those minimalist agent frameworks that give the model three tools and pray it doesn't break, asyncat is a context-heavy powerhouse. We dump 208 tools directly into the context window and expect the LLM to figure it out. Surprisingly, it does.

Most AI agents are good at one thing. asyncat is built to handle everything an autonomous agent might need:

Native desktop integration: lives in your system tray, responds to a global keyboard shortcut (Cmd/Ctrl+Shift+A), and triggers native OS notifications.
208 tools across files, shell, git, browser, AI vision, memory, scheduling, artifacts, database, Docker, sandboxes, notes, tasks, and system monitoring.
Desktop automation: the only open-source agent that can click buttons, read screenshots via OCR, type into windows, and focus applications.
Self-improving: the Basal Ganglia learns your workflows and auto-creates skills without any annotation.
Sandboxed execution: risky operations run in isolated copies with full diff and patch review before promotion.
Zero cloud dependency: runs fully offline with llama.cpp, Ollama, or MLX, and auto-detects CUDA, Metal, and ROCm GPUs.
Full workspace: notes, tasks, kanban, calendar, projects, and integrations with GitHub, Google Calendar, Obsidian, and email, all in the same session context.