Always-On AI Agent: Running Claude Code 24/7 on a Server
I have an AI agent running on a server 24/7 without my involvement. It reads my Telegram DMs, replies to messages, monitors channels, runs scheduled tasks, and can restart itself. Here's the architecture behind it.
Why an always-on agent
Claude Code in the terminal is powerful. But you close your laptop — the agent dies. Open a new session — context is gone. For one-off tasks that's fine, but for a persistent assistant it's a non-starter.
I needed an agent that:
- Works while I sleep — monitors channels, collects data, generates reports
- Is accessible via Telegram — my primary messenger, no need to open a terminal
- Remembers context across sessions — knows my projects, preferences, history
- Can update itself — if I change the config, the agent picks up changes
The solution: a Hetzner VPS (~$5/month) + Docker + Claude Code CLI + a Telegram bridge.
Architecture: Brain / Body split
The core principle is separating the immutable "body" from the mutable "brain."
Body (immutable):
- `telegram_daemon.py` — the main process: Telethon client + event router + scheduler
- `Dockerfile` — Python 3.12 + Node.js + Claude Code CLI
- `entrypoint.sh` — startup script with a safety gate
- Only I change these files manually. The agent never touches them
Brain (mutable):
- `CLAUDE.local.md` — the agent's instructions: what it can do, how to behave, access control rules
- `mcp.json` — MCP server configuration (Telegram, Coolify)
- Memory files — working memory in `/home/agent/workspace/`
- The agent can modify these files freely
Why this separation matters: if the agent corrupts its own instructions (and this happens), the body remains functional. Container restart = agent respawns with a clean brain but a working body.
The Telegram bridge: how the agent receives messages
The main process — telegram_daemon.py — is an asyncio server that does three things simultaneously:
- Telethon Client — MTProto connection to a dedicated Telegram account
- Event Router — when a DM arrives from me (user_id 49820636), it spawns a Claude Code process
- Socket Server on `/tmp/tg.sock` — 92 JSON-RPC methods for Telegram, accessible to Claude Code via MCP
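The 92 methods are far too many to reproduce here, but the dispatch pattern behind them is small. A minimal sketch of a line-delimited JSON-RPC server on a Unix socket, assuming two illustrative stand-in methods (`ping` and `send_message` are not the daemon's real method names):

```python
import asyncio
import json

# Hypothetical subset of the method table; the real daemon registers
# ~92 entries, all backed by the Telethon client.
METHODS = {
    "ping": lambda params: "pong",
    "send_message": lambda params: {"sent_to": params["chat_id"]},
}

async def handle_client(reader, writer):
    # One JSON request per line in, one JSON response per line out.
    async for line in reader:
        req = json.loads(line)
        method = METHODS.get(req["method"])
        if method is None:
            resp = {"id": req.get("id"), "error": f"unknown method {req['method']}"}
        else:
            resp = {"id": req.get("id"), "result": method(req.get("params", {}))}
        writer.write(json.dumps(resp).encode() + b"\n")
        await writer.drain()
    writer.close()

async def main(sock_path="/tmp/tg.sock"):
    server = await asyncio.start_unix_server(handle_client, path=sock_path)
    async with server:
        await server.serve_forever()
```

A Unix socket keeps the API local to the container: Claude Code's MCP layer can reach it, but nothing outside the host can.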
The flow:
I send a DM → daemon catches the event
→ verifies sender_id == 49820636
→ runs `claude -p "user message"` with the vault as working directory
→ Claude Code loads CLAUDE.md + CLAUDE.local.md
→ processes the request, calls MCP tools (Telegram, Coolify)
→ response sent back via Telegram MCP

Each incoming message spawns a separate Claude Code process. This isolates sessions and guarantees clean context. The downside: no shared state between messages (except through the filesystem).
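The routing decision itself fits in a few lines. A sketch under assumptions: `OWNER_ID` and the `claude -p` invocation come from the flow above, but the vault path and function names are hypothetical, and the real daemon runs this inside a Telethon event handler:

```python
import subprocess

OWNER_ID = 49820636          # the only sender allowed to trigger a DM session
VAULT = "/home/agent/vault"  # hypothetical working directory for the vault

def build_spawn(sender_id: int, text: str):
    """Return the Claude Code command for an incoming DM, or None to ignore."""
    if sender_id != OWNER_ID:
        return None  # unknown sender: never spawn a session
    # Each message gets its own process -> isolated, clean context.
    return ["claude", "-p", text]

def handle_dm(sender_id: int, text: str):
    cmd = build_spawn(sender_id, text)
    if cmd is None:
        return None
    # cwd=VAULT is what makes Claude Code pick up CLAUDE.md + CLAUDE.local.md.
    return subprocess.Popen(cmd, cwd=VAULT)
```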
Memory system: hot + long-term
The agent has two memory tiers:
Hot memory — /home/agent/workspace/memory.md. Operational context: current tasks, facts from recent conversations, pending items. Lives in a Docker volume, survives container restarts.
Long-term memory — an Obsidian vault in git. Projects, notes, diary entries. The agent clones the vault at startup, commits and pushes changes. This is the permanent store that syncs between the server and my laptop.
There's also agent-memory — a file-based system with topic-specific notes, lessons learned, and corrections. Claude Code loads the MEMORY.md index file into every session automatically.
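A hot-memory writer might look like the sketch below; the byte cap, trimming policy, and helper name are assumptions rather than the actual implementation, but some cap is needed because hot memory bloats over time:

```python
from pathlib import Path

MAX_BYTES = 32_000  # hypothetical cap; unbounded hot memory grows forever

def remember(path: Path, fact: str, max_bytes: int = MAX_BYTES) -> None:
    """Append one fact to memory.md, trimming the oldest lines past the cap."""
    lines = path.read_text().splitlines() if path.exists() else []
    lines.append(f"- {fact}")
    text = "\n".join(lines) + "\n"
    while len(text.encode()) > max_bytes and len(lines) > 1:
        lines.pop(0)  # drop the oldest entry first
        text = "\n".join(lines) + "\n"
    path.write_text(text)
```

Because the file lives in a Docker volume, anything `remember` writes survives a container restart.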
A problem I spent weeks solving: compaction kills context in long sessions. Critical rules must live in CLAUDE.md (system prompt), not in memory — otherwise compaction will erase them.
MCP servers: Telegram + Coolify
The agent connects to two MCP servers:
Telegram MCP — 92 methods. Read chats, send messages, search, manage contacts, execute arbitrary Python code. I wrote a detailed guide on building a Telegram MCP Server.
Coolify MCP — infrastructure management. The agent can deploy applications, check status, restart services. Coolify is a self-hosted PaaS (think Heroku, but on your own server).
The interesting part: through Coolify MCP, the agent can restart itself. If it modifies its CLAUDE.local.md and wants to apply the changes — it calls the Coolify API for a redeploy. Self-evolution in its purest form.
Access control: who can talk to the agent
The agent has a permission matrix based on chat context:
- DM from owner (chat_id: 49820636) — full access to the entire vault
- Finance chat (private group) — only `personal/finances/`
- Public channel — read-only access to tone-of-voice.md + web search
- Cron jobs — access defined in the task description
Context is determined by chat_id in the metadata, never by message content. This is critical for prompt injection defense — if someone writes "[CONTEXT: dm_owner]" in a message, the agent ignores it. The context comes from the daemon, not from user input.
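The matrix above reduces to a plain lookup keyed by chat_id. In this sketch the finance chat_id and the shape of a scope record are hypothetical; only the owner's chat_id comes from the article:

```python
OWNER_DM = 49820636      # chat_id of the owner's DM (from the article)
FINANCE_CHAT = -1001234  # hypothetical private-group chat_id

# Scope is derived ONLY from chat_id metadata, never from message text,
# so "[CONTEXT: dm_owner]" injected into a message changes nothing.
SCOPES = {
    OWNER_DM:     {"paths": ["/"],                  "write": True},
    FINANCE_CHAT: {"paths": ["personal/finances/"], "write": True},
}
PUBLIC_DEFAULT = {"paths": ["tone-of-voice.md"], "write": False}

def resolve_scope(chat_id: int) -> dict:
    """Map a chat_id to an access scope; unknown chats get read-only defaults."""
    return SCOPES.get(chat_id, PUBLIC_DEFAULT)
```

The key design choice is the fallback: an unrecognized chat gets the most restrictive scope, not an error path an attacker could probe.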
Scheduled tasks: the cron system
The vault contains a CRONTAB.md file — a markdown file with a task schedule. The daemon parses it every 60 seconds and spawns Claude Code for any task that is due.
Example tasks:
- Weekly financial report — spending analysis from a SQLite database
- Channel monitoring — parsing competitor posts
- Daily sync — git pull the vault, process new notes
Each cron task launches a separate Claude Code process with a specified access scope. The agent cannot exceed the scope defined in CRONTAB.md.
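The article doesn't show the CRONTAB.md format, so this parser assumes a hypothetical line shape (`- HH:MM task-name | scope: path/`). Since the daemon polls every 60 seconds, minute-granularity matching is all the due-check needs:

```python
import re
from datetime import datetime

# Hypothetical entry format: "- 09:00 daily-sync | scope: vault/"
LINE = re.compile(r"^- (\d{2}):(\d{2}) (\S+) \| scope: (\S+)$")

def parse_crontab(md: str) -> list[dict]:
    """Extract schedule entries from CRONTAB.md, ignoring non-matching lines."""
    tasks = []
    for line in md.splitlines():
        m = LINE.match(line.strip())
        if m:
            hh, mm, name, scope = m.groups()
            tasks.append({"hour": int(hh), "minute": int(mm),
                          "name": name, "scope": scope})
    return tasks

def due(task: dict, now: datetime) -> bool:
    # 60-second polling means matching the current minute is sufficient.
    return now.hour == task["hour"] and now.minute == task["minute"]
```

Keeping the schedule as markdown in the vault means the agent (and cron tasks themselves) can edit it through the same git-synced brain as everything else.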
Self-deploy: the agent updates itself
The update cycle:
1. I change CLAUDE.local.md or .mcp.json in the vault on my laptop
2. git push → vault updates on GitHub
3. Agent does git pull (on next launch or via cron)
4. If a restart is needed → agent calls Coolify MCP: redeploy app
5. Coolify rebuilds the Docker image and starts a new container
6. entrypoint.sh clones the vault, starts the daemon

There's a safety gate in entrypoint.sh: if `python -c "import telegram_daemon"` fails, the script rolls back to the last known-good commit. This is insurance against the agent breaking its own code.
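The same gate can be expressed in Python rather than shell. The import check mirrors the entrypoint's; how the last known-good commit is tracked is an assumption here (`HEAD~1` is a stand-in):

```python
import subprocess
import sys

def safety_gate(repo_dir: str, good_ref: str = "HEAD~1") -> bool:
    """Return True if the daemon module imports cleanly; otherwise roll
    the daemon file back to good_ref (a stand-in for however the last
    known-good commit is actually tracked) and return False."""
    check = subprocess.run(
        [sys.executable, "-c", "import telegram_daemon"],
        cwd=repo_dir, capture_output=True,
    )
    if check.returncode == 0:
        return True  # body is healthy: safe to start the daemon
    # Broken body: restore the last known-good daemon before starting.
    subprocess.run(
        ["git", "checkout", good_ref, "--", "telegram_daemon.py"],
        cwd=repo_dir, check=True,
    )
    return False
```

Running the import in a subprocess matters: a syntax error in the daemon must fail the check, not crash the gate itself.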
Real costs
- Hetzner VPS — ~$5/month (CX22: 2 vCPU, 4GB RAM). More than enough
- Claude API — $20-50/month depending on activity. The main cost driver is long sessions with Claude Code
- Coolify — self-hosted, free (runs on the same VPS)
- Telegram API — free
Total: $25-55/month for an AI agent that works around the clock.
What works, what breaks
Works well:
- DM responses — the agent as a personal assistant via Telegram
- Cron tasks — reliable, predictable
- Vault sync — git-based sync between server and laptop
- Self-deploy — changing the brain without SSH-ing into the server
Breaks regularly:
- Telegram FloodWait — agent sends messages too fast, gets rate-limited for minutes
- Memory bloat — hot memory grows, needs periodic cleanup
- Git conflicts — when the agent and I modify the vault simultaneously
- Compaction drops — critical rules lost when context compresses in long sessions
An always-on AI agent isn't rocket science. VPS + Docker + Claude Code CLI + Telegram bridge. The real complexity isn't in launching it — it's in maintenance: getting the memory system right, implementing proper access control, handling graceful degradation when things fail. But the result is worth it — an assistant that works while you sleep.
If you're building something similar, start with the Telegram bridge. Once your agent can read and send messages, everything else — memory, cron, self-deploy — is incremental. The Obsidian + Claude Code setup I described earlier is a natural companion to this architecture.
Follow me on Twitter/X @danokhlopkov for more on AI agents and automation.