BAEM1N.DEV — AI, RAG, LLMOps Dev Blog

DeepCoWork: Building an AI Agent Desktop App with Deep Agents SDK

TL;DR: DeepCoWork is an open-source AI agent desktop app built on Deep Agents SDK + Tauri 2, supporting 5 LLM providers and HITL approval for dangerous operations.


Why I Built This

When Claude Cowork launched, I thought: “I can build this as open source.” Anthropic’s Deep Agents SDK was already Apache 2.0, and Tauri 2’s sidecar feature could embed a Python backend into a desktop app. Four clear primitives stood out:

| Component | Role |
| --- | --- |
| Planning Tool | Task decomposition and priority management |
| Subagents | Domain-specialized isolated workers |
| Virtual Filesystem | Shared memory between agents |
| System Prompt | Behavioral guidelines for complex scenarios |

Building it myself means no model lock-in, local LLM support, and full control over prompts.

Architecture

DeepCoWork Architecture

Three Layers

  1. Tauri (Rust) — Native window, Python process management, CSP security
  2. React (TypeScript) — Chat UI, SSE streaming, Zustand state
  3. FastAPI (Python) — Deep Agents SDK, LLM calls, tool execution, HITL

Tauri spawns the Python backend as a sidecar process, and the frontend receives real-time tokens via SSE.
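SSE is a plain-text protocol: each event is a block of `data:` lines terminated by a blank line. The real client is the React frontend, but the wire format the FastAPI backend emits can be illustrated with a minimal Python parser (a sketch of the protocol, not DeepCoWork's actual code):

```python
def parse_sse(stream_text: str) -> list[str]:
    """Parse a raw SSE stream into a list of event payloads.

    Illustrative only: shows the framing the backend's SSE endpoint
    emits, one `data:` block per streamed token chunk.
    """
    events = []
    for block in stream_text.split("\n\n"):
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            # Multi-line data fields are joined with newlines per the SSE spec
            events.append("\n".join(data_lines))
    return events

raw = 'data: {"token": "Hel"}\n\ndata: {"token": "lo"}\n\ndata: [DONE]\n\n'
print(parse_sse(raw))  # ['{"token": "Hel"}', '{"token": "lo"}', '[DONE]']
```

The frontend simply appends each token payload to the chat view until it sees the terminal sentinel.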

Core Code: Agent Creation

The agent core lives in a single file — agent_core.py:

```python
from deepagents import create_deep_agent
from deepagents.backends import LocalShellBackend

def build_agent(workspace_dir, checkpointer, mode, tools):
    llm = build_llm()  # picks from 5 providers
    # `prompt` is the mode-specific system prompt, selected from `mode`
    # (elided here)

    backend = LocalShellBackend(
        root_dir=str(workspace_dir),
        virtual_mode=False,
        timeout=60,
        max_output_bytes=50_000,
    )

    return create_deep_agent(
        model=llm,
        tools=tools,
        backend=backend,
        interrupt_on={
            "write_file": True,
            "edit_file": True,
            "execute": True,
        },
        checkpointer=checkpointer,
        system_prompt=prompt,
        skills=["skills/"],
    )
```

What create_deep_agent() handles internally: wiring up the planning tool, subagents, the virtual filesystem, and the system prompt — the four primitives described above.

Key Features

4 Execution Modes

| Mode | Role | Behavior |
| --- | --- | --- |
| Clarify | Requirements gathering | Investigates first, asks only essential questions |
| Code | Pair programming | Minimal changes, follows existing patterns |
| Cowork | Autonomous execution | Creates plan.md → executes step by step |
| ACP | Multi-agent | Delegates everything to sub-agents |
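One plausible way to wire these modes is a simple lookup that build_agent's mode argument indexes into. This is a hypothetical sketch; the actual prompt text lives in the repo and is much longer:

```python
# Hypothetical sketch: per-mode system prompt fragments keyed by the
# `mode` argument that build_agent() receives.
MODE_PROMPTS = {
    "clarify": "Investigate the workspace first; ask only essential questions.",
    "code":    "Make minimal changes and follow existing code patterns.",
    "cowork":  "Write a plan.md, then execute it step by step.",
    "acp":     "Delegate all work to specialized sub-agents.",
}

def select_prompt(mode: str, base: str = "You are DeepCoWork.") -> str:
    try:
        return f"{base}\n\n{MODE_PROMPTS[mode]}"
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None

print(select_prompt("cowork"))
```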

Human-in-the-Loop (HITL)

Dangerous operations like file writes and shell execution are never auto-executed:

```
Agent calls write_file → interrupt_on triggers
  → Frontend shows approval modal
  → User: Approve or Reject
  → Approved: execute tool
  → Rejected: agent finds alternative approach
```

30-second timeout with auto-reject — safe even if left unattended.
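The timeout behavior can be sketched with asyncio. The function names here are assumptions for illustration, not DeepCoWork's actual API:

```python
import asyncio

async def wait_for_approval(decision: asyncio.Future, timeout: float = 30.0) -> str:
    """Wait for the user's Approve/Reject click; auto-reject on timeout.

    Hypothetical sketch of the HITL gate: `decision` would be resolved
    by the frontend's approval modal.
    """
    try:
        return await asyncio.wait_for(decision, timeout)
    except asyncio.TimeoutError:
        # Left unattended: fail safe by rejecting the tool call
        return "rejected"

async def demo() -> str:
    loop = asyncio.get_running_loop()
    decision = loop.create_future()  # never resolved, simulating no user present
    return await wait_for_approval(decision, timeout=0.05)

print(asyncio.run(demo()))  # rejected
```

Failing closed (reject) rather than open (approve) is what makes the unattended case safe.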

Skills System

Drop a SKILL.md file in ~/.cowork/workspace/skills/ and the agent gains new capabilities:

```markdown
---
name: code-review
description: Performs systematic code review
allowed-tools: read_file glob grep execute
---

# Code Review Skill
## When to Use
- When the user requests a code review
...
```

The agent sees metadata first, reads full instructions only when needed (progressive disclosure).
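Progressive disclosure only needs the frontmatter up front. A minimal stdlib parser for the flat `key: value` frontmatter shown above could look like this (a sketch; the SDK's actual loader may handle full YAML):

```python
def parse_skill_metadata(skill_md: str) -> dict:
    """Extract SKILL.md frontmatter without reading the body.

    Minimal sketch: handles flat `key: value` pairs between the two
    `---` delimiters, not full YAML.
    """
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter; body starts here
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

skill = """---
name: code-review
description: Performs systematic code review
allowed-tools: read_file glob grep execute
---

# Code Review Skill
"""
print(parse_skill_metadata(skill)["name"])  # code-review
```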

5 LLM Providers

| Provider | Type | Model Selection |
| --- | --- | --- |
| Anthropic | Cloud | Manual text input |
| OpenRouter | Cloud | Manual text input |
| Ollama | Local | Auto-fetched from server |
| LM Studio | Local | Auto-fetched from server |
| vLLM | Local | Auto-fetched from server |

Local providers auto-detect available models via /v1/models API.
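All three local servers follow the OpenAI API shape for this endpoint, so model discovery is just extracting the `id` fields from the JSON list. The sketch below uses a canned response rather than a live server:

```python
import json

def extract_model_ids(models_json: str) -> list[str]:
    """Pull model ids from an OpenAI-style GET /v1/models response."""
    payload = json.loads(models_json)
    return [m["id"] for m in payload.get("data", [])]

# Canned response in the shape Ollama / LM Studio / vLLM return
response = json.dumps({
    "object": "list",
    "data": [
        {"id": "llama3.1:8b", "object": "model"},
        {"id": "qwen2.5-coder:7b", "object": "model"},
    ],
})
print(extract_model_ids(response))  # ['llama3.1:8b', 'qwen2.5-coder:7b']
```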

Tech Stack

| Layer | Technology | Role |
| --- | --- | --- |
| Desktop | Tauri 2 (Rust) | Window, process management, CSP |
| Frontend | React 19 + Zustand | UI, state management |
| Styling | Tailwind CSS 4 | Dark/Light theme |
| Backend | FastAPI + uvicorn | REST + SSE |
| Agent | Deep Agents SDK | ReAct loop, tools, HITL |
| LLM | LangChain | Provider abstraction |
| DB | SQLite (LangGraph) | Checkpointer, thread metadata |
| Build | PyInstaller + GitHub Actions | Cross-platform |

Install & Run

Grab the installer for your OS from GitHub Releases.

No Python installation needed — the agent server is bundled via PyInstaller.

Development Mode

```shell
git clone https://github.com/BAEM1N/deep-cowork.git
cd deep-cowork

# Frontend dependencies
cd app && npm install
# Python backend in a virtualenv
cd ../agent && python -m venv .venv && source .venv/bin/activate
pip install -e .

# Minimal .env for the agent server
echo "LLM_PROVIDER=openrouter" > .env
echo "OPENROUTER_API_KEY=sk-or-..." >> .env
echo "MODEL_NAME=anthropic/claude-sonnet-4-5" >> .env

cd ../app && npm run tauri dev
```

Series Preview

This series deep-dives into each layer of DeepCoWork:

  1. This post — Introduction & architecture overview
  2. Tauri + Python sidecar architecture
  3. Deep Agents SDK internals
  4. System prompt design per mode
  5. SSE streaming pipeline
  6. HITL approval flow
  7. Multi-agent ACP mode
  8. The four layers of agent memory
  9. Skills system
  10. LLM provider integration
  11. Security checklist
  12. GitHub Actions cross-platform build

Source code: github.com/BAEM1N/deep-cowork

Benchmark

| Metric | Value |
| --- | --- |
| Tauri app binary (excluding sidecar) | ~12MB (macOS arm64) |
| PyInstaller sidecar binary | ~95MB (DeepAgents + LangChain + FastAPI) |
| Total .dmg installer size | ~110MB |
| Cold start (app launch to first chat ready) | ~4.2s (M1 Mac) |
| Idle memory usage | ~180MB (Tauri ~45MB + Python ~135MB) |

FAQ

How is DeepCoWork different from Claude Cowork?

Claude Cowork is Anthropic’s closed-source product. DeepCoWork uses the same Deep Agents SDK but is fully open-source (MIT), model-agnostic, and supports local LLMs.

Why Tauri over Electron?

Smaller binary (10MB vs 150MB+), lower memory usage, and Rust gives stable Python process management.

Why Deep Agents SDK instead of building with create_react_agent?

create_react_agent’s interrupt_before only accepts node names — not tool names — so per-tool HITL is impossible. DeepAgents’ interrupt_on supports fine-grained tool-level control, and LocalShellBackend provides 8 filesystem/shell tools out of the box.

Does it work with local LLMs?

Tested with Ollama + llama3.1 8B. Basic file operations work, but complex multi-agent tasks need Claude/GPT-4 class models.


AI-assisted content