A practical guide to sub-agents, skills, tools, and MCP integration. Built for OpenClaw agents. Relevant to any agentic system.
A multi-agent system (MAS) is an architecture where multiple AI agents work together to complete tasks that would be too complex, too slow, or too large for a single agent operating alone. Each agent has its own context window, its own tools, and its own scope of responsibility.
Think of it like a team of specialists versus one generalist. The generalist knows a bit of everything. The specialist team has a coordinator who knows how to delegate, and workers who go deep on their assigned lane.
Every multi-agent system has the same structural layers, regardless of framework:
| Layer | What it does |
|---|---|
| Lead Agent | The orchestrator. Receives the task, creates a strategy, delegates to sub-agents, synthesises results. Uses the most capable (and expensive) model. |
| Sub-Agents | Specialist workers. Each assigned a narrow scope. Run in parallel. Have their own tools, prompts, and context windows. |
| Tools | Built-in capabilities: web search, file read/write, browser, shell commands, API calls. |
| Skills | Playbooks that teach an agent how to use tools. A skill is a SKILL.md file: YAML metadata + markdown instructions. |
| Memory | How agents persist information across sessions. Can be static files (MEMORY.md), semantic retrieval, or in-context state. |
| MCP Servers | Standardised connectors to external systems — one server each for GitHub, Slack, Google Drive, and so on. Any MCP-compatible agent can use them. |
Multi-agent systems burn through tokens fast — roughly 15x more than a standard chat interaction. Use them only when the task justifies it.
| Scenario | Recommendation |
|---|---|
| Simple tasks, quick lookups, one clear answer, tight budgets | Use single agent |
| Broad research, parallel workstreams, tasks too large for one context window, clearly separable subtasks | Use multi-agent |
| 4–5 sub-agents, 5–8 tasks each | Sweet spot. Beyond 5 specialists, coordination overhead cancels out the parallelism benefit. |
- Simple fact check: 1 agent, 3–10 tool calls.
- Direct comparison: 2–4 sub-agents, 10–15 calls each.
- Complex research problem: 10+ sub-agents with clearly divided responsibilities.
Embed these in your lead agent prompt.
In OpenClaw (and Claude Code), a sub-agent is not a background process or separate service. It is a clearly defined role that the agent temporarily adopts to perform one specific job under strict rules. You invoke sub-agents by assigning tasks through your lead agent's prompt.
Your lead agent needs explicit instructions on how to decompose tasks and delegate. Without this, sub-agents either duplicate each other's work or leave gaps.
Required elements:
```
You are a lead research agent. When given a query:

1. DECOMPOSE: Break the query into 3-5 independent research angles.

2. DELEGATE: Assign one angle to each sub-agent. Give each:
   - A concrete objective (one sentence)
   - Boundaries (what NOT to cover)
   - Required output format (JSON: findings, sources, confidence)
   - Tool budget (max 10 tool calls per sub-agent)

3. SYNTHESISE: Once all sub-agents return, compile results.
   Remove duplicates. Rank by confidence. Write final answer.

Scaling rules:
- Simple fact: 1 sub-agent, max 5 tool calls
- Comparison: 2-3 sub-agents, max 10 tool calls each
- Complex: up to 8 sub-agents, max 15 tool calls each
```
Each sub-agent needs four things to work without confusion:
| Element | Description |
|---|---|
| Concrete objective | One sentence. Specific. Measurable. "Find the top 5 real estate agencies in Tauranga by number of listings, as of April 2026." |
| Task boundaries | What this sub-agent should NOT do. Prevents overlap with other sub-agents. |
| Output format | Exact structure the lead agent expects. JSON, markdown table, bullet list. Specify it. |
| Tool guidance | Which tools to use and in what order. "Start with web search, then use browser to verify top results." |
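The four elements above can be captured in a single delegation payload the lead agent fills in per sub-agent. A minimal TypeScript sketch — the `SubAgentTask` type and field names are illustrative, not an OpenClaw API:

```typescript
// Hypothetical shape for a sub-agent task brief; field names are illustrative.
interface SubAgentTask {
  objective: string;      // one sentence, specific, measurable
  boundaries: string[];   // what NOT to cover
  outputFormat: string;   // exact structure the lead agent expects back
  toolGuidance: string[]; // which tools to use, in which order
  toolBudget: number;     // max tool calls before stopping
}

const task: SubAgentTask = {
  objective:
    "Find the top 5 real estate agencies in Tauranga by number of listings, as of April 2026.",
  boundaries: ["Do not research pricing or commission structures."],
  outputFormat: "JSON: { findings, sources, confidence }",
  toolGuidance: ["web search first", "browser to verify top results"],
  toolBudget: 10,
};
```

Handing each sub-agent a structured brief like this makes gaps and overlaps visible before any tokens are spent.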
Anthropic found the best-performing sub-agents run an explicit OODA loop. Embed this in any sub-agent prompt:
```
Research loop — repeat until task complete:

OBSERVE: What information do I have? What gaps remain? What tools are available?
ORIENT: Which tools and queries would best fill the gaps?
DECIDE: Choose ONE specific tool call. Start broad, then narrow.
ACT: Execute the tool call. Record the result. Loop back to OBSERVE.

Stop when: task objective is met OR tool budget is reached.
```
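The loop above is plain control flow. A hedged TypeScript sketch — `observe`, `orient`, and `act` are stand-in callbacks, not real OpenClaw APIs:

```typescript
// Minimal OODA driver; all helper functions are hypothetical stand-ins.
type ToolCall = { tool: string; query: string };

function runResearchLoop(
  budget: number,
  observe: () => string[],              // OBSERVE: returns remaining gaps
  orient: (gaps: string[]) => ToolCall, // ORIENT + DECIDE: pick ONE next call
  act: (call: ToolCall) => void         // ACT: execute and record the result
): number {
  let calls = 0;
  while (calls < budget) {              // stop: tool budget reached
    const gaps = observe();
    if (gaps.length === 0) break;       // stop: task objective met
    act(orient(gaps));                  // one call per iteration, then re-observe
    calls++;
  }
  return calls;
}
```

The two stop conditions from the prompt map directly onto the `break` (objective met) and the `while` bound (budget reached).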
In OpenClaw, sub-agents are implemented through the multi-agent routing system. The workspace can route inbound tasks to isolated agents with separate sessions and tool profiles.
Create a separate OpenClaw workspace for each specialist agent. Each workspace has its own SOUL.md, AGENTS.md, and MEMORY files. The lead agent routes tasks to them via the Gateway's multi-agent routing layer.
```
~/.openclaw/workspace/
  theo/             ← Lead agent
    SOUL.md
    AGENTS.md
    MEMORY.md
  content-agent/    ← Sub-agent: content creation only
    SOUL.md
    AGENTS.md
  leads-agent/      ← Sub-agent: lead research and scoring
    SOUL.md
    AGENTS.md
  outreach-agent/   ← Sub-agent: message drafting
    SOUL.md
    AGENTS.md
```
Define a skill that teaches the lead agent how to invoke sub-agent behaviour within a single session. Lighter weight, better for task-specific delegation within one context window.
```markdown
# sub-agent-research.skill/SKILL.md
---
name: sub_agent_research
description: Spawn a focused research sub-agent for a specific topic.
---

## When to use
When asked for research that requires multiple independent angles.

## How to execute
1. Decompose the query into 3-5 independent research angles.
2. For each angle, run a focused OODA research loop (max 8 tool calls).
3. Record each angle's findings separately before synthesising.
4. Return: findings per angle, key sources, confidence, final synthesis.

## Output format
Return JSON:
{ angles: [{topic, findings, sources, confidence}], synthesis: string }
```
| Mistake | Fix |
|---|---|
| Vague objectives | Sub-agents with unclear scope duplicate work or leave gaps. Every sub-agent needs a one-sentence, specific objective. |
| No output format | If you don't specify output format, the lead agent can't synthesise cleanly. Always define the expected structure. |
| Too many specialists | Teams larger than 5 specialists hit coordination overhead that cancels out parallelism. Start with 3. |
| Wrong tool for context | An agent searching the web for context that only exists in Slack is doomed. Match tools to where the data actually lives. |
| No scaling rules | Without scaling rules, agents over-invest in simple tasks or under-invest in complex ones. |
In OpenClaw and Claude Code, a skill is not code. It is a folder containing a SKILL.md file — YAML frontmatter for metadata and markdown for instructions. Skills teach the agent how to use tools in a disciplined, repeatable way.
A tool without a skill is raw capability. A skill without a tool is instructions with no hands.
An example SKILL.md for a lead-scraper skill:

```markdown
---
name: lead_scraper
description: Find NZ businesses by profession and location. Returns phone numbers.
requires:
  env: [GOOGLE_PLACES_API_KEY]
  tools: [exec, web]
---

## When to use this skill
When asked to find leads in a specific trade or location.

## How to execute
1. Ask for: profession (e.g. real estate agent), location (e.g. Tauranga).
2. Run the scraper:
   `node scripts/lead-scraper.js --profession '{profession}' --location '{location}'`
3. Output returns businesses with: name, phone, address, website status.
4. Filter: businesses WITHOUT a website are highest priority leads.
5. Format results as a table: Name | Phone | Address | Priority.

## Stop conditions
Stop if Google Places API returns an error. Report the error immediately.

## Output format
Markdown table. Max 20 results. Sorted by priority (no website = HIGH).
```
OpenClaw loads skills in this order — higher priority overrides lower:
| Priority | Location | Use |
|---|---|---|
| 1 | <workspace>/.openclaw/skills/ | Per-agent. Highest priority. Use for agent-specific customisations. |
| 2 | ~/.openclaw/skills/ | Shared across all agents on the host. |
| 3 | Bundled skills | Built-in skills provided with OpenClaw. Cannot be modified but can be overridden by workspace or managed skills. |
If a skill with the same name exists in multiple locations, the higher-priority source wins. You can override any bundled skill by creating a workspace skill with the same name.
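Name resolution reduces to a first-match lookup over the three sources, highest priority first. An illustrative TypeScript sketch — not OpenClaw's actual loader:

```typescript
// Hypothetical skill record; only `name` matters for resolution.
interface Skill {
  name: string;
  source: "workspace" | "shared" | "bundled";
}

// `sources` must be ordered highest priority first, mirroring the table above:
// workspace > shared > bundled. First match wins.
function resolveSkill(name: string, sources: Skill[][]): Skill | undefined {
  for (const source of sources) {
    const hit = source.find((s) => s.name === name);
    if (hit) return hit; // higher-priority source shadows the rest
  }
  return undefined;
}
```

This is why a workspace skill with the same `name` as a bundled skill silently replaces it: the search never reaches the bundled tier.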
Most weak skills fail because the body reads like marketing copy. The agent needs a runbook with deterministic steps, stop conditions, and a clear output format.
| Element | What good looks like |
|---|---|
| Good skill body | Reads like a checklist you would hand to a tired engineer at 3am. Step-by-step. Stop conditions. Exact output format. |
| Bad skill body | Generic description of what the skill does. No steps. No output spec. Agent improvises = inconsistent results. |
| Good description | Short and specific. If your description overlaps with another skill's, the agent will pick the wrong one. |
| Bad description | "Helps with research tasks." Overlaps with everything. |
| Required fields | name, description, tools. Everything else is optional but recommended. |
For AI consultancy and AgentNZ operations, build skills in order of revenue impact.
The Model Context Protocol (MCP) is an open standard, originally built by Anthropic in November 2024 and now maintained by the Linux Foundation, that solves the N×M integration problem.
Before MCP: every AI agent needed a custom connector to every external tool. GitHub, Slack, Google Drive, Postgres, Airtable — all different implementations. As the number of agents and tools grew, the complexity was unsustainable.
With MCP: every tool builds one server. Every agent connects to any server using the same protocol. N agents + M tools = N+M integrations instead of N×M.
MCP is USB-C for AI agents. Instead of every agent having its own proprietary connector for every tool, MCP gives you one universal plug. Any MCP-compatible agent can connect to any MCP server — GitHub, Slack, Google Drive, HubSpot, Postgres, Xero. Build the server once, use it everywhere.
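The arithmetic is worth spelling out. With 10 agents and 20 tools, point-to-point wiring needs 10 × 20 = 200 custom connectors, while MCP needs 10 + 20 = 30:

```typescript
// Integration count: custom point-to-point connectors vs MCP.
const agents = 10;
const tools = 20;

const pointToPoint = agents * tools; // every agent needs its own connector to every tool
const withMcp = agents + tools;      // one MCP client per agent + one server per tool

console.log(pointToPoint); // 200
console.log(withMcp);      // 30
```

The gap widens as the ecosystem grows: doubling both counts quadruples the point-to-point wiring but only doubles the MCP integration count.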
| Component | Role |
|---|---|
| MCP Host | The AI application (Claude, OpenClaw, Claude Code, ChatGPT). Initiates connections to MCP servers. |
| MCP Client | Lives inside the host. Maintains a 1:1 connection to each server. Handles JSON-RPC message passing. |
| MCP Server | A small service that exposes tools, resources, and prompts to any connected client. Built once, usable by any host. |
Communication uses JSON-RPC 2.0. The session stays open as long as needed. The agent lists available tools, calls them, and the server returns results. Parallel tool calls are supported in the November 2025 spec.
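Concretely, a tool invocation is a JSON-RPC 2.0 request using MCP's `tools/call` method, answered by a result keyed to the same `id`. A sketch of the wire format — the tool name and arguments here are illustrative:

```typescript
// A JSON-RPC 2.0 request as sent from an MCP client to a server.
const request = {
  jsonrpc: "2.0" as const,
  id: 1,
  method: "tools/call",
  params: {
    name: "get_leads", // illustrative tool name
    arguments: { location: "Tauranga", limit: 20 },
  },
};

// The server's reply carries the matching id so the client can pair
// responses with in-flight requests.
const response = {
  jsonrpc: "2.0" as const,
  id: 1,
  result: { content: [{ type: "text", text: "[]" }] },
};
```

Listing tools works the same way with the `tools/list` method; the matching `id` fields are what make parallel in-flight calls safe to correlate.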
| Primitive | What it does |
|---|---|
| Tools | Functions the agent can call to take actions. Send an email. Create a task. Query a database. Run a search. |
| Resources | Data the agent can read. Files, database records, API responses, document content. |
| Prompts | Reusable prompt templates the agent can retrieve and use. Useful for standardising workflows. |
| Server | Capability |
|---|---|
| GitHub | Read repos, create issues, manage PRs, search code. Essential for any dev-adjacent work. |
| Google Drive | Read, search, and organise files. Useful for client document management. |
| Slack | Read channels, post messages, search history. Useful for team comms automation. |
| Gmail | Read, send, search emails. Foundation for outreach automation. |
| Google Calendar | Create, read, update events. Useful for booking and scheduling flows. |
| Postgres | Query and write to a Postgres database directly. Useful for lead tracking and CRM. |
| HubSpot | List and create contacts, log engagements, manage pipeline. Sales automation foundation. |
| Puppeteer / Browser | Full browser automation. Screenshot, click, fill forms, scrape. Powerful — use with caution. |
| Airtable | Read and write Airtable bases. Good lightweight CRM alternative. |
```yaml
# In your OpenClaw config (config.yaml or via openclaw settings):
mcp_servers:
  - name: github
    type: stdio
    command: npx
    args: ['-y', '@modelcontextprotocol/server-github']
    env:
      GITHUB_TOKEN: your_github_token  # placeholder — never commit a real token
  - name: gmail
    type: url
    url: https://gmail.mcp.claude.com/mcp
  - name: google-calendar
    type: url
    url: https://gcal.mcp.claude.com/mcp
```
Once connected, your lead agent can list available tools from each server and use them in any skill or sub-agent workflow. No additional code required.
If you need to connect to a system without an existing server (like a custom Supabase endpoint or Xero), you build one. The MCP SDK makes this straightforward:
```typescript
// Install: npm install @modelcontextprotocol/sdk zod
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const server = new McpServer({ name: 'agentnz-crm', version: '1.0.0' });

// Define a tool. `getLeadsFromSupabase` is your own data-access function,
// not part of the SDK — implement it against your Supabase endpoint.
server.tool(
  'get_leads',
  { location: z.string(), limit: z.number().default(20) },
  async ({ location, limit }) => {
    const leads = await getLeadsFromSupabase(location, limit);
    return { content: [{ type: 'text', text: JSON.stringify(leads) }] };
  }
);

// Start the server over stdio (top-level await requires an ES module).
const transport = new StdioServerTransport();
await server.connect(transport);
```
Claude Sonnet is particularly good at generating MCP server implementations quickly. Give it the API docs for the service you want to connect, and ask it to build the server.
Research in 2025 found thousands of MCP servers exposed to the internet with no authentication. Over-permissioning is the most common failure mode. When Replit's AI agent deleted a production database of 1,200+ records, the root cause was MCP permissions that were too broad. Scope your permissions tightly.
| Rule | Why it matters |
|---|---|
| Use OAuth scopes | Request only the permissions each tool actually needs. Read-only where possible. |
| Never expose servers publicly | Run MCP servers locally or on private infrastructure. Not on public internet without auth. |
| Audit tool descriptions | Agents trust tool descriptions. Malicious descriptions can cause agents to take unintended actions (prompt injection). |
| Human in the loop | For irreversible actions (delete, send, purchase), require explicit user confirmation before the MCP tool fires. |
| Rotate credentials | Treat MCP server credentials like API keys. Rotate regularly. Never hardcode in prompts. |
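The human-in-the-loop rule can be enforced with a thin wrapper that gates irreversible tools behind a confirmation callback before the call fires. A hedged sketch — the tool names and callback shape are illustrative, not an OpenClaw or MCP API:

```typescript
// Tools treated as irreversible; extend to match your own MCP servers.
const IRREVERSIBLE = new Set(["delete_record", "send_email", "make_purchase"]);

function guardedCall(
  tool: string,
  run: () => string,                 // the actual MCP tool invocation
  confirm: (tool: string) => boolean // ask the human before irreversible actions
): string {
  if (IRREVERSIBLE.has(tool) && !confirm(tool)) {
    return "blocked: user declined"; // never fire without explicit consent
  }
  return run();
}
```

Read-only tools pass straight through; destructive ones cannot execute unless the confirmation callback returns true.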
Here is how all four layers — agents, sub-agents, skills, MCP — combine into a working system: a full workflow from lead discovery to booked meeting.
Autonomy is earned through milestones, not assumed upfront. Each milestone unlocks more autonomous operation:
| Milestone | Autonomy level |
|---|---|
| $100 — prove the concept | Manual approval on every outreach and booking. |
| $1,000 — prove repeatability | Can send pre-approved message templates without per-message approval. |
| $10,000 — prove scale | Can initiate lead research and draft outreach autonomously. Approval only at send stage. |
| $100,000 — prove the business | Can run complete lead-to-demo workflows. Review weekly summaries. |
| $1,000,000 — category dominance | Operates marketplace and sub-agent network with minimal intervention. |
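One way to operationalise the ladder is a lookup from cumulative revenue to an autonomy tier. The thresholds mirror the table above; the tier names are made up for illustration:

```typescript
// Map cumulative revenue (NZD) to the autonomy tier from the table above.
// Tier names are illustrative labels, not an OpenClaw setting.
function autonomyTier(revenue: number): string {
  if (revenue >= 1_000_000) return "minimal-intervention";
  if (revenue >= 100_000) return "weekly-review";
  if (revenue >= 10_000) return "approval-at-send";
  if (revenue >= 1_000) return "pre-approved-templates";
  return "manual-approval";
}
```

Checking the tier before each send or booking action keeps the agent from quietly operating above its earned autonomy level.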
The fastest path to first revenue is not building new tools. It is using what already exists (lead scraper, Telegram, research skills) to book one real estate demo this week. Every other build decision should be measured against that single benchmark.
| Resource | URL |
|---|---|
| OpenClaw GitHub | github.com/openclaw/openclaw |
| ClawHub (skill registry) | clawhub.ai |
| MCP Documentation | modelcontextprotocol.io |
| MCP Spec (Nov 2025) | modelcontextprotocol.io/specification/2025-11-25 |
| Anthropic multi-agent blog | anthropic.com/engineering/multi-agent-research-system |