
Building Custom ML Agent Orchestration: From Theory to Pi Coding Agent
I wanted to build something like oh-my-opencode, but fully customizable. My own agent orchestration system where I control the routing, the models, the skills—everything.
So I researched. I dug through LangGraph's architecture. I traced through oh-my-opencode's source code line by line. Then I implemented it all in pi-coding-agent.
What I found changed how I think about agent systems. The intelligence isn't in the code. It's in the prompts. And the real power? Model routing.
The Research
I started by launching parallel research agents—my usual approach for unfamiliar territory. One searched framework patterns (LangGraph, CrewAI, AutoGen). One traced through oh-my-opencode's TypeScript source.
The findings converged on a surprising insight: the most successful agent implementations use simple, composable patterns—not complex frameworks.
From my research:
"Start simple, add complexity only when measurable improvement is demonstrated. Frameworks create abstraction layers that obscure control flow and hinder debugging."
The frameworks I initially considered—LangGraph, CrewAI, AutoGen—are useful for their primitives. But the orchestration logic? That's just configuration and prompts.
The Core Pattern: Agents as Configuration
Here's the key insight from oh-my-opencode: an agent is just a configuration object.
interface AgentConfig {
  description: string; // Short description for routing
  model: string; // e.g., "openai/gpt-4o"
  temperature?: number;
  tools?: { write?: boolean }; // Tool restrictions
  prompt: string; // The agent's "brain"
}
That's it. The "Oracle" agent isn't special code—it's a read-only model with a specific system prompt. The "Explore" agent isn't magic—it's a fast model restricted to search tools with instructions to fire 3+ tools in parallel.
The intelligence comes from the prompt, not the implementation.
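To make the "agent as configuration" idea concrete, here is a minimal sketch of two agents written as plain objects. The model IDs, prompt text, default values, and the `buildRequest` helper are my own illustrative assumptions, not oh-my-opencode's actual definitions.

```typescript
// Same shape as the AgentConfig interface above.
interface AgentConfig {
  description: string;
  model: string;
  temperature?: number;
  tools?: { write?: boolean };
  prompt: string;
}

// Hypothetical agents: model IDs and prompt text are placeholders.
const oracle: AgentConfig = {
  description: "Read-only advisor for hard design questions",
  model: "openai/gpt-4o",
  temperature: 0.1,
  tools: { write: false }, // read-only: no file writes
  prompt: "You are a senior engineer. Analyze and advise; never modify files.",
};

const explore: AgentConfig = {
  description: "Fast codebase search",
  model: "example/fast-model", // placeholder ID
  tools: { write: false },
  prompt: "Find relevant code quickly. Fire 3+ search tools in parallel.",
};

// Hypothetical request shape: whatever your model client expects.
interface ChatRequest {
  model: string;
  temperature: number;
  allowWrite: boolean;
  messages: { role: "system" | "user"; content: string }[];
}

// Turns a config plus a user task into a request. No agent-specific code paths.
function buildRequest(agent: AgentConfig, task: string): ChatRequest {
  return {
    model: agent.model,
    temperature: agent.temperature ?? 0.7, // assumed default
    allowWrite: agent.tools?.write ?? true, // assumed: writes allowed unless restricted
    messages: [
      { role: "system", content: agent.prompt },
      { role: "user", content: task },
    ],
  };
}

const req = buildRequest(oracle, "Should we split this service?");
console.log(req.model, req.allowWrite);
```

Both agents go through the same `buildRequest` function. The only thing that differs between them is data.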
Model Selection as Architecture Decision
This is the insight that changed everything for me: different tasks need different models.
Not "use the best model for everything." Not "use the cheapest model for everything." Use the right model for the task at hand.
Here's my routing table:
| Task Complexity | Model | Why |
|---|---|---|
| Trivial (typos, simple fixes) | Gemini 3 Flash | Speed matters, quality doesn't |
| Medium (standard implementation) | Claude Sonnet 4.6 | Good balance of speed and quality |
| Complex (multi-file, tricky logic) | Claude Opus 4.6 | Need best reasoning |
| Expert (novel problems, research) | GPT-5.4 | Maximum capability |
Why This Matters
1. Cost control without quality sacrifice
Running everything through Opus 4.6 would cost 30x more than routing trivial tasks to Flash. For a typo fix, Flash is better—it's faster, and the quality difference is irrelevant.
2. Speed where it matters
Exploration agents fire constantly during my workflow. If each one took 30 seconds (Opus) instead of 3 seconds (Flash), I'd stop using them. Fast models enable high-frequency delegation.
3. Quality where it matters
When I'm debugging a race condition after two failed attempts, I don't want Flash. I want the best reasoning model available, cost be damned.
Implementing It in Pi Coding Agent
Armed with this understanding, I configured pi-coding-agent (v0.52.7) to implement these patterns. Everything lives in ~/.pi/agent/:
~/.pi/agent/
├── settings.json          # Main config
├── orchestrator.json      # Model routing & sub-agents
├── keybindings.json       # Custom shortcuts
├── AGENTS.md              # Global context
├── extensions/
│   ├── orchestrator.ts    # Orchestration logic
│   └── permission-gates.ts
├── skills/
│   ├── handoff/SKILL.md
│   ├── code-review/SKILL.md
│   └── ...
└── prompts/
    ├── commit-message.md
    └── pr-description.md
The Model Routing Config
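In orchestrator.json, I implemented the routing table. The config differs from the table in two ways: it adds a "simple" tier that shares Flash with "trivial", and it points the expert tier at gemini-3-pro-high instead of GPT-5.4: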
{
  "modelRouting": {
    "trivial": {
      "provider": "google-antigravity",
      "model": "gemini-3-flash"
    },
    "simple": {
      "provider": "google-antigravity",
      "model": "gemini-3-flash"
    },
    "medium": {
      "provider": "github-copilot",
      "model": "claude-sonnet-4.6"
    },
    "complex": {
      "provider": "github-copilot",
      "model": "claude-opus-4.6"
    },
    "expert": {
      "provider": "google-antigravity",
      "model": "gemini-3-pro-high"
    }
  },
  "complexitySignals": {
    "trivial": ["^what is", "^where is", "^show me", "typo"],
    "simple": ["rename", "format", "lint", "add comment"],
    "medium": ["add function", "fix bug", "refactor"],
    "complex": ["architect", "migrate", "security review"],
    "expert": ["race condition", "memory leak", "distributed"]
  }
}
The orchestrator extension classifies prompts by matching against these signals, then routes to the appropriate model.
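Here is a rough sketch of what that classification step could look like. The routing and signal values are copied from the config above. The function names, the hardest-tier-first matching order, the case-insensitive matching, and the "medium" fallback are assumptions of mine, not necessarily what the extension does.

```typescript
type Complexity = "trivial" | "simple" | "medium" | "complex" | "expert";

interface ModelTarget {
  provider: string;
  model: string;
}

// Values copied from orchestrator.json.
const modelRouting: Record<Complexity, ModelTarget> = {
  trivial: { provider: "google-antigravity", model: "gemini-3-flash" },
  simple: { provider: "google-antigravity", model: "gemini-3-flash" },
  medium: { provider: "github-copilot", model: "claude-sonnet-4.6" },
  complex: { provider: "github-copilot", model: "claude-opus-4.6" },
  expert: { provider: "google-antigravity", model: "gemini-3-pro-high" },
};

const complexitySignals: Record<Complexity, string[]> = {
  trivial: ["^what is", "^where is", "^show me", "typo"],
  simple: ["rename", "format", "lint", "add comment"],
  medium: ["add function", "fix bug", "refactor"],
  complex: ["architect", "migrate", "security review"],
  expert: ["race condition", "memory leak", "distributed"],
};

// Assumption: check the hardest tier first, so a prompt that mentions both
// "fix bug" and "race condition" routes to expert rather than medium.
const TIERS_HARDEST_FIRST: Complexity[] = [
  "expert",
  "complex",
  "medium",
  "simple",
  "trivial",
];

// Assumption: signals are treated as case-insensitive regular expressions,
// and prompts that match nothing fall back to "medium".
function classify(prompt: string, fallback: Complexity = "medium"): Complexity {
  const text = prompt.trim();
  for (const tier of TIERS_HARDEST_FIRST) {
    const hit = complexitySignals[tier].some((signal) =>
      new RegExp(signal, "i").test(text),
    );
    if (hit) return tier;
  }
  return fallback;
}

// The routing itself is a plain lookup.
function routeModel(prompt: string): ModelTarget {
  return modelRouting[classify(prompt)];
}

console.log(routeModel("fix bug in the race condition handler").model);
```

One caveat with substring signals: "format" also matches inside "information", so a prompt like "show me the information" would land in "simple" under this matching order. Anchors (`^`) or word boundaries (`\b`) in the signal strings tighten that up.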
Sub-Agents
I configured specialized sub-agents that trigger on specific patterns:
{
  "subAgents": {
    "reviewer": {
      "enabled": true,
      "thinkingLevel": "medium",
      "triggers": ["review", "check.*code", "pr\\s+review"],
      "systemPrompt": "You are a code review specialist. Focus on logic correctness, security vulnerabilities, performance issues, type safety, maintainability..."
    },
    "debugger": {
      "enabled": true,
      "thinkingLevel": "high",
      "triggers": ["debug", "fix.*bug", "not\\s+working"],
      "systemPrompt": "You are a debugging expert. Follow: Reproduce → Isolate → Hypothesize → Test → Fix..."
    },
    "explainer": {
      "enabled": true,
      "provider": "google-antigravity",
      "model": "gemini-3-flash",
      "triggers": ["explain", "how.*work", "what.*does"]
    }
  }
}
The explainer uses cheap Gemini Flash—perfect for "what does this do?" questions that don't need heavy reasoning.
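Trigger matching can be sketched in a few lines. The trigger lists below are copied from the config above (system prompts omitted). The `pickSubAgent` name, the case-insensitive matching, and the "first match in declaration order wins" rule are assumptions for illustration.

```typescript
interface SubAgent {
  enabled: boolean;
  triggers: string[]; // regular expression sources
  thinkingLevel?: "low" | "medium" | "high";
  provider?: string;
  model?: string;
  systemPrompt?: string;
}

const subAgents: Record<string, SubAgent> = {
  reviewer: {
    enabled: true,
    thinkingLevel: "medium",
    triggers: ["review", "check.*code", "pr\\s+review"],
  },
  debugger: {
    enabled: true,
    thinkingLevel: "high",
    triggers: ["debug", "fix.*bug", "not\\s+working"],
  },
  explainer: {
    enabled: true,
    provider: "google-antigravity",
    model: "gemini-3-flash",
    triggers: ["explain", "how.*work", "what.*does"],
  },
};

// Assumption: the first enabled agent (in declaration order) with a matching
// trigger wins; returns undefined when nothing matches.
function pickSubAgent(
  prompt: string,
  agents: Record<string, SubAgent> = subAgents,
): string | undefined {
  for (const name of Object.keys(agents)) {
    const agent = agents[name];
    if (!agent.enabled) continue;
    const hit = agent.triggers.some((t) => new RegExp(t, "i").test(prompt));
    if (hit) return name;
  }
  return undefined;
}

console.log(pickSubAgent("The login is not working"));
```

Because the triggers overlap (a prompt can mention both "review" and "debug"), declaration order acts as a priority list in this sketch. A scoring scheme would be the next step if that turns out to be too crude.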
Four Orchestration Modes
The extension registers keyboard shortcuts for switching between the four modes, plus one that shows the current status:
| Shortcut | Mode | Behavior |
|---|---|---|
| Ctrl+Shift+1 | Direct | No orchestration, use default model |
| Ctrl+Shift+2 | Delegate | Route to sub-agents based on triggers |
| Ctrl+Shift+3 | Plan | Step-by-step execution with templates |
| Ctrl+Shift+4 | Auto | Smart routing based on complexity signals |
| Ctrl+Shift+0 | Status | Show current mode and routing |
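The four modes boil down to a single dispatch on the current mode. The sketch below is one way to express that. The `Decision` shape, the `Routers` interface, and the fallback to the default model in Delegate mode are hypothetical, and the stub routers stand in for the real trigger and complexity matching.

```typescript
type Mode = "direct" | "delegate" | "plan" | "auto";

// Hypothetical result type: what the orchestrator decided to do with a prompt.
type Decision =
  | { kind: "default-model" }
  | { kind: "sub-agent"; name: string }
  | { kind: "plan" }
  | { kind: "routed-model"; tier: string };

// The two lookups are injected so this block stays self-contained.
interface Routers {
  pickSubAgent: (prompt: string) => string | undefined;
  classify: (prompt: string) => string;
}

function decide(mode: Mode, prompt: string, routers: Routers): Decision {
  switch (mode) {
    case "direct":
      return { kind: "default-model" };
    case "delegate": {
      const name = routers.pickSubAgent(prompt);
      // Assumption: fall back to the default model when no trigger matches.
      return name ? { kind: "sub-agent", name } : { kind: "default-model" };
    }
    case "plan":
      return { kind: "plan" };
    case "auto":
      return { kind: "routed-model", tier: routers.classify(prompt) };
  }
}

// Stub routers for demonstration only.
const stubs: Routers = {
  pickSubAgent: (p) => (/review/i.test(p) ? "reviewer" : undefined),
  classify: (p) => (/race condition/i.test(p) ? "expert" : "medium"),
};

console.log(decide("delegate", "review this diff", stubs));
console.log(decide("auto", "there is a race condition in the queue", stubs));
```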
Skills as Prompt Injection
Skills are markdown files that get injected into an agent's system prompt. They need YAML front-matter:
---
name: handoff
description: Generate a structured handoff summary to continue work in a new session
---
# Handoff
Generate a structured handoff summary...
When the agent's task matches a skill's description, the content gets appended to the system prompt. The agent now "knows" that expertise.
The power is in composition—you can load multiple skills for a single task:
delegate_task({
  category: "visual-engineering",
  load_skills: ["frontend-ui-ux", "playwright"],
  prompt: "Build the signup form and test it",
});
// → model with design expertise AND browser testing knowledge
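Mechanically, skill injection is string concatenation. Here is a minimal sketch under a few assumptions: `parseSkill` and `composeSystemPrompt` are hypothetical names, the front-matter reader only handles flat `key: value` lines (a real implementation would use a proper YAML parser), and the `## Skill:` header is just one way to delimit sections.

```typescript
interface Skill {
  name: string;
  description: string;
  body: string;
}

// Minimal front-matter reader: expects `---`, flat `key: value` lines, `---`,
// then the markdown body. Not a general YAML parser.
function parseSkill(markdown: string): Skill {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(markdown);
  if (!match) throw new Error("Skill file is missing YAML front-matter");

  const meta: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not key: value
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  if (!meta.name || !meta.description) {
    throw new Error("Skill front-matter needs name and description");
  }
  return { name: meta.name, description: meta.description, body: match[2].trim() };
}

// Appends each skill body to the base system prompt under a labeled header.
function composeSystemPrompt(base: string, skills: Skill[]): string {
  const sections = skills.map((s) => `## Skill: ${s.name}\n${s.body}`);
  return [base, ...sections].join("\n\n");
}

const handoff = parseSkill(
  [
    "---",
    "name: handoff",
    "description: Generate a structured handoff summary to continue work in a new session",
    "---",
    "# Handoff",
    "Generate a structured handoff summary...",
  ].join("\n"),
);

console.log(composeSystemPrompt("You are a coding assistant.", [handoff]));
```

Loading several skills is just passing a longer array to `composeSystemPrompt`, which is why composition comes for free.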
Permission Gates
The permission-gates.ts extension blocks dangerous commands:
const DANGEROUS_PATTERNS = [
  /\brm\s+-rf?\b/,
  /\bgit\s+reset\s+--hard\b/,
  /\bgit\s+push\s+.*--force\b/,
  /\bDROP\s+(TABLE|DATABASE)\b/i,
];

pi.on("tool_call", async (event, ctx) => {
  if (isToolCallEventType("bash", event)) {
    const command = event.input.command;
    if (DANGEROUS_PATTERNS.some((p) => p.test(command))) {
      const ok = await ctx.ui.confirm(
        "Destructive command",
        `Allow: ${command}?`,
      );
      if (!ok) return { block: true, reason: "Blocked by user" };
    }
  }
});
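Since a gate like this is only as good as its patterns, it helps to pull the check into a pure function you can test without running pi. The sketch below reuses the four patterns from above. `isDangerous` is a hypothetical helper name, and the sample commands are my own.

```typescript
// Same patterns as in permission-gates.ts.
const DANGEROUS_PATTERNS: RegExp[] = [
  /\brm\s+-rf?\b/,
  /\bgit\s+reset\s+--hard\b/,
  /\bgit\s+push\s+.*--force\b/,
  /\bDROP\s+(TABLE|DATABASE)\b/i,
];

// Pure check: true when any pattern matches the command string.
function isDangerous(command: string): boolean {
  return DANGEROUS_PATTERNS.some((p) => p.test(command));
}

const samples = [
  "rm -rf build",
  "git reset --hard HEAD~1",
  "git push origin main --force",
  "drop table users",
  "ls -la",
  "rm -fr build",
  "git push -f origin main",
];

for (const cmd of samples) {
  console.log(isDangerous(cmd) ? "GATED  " : "allowed", cmd);
}
```

Running this shows two gaps: `rm -fr build` and `git push -f origin main` both pass as allowed, because the patterns only cover the `-r`/`-rf` and `--force` spellings. Treat the list as a starting point and extend it for the flag variants you actually use.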
The Result
Now when I start pi:
[Skills]
user
~/.pi/agent/skills/handoff/SKILL.md
~/.pi/agent/skills/code-review/SKILL.md
~/.pi/agent/skills/debug/SKILL.md
...
[Extensions]
user
~/.pi/agent/extensions/orchestrator.ts
~/.pi/agent/extensions/permission-gates.ts
Simple questions route to Gemini Flash. Complex architecture questions go to Opus. Destructive commands require confirmation. And I can switch modes with keyboard shortcuts.
The Meta-Lesson
The hardest part of this research wasn't understanding the code. It was accepting that agents are simpler than they look.
I expected complex state machines. I found configuration objects.
I expected sophisticated routing algorithms. I found lookup tables.
I expected magic. I found prompts.
The sophistication is in two places:
- Prompt engineering — what the agent knows and how it behaves
- Model routing — matching task complexity to model capability
Once you accept that, building your own orchestrator becomes tractable.
Add a new agent? Write a markdown prompt. Add a new category? Map a string to a model. Add a new skill? Write expertise as markdown. Route to the right model? Check a lookup table.
The code just wires it together.
Resources
- pi-coding-agent — The tool I configured
- oh-my-opencode — Real-world orchestrator that inspired this
The whole thing took a few hours of trial and error, mostly fighting with the keybindings format and finding the right model names. Worth it for the multi-model routing alone.
pi-coding-agent is by @badlogic. The orchestration pattern is inspired by how I think about delegation—cheap and fast for simple things, powerful and slow for hard things—and partly by oh-my-opencode's approach to agent orchestration and sub-agent delegation.