CLI Delegation Guide

Tier 2 | Practical guide for delegating tasks to external CLIs
Hub: README.md | Routing Architecture: ROUTING_SYSTEM.md


Overview

This guide covers when and how to delegate tasks to the Claude, Gemini, Codex, or OpenCode CLIs. The recommendations are based on real testing with measured performance metrics.


Decision Flowchart

Task arrives

    ├─ Large codebase analysis (>500 files)?
    │       └─ YES → Gemini (1M context)

    ├─ Code generation / boilerplate?
    │       └─ YES → Codex (3-6s latency)

    ├─ Complex reasoning / architecture?
    │       └─ YES → Claude (highest quality)

    ├─ Custom model / OpenAI-compatible endpoint?
    │       └─ YES → OpenCode (gateway routing)

    └─ Simple query / quick task?
            └─ Gemini (cost-effective)
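The flowchart can be read as an ordered list of checks where the first match wins. A minimal sketch of that reading follows; the `TaskHints` fields and the `chooseCli` name are hypothetical and are not part of nexus-agents.

```typescript
type CliName = 'gemini' | 'codex' | 'claude' | 'opencode';

// Hypothetical task descriptor; field names are illustrative only.
interface TaskHints {
  fileCount?: number;            // number of files the task must read
  isCodeGeneration?: boolean;    // boilerplate or straightforward implementation
  needsDeepReasoning?: boolean;  // architecture or complex logic
  needsCustomEndpoint?: boolean; // custom model or OpenAI-compatible gateway
}

// Checks run in the same order as the flowchart; the first match wins.
function chooseCli(task: TaskHints): CliName {
  if ((task.fileCount ?? 0) > 500) return 'gemini';
  if (task.isCodeGeneration) return 'codex';
  if (task.needsDeepReasoning) return 'claude';
  if (task.needsCustomEndpoint) return 'opencode';
  return 'gemini'; // simple query or quick task
}
```

Because the checks are ordered, a code generation task over a very large repository would still go to Gemini under this sketch.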

CLI Profiles

Gemini CLI

Strengths: Massive context window, fast codebase analysis, cost-effective.

| Metric | Measured Value | Notes |
|---|---|---|
| Latency | 10-73s | Varies with context size |
| Max Context | 1M tokens | ~978 files in single query |
| Best For | Analysis | Code review, exploration |
| Cost | Low | Free tier available |

Invocation:

# Non-interactive prompt mode
gemini -p "Analyze this codebase for security issues"

# With file context
gemini -p "Review these files: $(cat file_list.txt)"

JSON Output Parsing:

Gemini may wrap its JSON output in a markdown code fence. Strip the fence before parsing:

function parseGeminiOutput(output: string): unknown {
  // Strip markdown code fences if present
  const jsonMatch = output.match(/```(?:json)?\s*([\s\S]*?)```/);
  const jsonStr = jsonMatch ? jsonMatch[1].trim() : output.trim();
  return JSON.parse(jsonStr);
}
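parseGeminiOutput throws when the text is not valid JSON, which can happen if the model adds prose around the payload. One possible variant, sketched below, returns a result object so the caller can decide whether to retry. The `ParseResult` type and `tryParseGeminiOutput` name are assumptions made for this example.

```typescript
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Three backticks, built at runtime so this snippet can itself sit inside a fence.
const FENCE = '`'.repeat(3);
const FENCE_PATTERN = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

// Same fence-stripping rule as parseGeminiOutput, but reports failure instead of throwing.
function tryParseGeminiOutput(output: string): ParseResult {
  const fenced = output.match(FENCE_PATTERN);
  const jsonStr = (fenced?.[1] ?? output).trim();
  try {
    return { ok: true, value: JSON.parse(jsonStr) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

A caller would check `result.ok` before reading `result.value`, and could log `result.error` alongside the raw output when parsing fails.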

Codex CLI

Strengths: Fast code generation, optimized for implementation tasks.

| Metric | Measured Value | Notes |
|---|---|---|
| Latency | 3-6s | Consistently fast |
| Max Context | 128K tokens | Sufficient for most tasks |
| Best For | Generation | Boilerplate, implementations |
| Cost | Medium | Per-token pricing |

Invocation:

# Non-interactive execution mode (required for automation)
codex exec "Implement a rate limiter class in TypeScript"

# With specific output format
codex exec "Generate Jest tests for auth.ts" --format code

Important: Always use codex exec for non-interactive mode. The default codex command expects interactive input.

Claude CLI

Strengths: Highest reasoning quality, best for complex decisions.

| Metric | Measured Value | Notes |
|---|---|---|
| Latency | 5-30s | Depends on task complexity |
| Max Context | 200K tokens | Sufficient for most tasks |
| Best For | Reasoning | Architecture, complex logic |
| Cost | Higher | Premium for quality |

Invocation:

# Non-interactive print mode (use for automation)
claude -p "Design the authentication flow for this system"

# With file context piped on stdin
cat changes.diff | claude -p "Review this PR"

OpenCode CLI

Strengths: OpenAI-compatible gateway, custom model routing, MCP integration.

| Metric | Measured Value | Notes |
|---|---|---|
| Latency | 5-30s | Depends on provider/model |
| Max Context | Model-dependent | Uses configured provider's limits |
| Best For | Custom models | OpenAI-compatible endpoints |
| Cost | Varies | Depends on configured provider |

Invocation:

# Version check
opencode --version

# Interactive mode (default)
opencode

Configuration: OpenCode uses opencode.json for MCP server and provider configuration:

{
  "mcp": {
    "nexus-agents": {
      "type": "local",
      "command": ["node", "dist/cli.js", "--mode=server"]
    }
  }
}

OpenCode supports custom OpenAI-compatible endpoints via its provider configuration, enabling routing through any API gateway. See CUSTOM_ENDPOINT_SETUP.md for details.


Task-CLI Matching Matrix

| Task Type | Primary CLI | Fallback | Rationale |
|---|---|---|---|
| Codebase exploration | Gemini | Claude | 1M context handles large repos |
| Security audit | Claude | Gemini | Reasoning quality matters |
| Boilerplate generation | Codex | Claude | Speed + code focus |
| Architecture decisions | Claude | - | No substitute for reasoning |
| Test generation | Codex | Claude | Pattern matching sufficient |
| Documentation writing | Claude | Gemini | Quality prose needed |
| Quick questions | Gemini | Codex | Cost-effective |
| Refactoring suggestions | Claude | Gemini | Context understanding needed |
| Custom model routing | OpenCode | Claude | OpenAI-compatible endpoints |
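For manual routing, the matrix can be kept as data so that the fallback is applied consistently. The sketch below assumes hypothetical task-type keys and a `candidatesFor` helper; neither comes from nexus-agents.

```typescript
type CliName = 'gemini' | 'codex' | 'claude' | 'opencode';

interface CliChoice {
  primary: CliName;
  fallback?: CliName; // omitted where the matrix lists no fallback
}

// Rows copied from the matrix; the keys are illustrative identifiers.
const TASK_MATRIX: Record<string, CliChoice> = {
  'codebase-exploration': { primary: 'gemini', fallback: 'claude' },
  'security-audit': { primary: 'claude', fallback: 'gemini' },
  'boilerplate-generation': { primary: 'codex', fallback: 'claude' },
  'architecture-decisions': { primary: 'claude' },
  'test-generation': { primary: 'codex', fallback: 'claude' },
  'documentation-writing': { primary: 'claude', fallback: 'gemini' },
  'quick-questions': { primary: 'gemini', fallback: 'codex' },
  'refactoring-suggestions': { primary: 'claude', fallback: 'gemini' },
  'custom-model-routing': { primary: 'opencode', fallback: 'claude' },
};

// Returns the CLIs to try, in order, skipping any that are not available.
function candidatesFor(taskType: string, available: ReadonlySet<CliName>): CliName[] {
  const row = TASK_MATRIX[taskType];
  if (!row) return [];
  const ordered = row.fallback ? [row.primary, row.fallback] : [row.primary];
  return ordered.filter((cli) => available.has(cli));
}
```

An empty result means either the task type is unknown or none of the listed CLIs is installed, so the caller should treat it as a routing failure rather than picking an arbitrary CLI.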

Integration with nexus-agents

Using the CompositeRouter

The routing system automatically selects the optimal CLI:

import { createCompositeRouter } from 'nexus-agents';

const router = createCompositeRouter({
  enableBudgetFilter: true,
  enableTopsisRanking: true,
  enableLinUCBSelection: true,
});

const decision = await router.route({
  description: 'Analyze codebase for performance issues',
  contextTokens: 50000,
  constraints: { maxLatencyMs: 30000 },
});

// decision.cliName: 'gemini' | 'claude' | 'codex'
// decision.confidence: 0.0-1.0
// decision.reason: "Selected gemini: large context (50k tokens) favors 1M window"

Manual Delegation via Subprocess

For direct CLI invocation:

import { execFileSync } from 'child_process';

// Passing the prompt as a separate argument bypasses the shell, so quotes,
// `$`, and backticks in the prompt need no escaping and cannot inject commands.
function delegateToGemini(prompt: string): string {
  return execFileSync('gemini', ['-p', prompt], {
    encoding: 'utf-8',
    timeout: 120000, // 2 minute timeout
  });
}

function delegateToCodex(prompt: string): string {
  return execFileSync('codex', ['exec', prompt], {
    encoding: 'utf-8',
    timeout: 30000, // 30 second timeout
  });
}
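The two helpers differ only in command, arguments, and timeout, so those can be described in one place. The following is a rough sketch, assuming a hypothetical `Invocation` shape and `buildInvocation` helper. The prompt is kept as its own argument so that it can be handed to `execFileSync(command, args, { timeout })` without any shell quoting.

```typescript
interface Invocation {
  command: string;
  args: string[];    // the prompt is one element, never spliced into a shell string
  timeoutMs: number;
}

// Covers only the two CLIs with non-interactive forms shown in this section.
function buildInvocation(cli: 'gemini' | 'codex', prompt: string): Invocation {
  switch (cli) {
    case 'gemini':
      return { command: 'gemini', args: ['-p', prompt], timeoutMs: 120000 };
    case 'codex':
      return { command: 'codex', args: ['exec', prompt], timeoutMs: 30000 };
  }
}
```

Since the prompt is never interpreted by a shell, characters such as `"`, `$`, and backticks reach the CLI unchanged.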

Routing Audit

Debug routing decisions before execution:

nexus-agents routing-audit "Implement sorting algorithm" --format=json

Output shows:

  • Task profile (complexity, context requirements)
  • Budget filter results
  • TOPSIS scores per CLI
  • Final selection with reasoning

Troubleshooting

Gemini Issues

Problem: Output contains markdown instead of raw JSON.

Solution: Parse with fence stripping (see JSON Output Parsing above).

Problem: Timeout on large context.

Solution: Increase timeout to 120s. For very large contexts (>500K tokens), consider chunking.
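One way to chunk is to pack files greedily under a token budget. The sketch below is illustrative only: the `SourceFile` shape and the 4-characters-per-token estimate are assumptions, and a real tokenizer would give different counts.

```typescript
interface SourceFile {
  path: string;
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily packs files, in order, into chunks whose estimated size stays
// within maxTokens. A single file above the budget gets a chunk of its own.
function chunkFiles(files: SourceFile[], maxTokens: number): SourceFile[][] {
  const chunks: SourceFile[][] = [];
  let current: SourceFile[] = [];
  let currentTokens = 0;
  for (const file of files) {
    const tokens = estimateTokens(file.content);
    if (current.length > 0 && currentTokens + tokens > maxTokens) {
      chunks.push(current);
      current = [];
      currentTokens = 0;
    }
    current.push(file);
    currentTokens += tokens;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk can then be sent as a separate query, with the per-chunk results merged afterwards.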

Codex Issues

Problem: Command hangs waiting for input.

Solution: Use codex exec instead of codex. The exec subcommand runs non-interactively.

Problem: Truncated output.

Solution: Check --max-tokens flag. Default may be low for large generations.

Claude Issues

Problem: Rate limited.

Solution: Check ANTHROPIC_API_KEY tier. Implement exponential backoff:

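async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      // Out of attempts: surface the last error to the caller.
      if (i === maxRetries - 1) throw error;
      // Wait 1s, 2s, 4s, ... before the next attempt.
      await new Promise((r) => setTimeout(r, 1000 * Math.pow(2, i)));
    }
  }
  // Only reached when maxRetries < 1, because the loop body never runs.
  throw new Error('maxRetries must be at least 1');
}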
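With the default maxRetries of 3, withRetry sleeps 1000 ms after the first failure and 2000 ms after the second, then rethrows on the third. The helpers below sketch the same schedule with two additions that withRetry does not have: an upper cap and random jitter, which can help when many callers are rate limited at once. The function names and default values are assumptions for illustration.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Jittered variant: a random delay in [0, backoffDelayMs(attempt)).
// `random` is injectable so the result can be made deterministic in tests.
function jitteredDelayMs(attempt: number, random: () => number = Math.random): number {
  return Math.floor(random() * backoffDelayMs(attempt));
}
```

To use it, the `1000 * Math.pow(2, i)` expression inside withRetry could be swapped for `jitteredDelayMs(i)`.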

General Issues

Problem: CLI not found.

Solution: Verify installation:

which gemini && gemini --version
which codex && codex --version
which claude && claude --version
which opencode && opencode --version

Problem: Wrong CLI selected.

Solution: Run routing-audit to see why that CLI was selected, then override the choice with explicit constraints such as a latency limit:

router.route({
  description: 'task',
  constraints: { maxLatencyMs: 5000 }, // Forces fast CLI
});

Best Practices

  1. Prefer automatic routing - Let CompositeRouter decide unless you have specific requirements.

  2. Set appropriate timeouts - Gemini: 120s, Codex: 30s, Claude: 60s.

  3. Handle JSON parsing - Always strip potential markdown fencing from Gemini output.

  4. Use exec mode - Codex requires exec subcommand for non-interactive use.

  5. Monitor costs - Enable budget filtering to prevent runaway spending.

  6. Log routing decisions - Record decision.reason for debugging.
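Practices 2 and 6 can be combined into a small logging helper. This is a sketch under assumptions: the `RoutingDecision` interface mirrors only the fields shown in the CompositeRouter example, and `formatDecisionLog` is a hypothetical name, not a nexus-agents export.

```typescript
type CliName = 'gemini' | 'codex' | 'claude';

// Timeouts from practice 2, in milliseconds.
const CLI_TIMEOUT_MS: Record<CliName, number> = {
  gemini: 120000,
  codex: 30000,
  claude: 60000,
};

// Assumed shape, based on the fields shown in the CompositeRouter example.
interface RoutingDecision {
  cliName: CliName;
  confidence: number;
  reason: string;
}

// Produces one JSON line per decision, suitable for appending to a log file.
function formatDecisionLog(decision: RoutingDecision, now: Date = new Date()): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    cli: decision.cliName,
    confidence: decision.confidence,
    reason: decision.reason,
    timeoutMs: CLI_TIMEOUT_MS[decision.cliName],
  });
}
```

Recording the timeout next to the reason makes it easier to tell a routing mistake from a call that simply ran out of time.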