Author: Scott Senatore, Semantic Infrastructure Lab
Date: 2025-12-04
Type: Founder's Note / Blog Post
Status: Draft
The Static Prompt Problem
You're building an AI agent system. You want your agent to use tools effectively - run commands, call APIs, query databases. So you do the obvious thing: you write examples into the system prompt.
SYSTEM_PROMPT = """
You are an AI assistant with access to these tools:
reveal <file> - Show code structure
Example: reveal app.py
Example: reveal src/utils.py get_config
search <pattern> - Find files
Example: search "def main"
Example: search "*.py" | grep "import"
... (50 more tools with examples)
"""
This seems reasonable. Give the agent examples up front, and it will know how to use the tools.
But it's fundamentally broken.
Why Static Prompts Fail
Problem 1: Examples Go Stale
You ship reveal v0.13.0 with your examples. Three months later, reveal v0.15.0 adds:
- reveal help:// - Self-documenting system
- reveal --agent-help-full - Comprehensive workflows
- reveal --check - Code quality scanning
- reveal 'ast://src?complexity>10' - AST queries
Your agent still thinks it's using v0.13.0. It never discovers these features because they're not in the static prompt.
Problem 2: Prompt Bloat
You have 50 tools. Each tool has 5 example patterns. That's 250 examples in your system prompt.
At ~100 tokens per example, that's 25,000 tokens of examples loaded into every conversation before the user even says hello.
Cost: $25 per 1M tokens = $0.625 per conversation just for examples that might not even be used.
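A quick back-of-envelope check of those numbers (a sketch; the per-token price is the illustrative figure above and actual prices vary by model):

tools, examples_per_tool, tokens_per_example = 50, 5, 100
price_per_million_tokens = 25.00  # illustrative figure from above; varies by model
example_tokens = tools * examples_per_tool * tokens_per_example            # 25,000 tokens
cost_per_conversation = example_tokens / 1_000_000 * price_per_million_tokens
print(f"{example_tokens:,} tokens -> ${cost_per_conversation:.3f} per conversation")  # $0.625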
Problem 3: No Context Adaptation
User asks: "Find all complex functions in the codebase."
Your static examples show:
reveal app.py
reveal src/utils.py get_config
But the right pattern for this query is:
reveal 'ast://src?complexity>10' --format=json
This isn't in your examples, because you wrote them for basic usage, not advanced queries. Now you need to add MORE examples, which makes Problem 2 worse.
The Insight: Dynamic Documentation = Multi-Shot Learning
Here's the key realization: What if the agent could request examples on-demand?
Instead of:
[System Prompt with 25K tokens of examples]
User: "Find complex functions"
Agent: [tries to match static examples]
Do this:
[Minimal system prompt: "Request --agent-help before using tools"]
User: "Find complex functions"
Agent: reveal --agent-help-full
[Gets fresh, comprehensive examples]
Agent: [uses correct pattern from latest docs]
This is multi-shot learning.
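As a sketch, that minimal system prompt might look something like this (the exact wording is an illustrative assumption, not a fixed spec):

# A minimal system prompt: teach the help-request habit instead of embedding examples.
# (Illustrative wording -- adapt to your own agent framework.)
SYSTEM_PROMPT = """
You are an AI assistant with access to command-line tools.
Before using any tool for the first time in a session, run `<tool> --agent-help`
(or `<tool> --agent-help-full` for workflows) and follow the examples it returns.
Remember what you learn; do not request help for the same tool twice in one session.
"""

The point is the shape: a few hundred tokens of instruction instead of tens of thousands of tokens of examples.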
The ML Parallel
In machine learning:
- Zero-shot: No examples, just task description
- One-shot: Single example provided
- Few-shot: Handful of examples (2-5 typically)
For AI agents:
- Static prompt: Fixed examples, loaded once
- Dynamic help: Request examples when needed
- Multi-shot: Unlimited fresh examples on-demand
Dynamic documentation is the agent equivalent of multi-shot learning.
The Pattern: --agent-help as a Standard
What Makes Good Agent Help?
SIL's --agent-help specification:
- Purpose - What does this tool do? (1-2 sentences)
- Syntax - How do you invoke it? (basic form)
- Examples - Real usage patterns (REQUIRED - not optional!)
- Workflows - Common task compositions
- Pro Tips - Advanced usage, gotchas, when to use what
Example from reveal:
$ reveal --agent-help-full
## Core Purpose
Token-efficient code exploration. See structure before reading entire files.
Reduces token usage 10-150x for code analysis tasks.
## Basic Usage
reveal <file> # Structure overview (50 tokens vs 7500)
reveal <file> <function> # Extract specific element
reveal <file> --check # Code quality scan
## Advanced Examples
### Progressive Disclosure
reveal app.py --head 10 # First 10 elements (unknown file)
reveal app.py --range 20-30 # Elements 20-30 (large file)
reveal app.py --outline # Hierarchical view
### Code Quality Queries
reveal src/ --check --select E,W # Errors + warnings only
reveal 'ast://src?complexity>10' # Find complex functions
reveal 'ast://app.py?lines>50' # Long functions
### Pipeline Composition
git diff --name-only | reveal --stdin
find src/ -name "*.py" | reveal --stdin --check
reveal 'ast://src/' --format=json | jq '.results[] | .name'
## Workflows
### Unknown Codebase Exploration
1. reveal src/ --head 5 # Get initial structure
2. reveal 'ast://src?complexity>10' # Find complex areas
3. reveal src/core.py main # Extract key function
4. reveal src/core.py --check # Quality check
### Refactoring Candidates
1. reveal 'ast://src?lines>100' # Long functions
2. reveal 'ast://src?complexity>8' # Complex functions
3. Intersect results → prioritize refactoring
## Pro Tips
- Use --head/--range for large files (token efficient)
- --format=json enables pipeline composition
- --check integrates 24 quality rules (flake8 subset)
- ast:// queries support >, <, >=, <=, == operators
- Multiple filters combine with & (AND logic)
## Related Commands
reveal help:// # List all help topics
reveal help://ast # AST query deep dive
reveal --list-supported # Supported file types
This is what the agent sees. Fresh, comprehensive, with real examples.
Real-World Evidence: This Works
Case Study 1: TIA Command Discovery
Before --help discipline:
- Agent tried wrong commands: tia session find (doesn't exist)
- Guessed wrong syntax: reveal ast://src?complex>10 (should be complexity>10)
- Missed features: tia project show <name> (never discovered)
- Token waste: Trial and error across multiple attempts
After --help requirement:
- Agent checks: tia session --help → sees search subcommand
- Agent reads: reveal help://ast → learns correct operators
- Agent discovers: tia project --help → finds show command
- Token efficient: Gets it right first time
Measured impact: 20-40% token reduction in command-heavy sessions.
Case Study 2: Reveal Evolution (v0.13 → v0.15)
Static prompt approach:
# Agent's knowledge (frozen at v0.13)
reveal app.py # Only knows basic usage
reveal app.py function # Only knows extraction
Dynamic help approach:
# Agent requests fresh docs (v0.15)
$ reveal --agent-help-full
# Discovers NEW features (added in v0.14-v0.15):
reveal help:// # Self-documenting (v0.15)
reveal --agent-help-full # This command! (v0.15)
reveal 'ast://src?complexity>10' # AST queries (v0.15)
reveal --check # Quality scans (v0.14)
reveal --stdin # Pipeline mode (v0.14)
The agent automatically learns new features as tools evolve.
Case Study 3: Scout Research Agent
Scout uses Groqqy (agent framework) with 20+ tool functions. Each tool has complex usage patterns.
Approach: Every tool provides --agent-help equivalent (structured docstrings with examples).
Pattern:
@tool
def reveal_structure(path: str) -> str:
    """
    Token-efficient code structure exploration.

    Examples:
        reveal_structure("src/app.py")
        reveal_structure("src/")

    Advanced:
        Use for: Unknown files, large files, token budget constraints
        Avoid: When you need full implementation details

    Returns: Structure overview (~50 tokens vs ~7500 for full file)
    """
    # Implementation...
Result: Scout's multi-phase research orchestrator completes complex analysis with 75% automation. When a phase fails, reading tool help reveals correct usage patterns.
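Scout's internals aren't reproduced here, but the pattern generalizes: register tools and serve their docstrings as on-demand help. A minimal, framework-agnostic sketch (the registry and function names are assumptions, not Groqqy's API):

import inspect

AGENT_TOOLS = {}  # hypothetical registry: tool name -> callable

def tool(fn):
    """Register a function as an agent tool (stand-in for the framework's decorator)."""
    AGENT_TOOLS[fn.__name__] = fn
    return fn

def agent_help(tool_name: str) -> str:
    """Return the tool's docstring as its --agent-help equivalent."""
    fn = AGENT_TOOLS.get(tool_name)
    if fn is None:
        return f"Unknown tool: {tool_name}. Known tools: {', '.join(sorted(AGENT_TOOLS))}"
    return inspect.getdoc(fn) or "No help available for this tool."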
Why This Matters for Agent Systems
1. Tools Evolve Faster Than Prompts
Software reality: Tools ship updates weekly (bug fixes, features, breaking changes).
Prompt reality: System prompts update quarterly (manual human process).
Gap: Your agent is always operating on stale information unless it can request fresh docs.
2. Prompt Token Budgets Are Precious
Current LLM economics:
- Input: $3-15 per 1M tokens (depending on model)
- Output: $15-75 per 1M tokens
Static approach: 25K tokens of examples in every prompt (regardless of which tools are used).
Dynamic approach: ~200 tokens to request help, ~2K tokens for relevant help.
Savings: 90%+ reduction when only 1-2 tools are used per session.
3. Context-Adaptive Learning
Static examples can't predict use cases.
Example: You ship reveal with basic examples. User wants to:
- Find all functions with cyclomatic complexity > 10
- Filter by lines of code
- Output as JSON for pipeline processing
- Compose the output with jq
Your static examples don't cover this. But reveal help://ast does, because it's comprehensive documentation designed for discovery.
Agent help enables exploration - not just execution.
4. Self-Documenting Systems Scale
As your tool ecosystem grows:
- 5 tools × 5 examples = 25 examples (manageable in prompt)
- 50 tools × 5 examples = 250 examples (prompt bloat)
- 500 tools × 5 examples = 2500 examples (impossible)
Static prompts don't scale. Dynamic help does.
The Agent Help Standard (SIL Spec)
Requirements for --agent-help
All SIL-compliant tools MUST provide:
<tool> --agent-help # Agent-optimized quick reference
<tool> --agent-help-full # Comprehensive guide with workflows
Content requirements:
1. Purpose statement (what/why in 1-2 sentences)
2. Basic syntax (minimal invocation pattern)
3. Real examples (3-5 common use cases) ← REQUIRED
4. Advanced examples (2-3 power-user patterns)
5. Workflow examples (task-oriented compositions)
6. Pro tips (gotchas, when to use, when not to use)
7. Related commands (what to use next)
Why examples are REQUIRED:
- Syntax alone is ambiguous: tool <path> [options] (what's valid?)
- Examples disambiguate: tool src/ --recursive --format=json
- Workflows show composition: git diff | tool --stdin | jq
Implementation Patterns
Command-line tools (Bash/Python):
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('--agent-help', action='store_true',
                    help='Show agent-optimized help')
parser.add_argument('--agent-help-full', action='store_true',
                    help='Show comprehensive agent guide')
args = parser.parse_args()

if args.agent_help:
    print(load_agent_help_quick())   # see the sketch below for one way to load this
    sys.exit(0)
if args.agent_help_full:
    print(load_agent_help_full())
    sys.exit(0)
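The load_agent_help_* functions above are left undefined; one minimal approach, assuming the help text ships as markdown files inside the package (the mytool package and file names are assumptions), is:

from importlib.resources import files

def load_agent_help_quick() -> str:
    # Assumes mytool/help/agent_help.md is packaged with the tool.
    return files("mytool.help").joinpath("agent_help.md").read_text(encoding="utf-8")

def load_agent_help_full() -> str:
    # Assumes mytool/help/agent_help_full.md is packaged with the tool.
    return files("mytool.help").joinpath("agent_help_full.md").read_text(encoding="utf-8")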
Self-documenting systems (reveal's approach):
# Help as a first-class URI scheme
tool help:// # List all topics
tool help://topic # Specific topic deep-dive
tool help://adapters # Category overview
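reveal's actual help:// handler isn't reproduced here; a minimal sketch of how a tool might route such URIs (the topic registry and its content are placeholders):

# Hypothetical sketch of a help:// router; not reveal's actual implementation.
HELP_TOPICS = {
    "ast": "AST query deep dive: operators >, <, >=, <=, == ...",
    "adapters": "Category overview of supported adapters ...",
}

def handle_help_uri(uri: str) -> str:
    topic = uri.removeprefix("help://").strip("/")
    if not topic:
        return "Available help topics:\n" + "\n".join(f"  help://{t}" for t in sorted(HELP_TOPICS))
    return HELP_TOPICS.get(topic, f"Unknown topic '{topic}'. Try 'help://' to list topics.")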
Structured tool definitions (Groqqy/agent frameworks):
@tool(
    name="reveal_structure",
    description="Token-efficient code exploration",
    agent_help="""
    Purpose: See code structure before reading full file
    Examples:
        reveal_structure("app.py")   # Basic usage
        reveal_structure("src/")     # Directory scan
    Use when: Unknown file, large file, token budget tight
    Avoid when: Need full implementation details
    """
)
def reveal_structure(path: str) -> str:
    # Implementation
    ...
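Groqqy's real decorator isn't shown above; as a sketch, a framework could attach the agent_help string as metadata that the runtime serves whenever the agent asks for help (the attribute names here are assumptions):

# Hypothetical sketch of the framework side: attach agent_help as tool metadata.
def tool(name: str, description: str, agent_help: str = ""):
    def decorate(fn):
        fn.tool_name = name
        fn.description = description
        fn.agent_help = agent_help.strip()  # served when the agent requests help for this tool
        return fn
    return decorate

An agent runtime can then answer a help request for reveal_structure by returning reveal_structure.agent_help, no static prompt examples required.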
Adoption Checklist
If you're building agent systems, adopt this pattern:
For Tool Developers
- [ ] Add --agent-help flag to all tools
- [ ] Include 5+ real examples (not just syntax)
- [ ] Add workflow examples (composition patterns)
- [ ] Document pro tips (gotchas, edge cases)
- [ ] Keep help fresh (update with features)
For Agent Developers
- [ ] Update system prompt: "Request --agent-help before using unfamiliar tools"
- [ ] Remove static examples (or minimize to core 3-5 tools)
- [ ] Measure token savings (compare before/after)
- [ ] Track help request patterns (which tools need better docs?)
- [ ] Iterate on prompt clarity ("always check help" vs "read docs first")
For LLM Providers
- [ ] Add --agent-help to model documentation standards
- [ ] Provide tool developers with template/spec
- [ ] Measure help request rates (good metric for agentic usage)
- [ ] Optimize tokenization for help output (structured content)
Measuring Success
How do you know this is working?
Quantitative Metrics
Token efficiency:
Token_savings = (Static_prompt_tokens - Dynamic_help_tokens) / Static_prompt_tokens
Example:
Static: 25,000 tokens (50 tools × 500 tokens each)
Dynamic: 2,500 tokens (5 help requests × 500 tokens)
Savings: 90%
Success rate:
Tool_usage_success = Correct_invocations / Total_invocations
Before --agent-help: 65% (lots of trial-and-error)
After --agent-help: 92% (gets it right first time)
Feature discovery:
Feature_utilization = Advanced_features_used / Advanced_features_available
Static prompt: 20% (only features with examples)
Dynamic help: 75% (discovers through exploration)
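All three metrics fall out of simple session-log counts; a sketch using the illustrative numbers above:

def ratio(part: float, whole: float) -> float:
    return part / whole if whole else 0.0

# Illustrative numbers from the text.
token_savings = ratio(25_000 - 2_500, 25_000)   # (Static - Dynamic) / Static = 0.90
success_rate = ratio(92, 100)                   # Correct_invocations / Total_invocations
feature_use = ratio(75, 100)                    # Advanced_features_used / Advanced_features_available
print(f"Token savings: {token_savings:.0%}, success: {success_rate:.0%}, feature use: {feature_use:.0%}")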
Qualitative Indicators
Good signs:
- Agent requests help before first use ✅
- Agent discovers advanced features (not just basic examples) ✅
- Agent composes tools in novel ways (learns from workflow examples) ✅
- Tool updates automatically propagate to agent behavior ✅
Bad signs:
- Agent tries commands without checking help ❌
- Agent guesses syntax and fails ❌
- Agent never discovers advanced features ❌
- Agent behavior doesn't change when tools update ❌
Common Objections (And Rebuttals)
"But calling --help adds latency!"
Reality check:
- Help request: ~100ms (local command)
- LLM round-trip: 500-2000ms (network + generation)
- Trial-and-error (no help): 3-5 round-trips = 1.5-10 seconds
Math: 100ms upfront << 1.5-10 seconds of guessing wrong.
Also: Cache help output (tools don't change mid-session).
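Caching that output is straightforward; a sketch of a per-session cache around a subprocess call (this assumes your tools expose the flags described here):

import subprocess
from functools import lru_cache

@lru_cache(maxsize=None)  # one help fetch per (tool, flag) per session
def get_agent_help(tool: str, flag: str = "--agent-help") -> str:
    result = subprocess.run([tool, flag], capture_output=True, text=True, timeout=10)
    return result.stdout if result.returncode == 0 else result.stderr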
"My static examples are really good!"
That's great! But:
- How often do you update them? (Tools change weekly, prompts change quarterly)
- How comprehensive are they? (Can't cover every use case in 5 examples)
- What's the token cost? (25K static vs 2K dynamic)
- What happens when you add Tool #51? (Prompt bloat)
Static examples are great for the 3-5 most critical tools. Everything else should be dynamic.
"Agents should just figure it out"
This is like saying: "Developers should just figure out APIs without documentation."
Would you use a library with no docs? No examples? No API reference?
Tools without help = unusable for agents.
"Won't agents abuse help requests?"
Possible, but unlikely if prompt is clear:
- "Request --agent-help BEFORE FIRST USE of a tool"
- "Cache help output - don't request again in same session"
- "Only request help if unfamiliar or syntax unclear"
In practice: Agents are conservative (token-conscious). They request help once per tool, then cache it.
The Future: Self-Documenting Everything
Imagine a world where:
Every CLI tool has --agent-help:
git --agent-help
docker --agent-help
kubectl --agent-help
npm --agent-help
Every API has agent-friendly docs:
curl api.example.com/agent-docs
Every LLM tool has structured help:
@tool(agent_help="...")
def my_function():
    pass
Agents would:
- Discover tools through exploration (not static lists)
- Learn usage patterns on-demand (not preloaded examples)
- Adapt to tool updates automatically (fresh docs every time)
- Compose tools creatively (workflow examples inspire novel combinations)
This is the vision: Self-documenting infrastructure where agents learn through exploration, not memorization.
Call to Action
If you're building tools for agents:
1. Add --agent-help to your tools TODAY
2. Include real examples (not just syntax)
3. Keep it fresh (update with every release)
If you're building agent systems:
1. Update your system prompt: "Request help before using tools"
2. Remove static examples (or minimize to core tools)
3. Measure token savings and success rates
If you're an LLM researcher:
1. Study the dynamic help pattern (this is multi-shot learning for agents)
2. Build benchmarks comparing static vs dynamic documentation
3. Contribute to agent help standards
Conclusion: Knowledge On-Demand
The insight:
Static prompts are one-shot learning. Dynamic documentation is multi-shot learning.
The pattern:
Teach agents to request --agent-help before using tools.
The evidence:
20-90% token savings, higher success rates, automatic feature discovery.
The future:
Self-documenting infrastructure where agents learn through exploration.
This changes everything.
Static prompts were the right solution in 2022 when we had 4K context windows and no tool calling.
In 2025, with 200K+ context windows and native tool support, dynamic documentation is obviously better.
It's time to move from one-shot to multi-shot agent learning.
Author: Scott Senatore
Organization: Semantic Infrastructure Lab
Contact: [TBD]
License: CC BY 4.0
Related Work:
- Semantic Feedback Loops (SIL canonical doc)
- Multi-Agent Protocol Principles (SIL canonical doc)
- Reveal --agent-help implementation (reference implementation)
- TIA command help system (production deployment)
Appendix: Template for Agent Help
Use this template for your tools:
## <Tool Name> - Agent Help
### Purpose
[1-2 sentence description of what this tool does and why it exists]
### Basic Usage
<tool> <required_args> [optional_flags]
### Examples
#### Common Use Cases
<tool> example1 # Description
<tool> example2 --flag # Description
<tool> example3 input.txt # Description
#### Advanced Patterns
<tool> complex_example --advanced --flags=value
<tool> pipeline | another_tool | third_tool
#### Error Handling
<tool> --validate input # Check before processing
<tool> --dry-run # Preview without executing
### Workflows
#### Task: [Common Task Name]
1. <tool> step1
2. <tool> step2
3. <tool> step3
Result: [What you achieve]
#### Task: [Another Common Task]
1. <tool> different_approach
2. <tool> next_step
Result: [What you achieve]
### Pro Tips
- Use FLAG when CONDITION (saves time/tokens/complexity)
- Avoid PATTERN in SITUATION (common mistake)
- Combine with TOOL for BENEFIT (composition pattern)
- Check OUTPUT for SIGNAL (debugging tip)
### Related Commands
<related_tool1> - [When to use instead]
<related_tool2> - [When to use after]
<related_tool3> - [When to use with]
### Version
This help is for <tool> v<version>
Updated: <date>
Fill in the template. Ship with your tool. Change the game.