What is Claude Code Mode?
Traditional Tool Calling Agents:
Input → LLM → loop(Tool Call → Execute Tool → Result → LLM) → output
Claude Code Mode Agent:
Input → loop(Claude Codes w/ access to your tools) → Execute Code → output
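The difference between the two flows can be sketched with a toy model. Everything here is hypothetical: CountingModel stands in for an LLM and only counts how often it is consulted, word_length is a made-up tool, and the real Code Mode loop (where Claude iterates on its own code) is collapsed into a single pass for brevity.

```python
from typing import Callable, List, Optional

Tool = Callable[[str], int]


def word_length(word: str) -> int:
    """Hypothetical tool: return the length of a word."""
    return len(word)


class CountingModel:
    """Stand-in for an LLM that only counts how often it is consulted."""

    def __init__(self) -> None:
        self.calls = 0

    def pick_next(self, remaining: List[str]) -> Optional[str]:
        """Traditional mode: one consultation per tool-call decision."""
        self.calls += 1
        return remaining[0] if remaining else None

    def write_program(self, words: List[str]) -> Callable[[Tool], List[int]]:
        """Code mode: one consultation that returns a whole program."""
        self.calls += 1
        return lambda tool: [tool(w) for w in words]


def run_traditional(model: CountingModel, words: List[str], tool: Tool) -> List[int]:
    results, remaining = [], list(words)
    # Every iteration goes back to the model before the next tool runs.
    while (word := model.pick_next(remaining)) is not None:
        results.append(tool(word))
        remaining.pop(0)
    return results


def run_code_mode(model: CountingModel, words: List[str], tool: Tool) -> List[int]:
    program = model.write_program(words)  # the model writes code once
    return program(tool)                  # plain execution, no model in the loop


words = ["beam", "column", "truss"]
traditional, code_mode = CountingModel(), CountingModel()
out_traditional = run_traditional(traditional, words, word_length)
out_code_mode = run_code_mode(code_mode, words, word_length)
print(out_traditional == out_code_mode, traditional.calls, code_mode.calls)
```

Both runs produce the same results, but the traditional loop consults the model once per tool call plus once more to stop, while the code mode run consults it a single time.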
Why it's better
Each time tool output is passed back to the LLM and new output is generated, there is some chance of a mistake. The more iterations n your agent performs, the bigger Code Mode's advantage over traditional iterative tool-calling agents.
LLMs are better at writing code, then verifying and running it, than they are at calling tools one at a time to produce output.
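To make the compounding concrete, here is a small sketch. It assumes every LLM round trip goes right independently with the same probability p, which is a simplification, and the 0.98 figure is purely illustrative rather than measured.

```python
def chain_success(p: float, n: int) -> float:
    """Probability that n independent steps all succeed, each with probability p."""
    return p ** n


# Assumed per-step success rate of 0.98 (illustrative, not a benchmark).
for n in (1, 5, 20, 50):
    print(f"{n:>2} round trips: {chain_success(0.98, n):.2f}")
```

Under these assumed numbers, a 20-step chain finishes without a mistake only about two thirds of the time, which is why removing the model from each step matters more as n grows.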
How It Works
Let's say you have an Architect AI that is supposed to verify that blueprints created by the company all satisfy the required documentation before a product is shipped.
┌─────────────────────────────────────────────────────────────────┐
│                        Your Application                         │
│                                                                 │
│  agent = Agent('claude-sonnet-4-5-20250929')                    │
│                                                                 │
│  @agent.tool                                                    │
│  def search_docs(query: str) -> list[database_rows]: ...        │
│                                                                 │
│  @agent.tool                                                    │
│  def analyze_blueprints(file_location: str) -> dict: ...        │
│                                                                 │
│  @agent.tool                                                    │
│  def report_blueprint(pdf_location: str) -> None: ...           │
│                                                                 │
│  result = agent.codemode("verify structural integrity")         │
│             │                                                   │
└─────────────┼───────────────────────────────────────────────────┘
              │
              │ 1. Extract tools from agent
              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Claude Codemode                         │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  2. Generate agentRunner.py with tool definitions         │  │
│  │                                                           │  │
│  │  def search_docs(query: str) -> list[database_rows]:      │  │
│  │      """Search documentation, returns database rows."""   │  │
│  │      # ... implementation ...                             │  │
│  │                                                           │  │
│  │  def analyze_blueprints(file_location: str) -> dict:      │  │
│  │      """Analyze blueprints for issues."""                 │  │
│  │      # ... implementation ...                             │  │
│  │                                                           │  │
│  │  def report_blueprint(pdf_location: str) -> None:         │  │
│  │      """Report a blueprint for failing standards."""      │  │
│  │      # ... implementation ...                             │  │
│  │                                                           │  │
│  │  def main(params: str = "verify structural integrity"):   │  │
│  │      # TODO: Implement task using above tools             │  │
│  │      pass                                                 │  │
│  └───────────────────────────────────────────────────────────┘  │
│                          │                                      │
└──────────────────────────┼──────────────────────────────────────┘
                           │ 3. Spawn Claude Code
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                           Claude Code                           │
│                                                                 │
│  Instructions:                                                  │
│  "Implement the main() function to accomplish the task.         │
│   Use the provided tools by writing Python code."               │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Claude reads agentRunner.py                              │  │
│  │  Claude writes implementation:                            │  │
│  │                                                           │  │
│  │  def find_database_files_on_disk(database_row):           │  │
│  │      # Claude wrote this helper function!                 │  │
│  │      return database_row['file'].download().path          │  │
│  │                                                           │  │
│  │  def main():                                              │  │
│  │      # Search docs - returns database rows                │  │
│  │      database_rows = search_docs("structural integrity")  │  │
│  │                                                           │  │
│  │      # Analyze each blueprint and report failures         │  │
│  │      analyses = []                                        │  │
│  │      for row in database_rows:                            │  │
│  │          file_path = find_database_files_on_disk(row)     │  │
│  │          result = analyze_blueprints(file_path)           │  │
│  │          analyses.append(result)                          │  │
│  │          if not result["passes"]:                         │  │
│  │              report_blueprint(file_path)                  │  │
│  │                                                           │  │
│  │      return {"analyses": analyses}                        │  │
│  └───────────────────────────────────────────────────────────┘  │
│                          │                                      │
│                          │ 4. Execute agentRunner.py            │
│                          ▼                                      │
│             ┌────────────────────────┐                          │
│             │ python agentRunner.py  │                          │
│             └────────────────────────┘                          │
│                          │                                      │
└──────────────────────────┼──────────────────────────────────────┘
                           │ 5. Return result
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                         CodeModeResult                          │
│                                                                 │
│  {                                                              │
│    "output": {...},                                             │
│    "success": true,                                             │
│    "execution_log": "..."                                       │
│  }                                                              │
└─────────────────────────────────────────────────────────────────┘
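Steps 2, 4, and 5 of the diagram can be approximated with the standard library alone. The sketch below is not claude_codemode's actual implementation: render_runner and run_runner are hypothetical names, the generated stubs carry only signatures and docstrings rather than real implementations, and it assumes the finished script prints its result as JSON on stdout.

```python
import inspect
import json
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, List


def render_runner(tools: List[Callable], task: str) -> str:
    """Render a runner skeleton: one stub per tool plus an empty main()."""
    parts = []
    for fn in tools:
        signature = str(inspect.signature(fn))
        doc = inspect.getdoc(fn) or ""
        parts.append(
            f"def {fn.__name__}{signature}:\n"
            f'    """{doc}"""\n'
            f"    ...  # the real tool body would go here\n"
        )
    parts.append(
        f"def main(params: str = {task!r}):\n"
        f"    # TODO: Implement task using above tools\n"
        f"    pass\n"
    )
    return "\n\n".join(parts)


def run_runner(source: str) -> dict:
    """Run a finished script and pack a result shaped like the diagram's."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "agentRunner.py"
        script.write_text(source, encoding="utf-8")
        proc = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, timeout=60,
        )
    success = proc.returncode == 0
    try:
        # Assumption: a successful script prints JSON on stdout.
        output = json.loads(proc.stdout) if success else None
    except json.JSONDecodeError:
        output = proc.stdout
    return {"output": output, "success": success, "execution_log": proc.stderr}


def search_docs(query: str) -> list:
    """Search documentation, returns database rows."""
    return []


skeleton = render_runner([search_docs], "verify structural integrity")
compile(skeleton, "agentRunner.py", "exec")  # the skeleton is valid Python

# Stand-in for the script after the coding agent has filled in main().
finished = 'import json\nprint(json.dumps({"analyses": []}))\n'
result = run_runner(finished)
print(result["success"], result["output"])
```

Building stubs from inspect.signature and inspect.getdoc keeps the sketch independent of where the tools were defined; copying full implementations, as the diagram shows, would need the original source text as well.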
More Specifically Why It's Better
This problem has many of the elements an agent excels at: some ambiguity, a need to integrate pieces together, and formatting a nice output.
I also tried to illustrate Claude Code's ability to work out how to connect two pieces of software on the fly: here it figures out how to grab database_row['file'].download().path and wraps that in a helper function. I feel Claude Code is more likely to figure this out than any purpose-built tool that tries to do the same.
As the number of functions the agent needs to call grows very large, the Code Mode agent's advantage keeps increasing. The best LLM agents today can call tools in parallel, but it's difficult for them to call more than 10 tools in parallel. They can also call tools serially, but after 20 iterations they lose track of where they are, even with a todo list.
With a Code Mode agent, you can call an essentially unlimited number of the functions provided by the original agent definition, without waiting through a ridiculous number of agent iterations imposed by the serial or parallel limits of agents.
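As a rough illustration of that scaling, the sketch below makes 500 tool calls inside one program run. check_blueprint is a hypothetical tool invented for this example, and the thread pool size is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def check_blueprint(blueprint_id: int) -> dict:
    """Hypothetical tool: pretend every 50th blueprint fails its check."""
    return {"id": blueprint_id, "passes": blueprint_id % 50 != 0}


def main() -> dict:
    blueprint_ids = range(1, 501)
    # 500 tool calls fan out across worker threads within a single run,
    # with no model round trip between them.
    with ThreadPoolExecutor(max_workers=32) as pool:
        results = list(pool.map(check_blueprint, blueprint_ids))
    failures = [r["id"] for r in results if not r["passes"]]
    return {"checked": len(results), "failures": failures}


summary = main()
print(summary["checked"], len(summary["failures"]))
```

The loop and the bookkeeping live in ordinary code, so the count of calls is limited by the runtime rather than by how many steps a model can keep track of.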
My open source library that creates code mode agents, claude_codemode
https://github.com/ryx2/claude-codemode
from pydantic_ai import Agent
from claude_codemode import codemode

# Create an agent with tools
agent = Agent('claude-sonnet-4-5-20250929')

@agent.tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 72°F"

@agent.tool
def calculate_temp_diff(temp1: str, temp2: str) -> str:
    """Calculate temperature difference."""
    import re
    t1 = int(re.search(r'(\d+)', temp1).group(1))
    t2 = int(re.search(r'(\d+)', temp2).group(1))
    return f"{abs(t1 - t2)}°F"

# Use codemode instead of run
result = codemode(
    agent,
    "Compare weather between San Francisco and New York"
)
print(result.output)
Inspirations
- Cloudflare's blog post introducing the code mode concept
- Theo's t3 chat video for making me aware of this approach
- Early MCP implementation by jx-codes