94 changes: 94 additions & 0 deletions .github/CODING_AGENT_ISSUE.md
@@ -0,0 +1,94 @@
# GitHub Issue: CodingAgent Feature Request

Use this content to create an issue at:
https://github.com/google/adk-python/issues/new?template=feature_request.md

---

## Title

feat(agents): Add CodingAgent (agents that think in code)

---

## Is your feature request related to a problem? Please describe.

ADK’s current default agent interaction pattern is “tool selection from a fixed action set”. This is powerful, but it breaks down for three increasingly common workloads:

1) Long-context work beyond model context windows
- Many real tasks require operating over very large corpora: codebases, logs, datasets, multi-file configs, or long documents.
- If the agent must keep the relevant source text and intermediate results inside the LLM context, it becomes context-window bound and expensive.
- Recent work such as “Recursive Language Models” (arXiv:2512.24601) proposes treating long prompts as an external environment and letting the model programmatically examine/decompose/recursively process snippets. This suggests a practical direction for agents: move heavy inspection, decomposition, and intermediate state out of the prompt and into an execution environment.
- https://arxiv.org/abs/2512.24601

2) Expressiveness and composability limits of pure tool-calling
- Tool-calling assumes we can enumerate actions up-front. In open-ended tasks, the agent needs to compose multiple operations, iterate, cache intermediate artifacts, and implement “one-off” transformations without requiring new bespoke tools each time.
- A code-based action space lets the agent compose operations naturally (loops, conditionals, helper functions), which reduces the need for an explosion of tools.

3) Developer experience gap for building “coding agents” and sub-agent architectures
- Users increasingly want agent systems like Claude Code / OpenCode: multi-step coding workflows with sub-agents (planner, tester, refactorer, etc.) and strong “think in code” execution.
- ADK has strong orchestration primitives; adding a first-class code-executing agent unlocks building these systems within ADK while keeping sandboxing and tool integration.

Related inspiration: HuggingFace “smolagents” positions CodeAgent as a first-class concept (“agents that think in code”) and supports sandbox backends (Docker, etc.).
- https://github.com/huggingface/smolagents

---

## Describe the solution you’d like

Add a new experimental agent type: CodingAgent.

CodingAgent should:
- Generate Python code as the primary action representation (in `tool_code` blocks).
- Execute that code in a sandboxed environment (Docker-based initially).
- Allow generated code to call ADK tools safely via an IPC bridge (e.g., HTTP) rather than exposing the host runtime directly.
- Support iterative execution (ReAct-style loop): generate → run → observe stdout/tool results → refine → final answer.
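
A single turn of that loop might look like the following. This is purely illustrative: the `fetch_url` tool stub and the dataset URL are hypothetical, and the exact prompt/format details are up to the implementation.

```python
# Model-emitted tool_code, executed inside the sandbox (never on the host).
csv_text = fetch_url("https://example.com/data.csv")  # ADK tool call, bridged over IPC
rows = csv_text.strip().splitlines()
print(f"{len(rows) - 1} data rows; header: {rows[0]}")
```

Whatever the code prints becomes the observation for the next turn, and the model either emits another `tool_code` block or returns its final answer.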

Why this solves the problem
- Long-context: aligns with the “external environment” framing in arXiv:2512.24601 by enabling the agent to iteratively inspect, decompose, and process large inputs using code and persisted artifacts, instead of forcing all content into the model context.
- Composability: code enables arbitrary composition (loops, conditionals, helper functions) without requiring every combination to be implemented as a first-class tool.
- Coding-agent architectures: makes it straightforward to build higher-level workflows and multi-agent hierarchies where sub-agents can generate/run code for specialized tasks.

High-level architecture

User → CodingAgent (LLM) → sandbox executor (Docker Python)
↘ tool IPC server on host ↙

Proposed execution environments (progressive)
- v1: Docker Python sandbox (existing ContainerCodeExecutor integration)
- future: REPL / Jupyter-kernel style execution modes for interactive, stateful sessions (still sandboxed)
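
As a rough usage sketch of the v1 wiring: the `CodingAgent` constructor below is hypothetical (assumed to mirror `LlmAgent`), while `ContainerCodeExecutor` and its `image` argument are assumed from the existing ADK code executors.

```python
from google.adk.agents import CodingAgent                     # proposed; does not exist yet
from google.adk.code_executors import ContainerCodeExecutor   # existing Docker-based executor


def fetch_url(url: str) -> str:
    """Example ADK tool: fetch text content from a URL."""
    import urllib.request
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()


agent = CodingAgent(
    name="data_analyst",
    model="gemini-2.0-flash",
    instruction="Solve tasks by writing and running Python code.",
    tools=[fetch_url],
    code_executor=ContainerCodeExecutor(image="python:3.11-slim"),
)
```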

---

## Describe alternatives you’ve considered

1) “Just add a code-execution tool” to existing agents
- Pros: minimal surface-area change.
- Cons: code execution becomes an occasional tool call rather than the agent’s primary action space; harder to support tight generate→execute→iterate loops and long-context strategies that rely on an external environment.

2) Require users to write bespoke tools for every operation
- Pros: explicit and controlled.
- Cons: does not scale; real workflows need ad-hoc transformations and composition that explode the tool surface area.

3) Run code on the host interpreter
- Pros: simplest.
- Cons: unacceptable security risk; sandboxing is required for a general-purpose code agent.

---

## Additional context

Future directions enabled by CodingAgent
- Long-context scaffolds inspired by arXiv:2512.24601: treat large inputs (files, repo trees, logs) as an “environment” the agent queries/decomposes recursively using code, storing intermediate state outside the LLM context.
- Sub-agent coding workflows (Claude Code / OpenCode style): planner/tester/refactor sub-agents coordinated by ADK, each using code execution.
- Multiple sandbox backends (like smolagents): Docker initially, with optional future support for other sandboxes and interactive execution modes.

Links
- smolagents (inspiration): https://github.com/huggingface/smolagents
- Recursive Language Models (long-context framing): https://arxiv.org/abs/2512.24601

Labels to add
- enhancement
- agents
- new-feature
- experimental
148 changes: 148 additions & 0 deletions .github/CODING_AGENT_PLAN.md
@@ -0,0 +1,148 @@
# CodingAgent - Implementation Plan & Status

This document tracks the implementation of CodingAgent, an experimental agent type that generates and executes Python code in sandboxed containers.

## Overview

CodingAgent is a ReAct-style agent that:
- Uses an LLM (Gemini) to generate Python code that solves the task
- Executes code in sandboxed Docker containers
- Calls ADK tools from generated code via HTTP IPC
- Iterates until a final answer is produced
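
A condensed sketch of that loop is below. It is simplified pseudocode, not the actual `coding_agent.py` implementation; the model call and sandbox execution are injected as callables with hypothetical signatures.

```python
from typing import Callable


def react_loop(
    task: str,
    llm_generate: Callable[[list[str]], str],   # model call (hypothetical signature)
    execute_in_sandbox: Callable[[str], str],   # Docker execution (hypothetical)
    max_turns: int = 10,
) -> str:
    """Simplified generate -> execute -> observe loop."""
    history = [task]
    for _ in range(max_turns):
        reply = llm_generate(history)
        if "```tool_code" not in reply:                          # no code block => final answer
            return reply
        code = reply.split("```tool_code")[1].split("```")[0]    # naive block extraction
        observation = execute_in_sandbox(code)
        history += [reply, f"Execution result:\n{observation}"]
    return "Stopped: reached max_turns without a final answer."
```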

## Implementation Status

### Core Components ✅ Complete

| Component | File | Status | Lines |
|-----------|------|--------|-------|
| CodingAgent | `src/google/adk/agents/coding_agent.py` | ✅ Complete | ~610 |
| CodingAgentConfig | `src/google/adk/agents/coding_agent_config.py` | ✅ Complete | ~225 |
| CodingAgentCodeExecutor | `src/google/adk/code_executors/coding_agent_code_executor.py` | ✅ Complete | ~505 |
| ToolCodeGenerator | `src/google/adk/code_executors/tool_code_generator.py` | ✅ Complete | ~475 |
| ToolExecutionServer | `src/google/adk/code_executors/tool_execution_server.py` | ✅ Complete | ~365 |
| AllowlistValidator | `src/google/adk/code_executors/allowlist_validator.py` | ✅ Complete | ~355 |

### Sample Agent ✅ Complete

| File | Status | Description |
|------|--------|-------------|
| `contributing/samples/coding_agent/agent.py` | ✅ Complete | Data Analysis Agent (~360 lines) |
| `contributing/samples/coding_agent/README.md` | ✅ Complete | Documentation (~290 lines) |
| `contributing/samples/coding_agent/__init__.py` | ✅ Complete | Module init |

### Unit Tests ✅ Complete

| Test File | Status | Lines |
|-----------|--------|-------|
| `tests/unittests/agents/test_coding_agent.py` | ✅ Complete | ~310 |
| `tests/unittests/code_executors/test_allowlist_validator.py` | ✅ Complete | ~320 |
| `tests/unittests/code_executors/test_tool_code_generator.py` | ✅ Complete | ~320 |

### Manual E2E Tests ✅ Passed

| Test Scenario | Status | Notes |
|--------------|--------|-------|
| Basic math query ("What is 25 * 17?") | ✅ Passed | Returns 425 |
| Data analysis (Titanic survival rate) | ✅ Passed | Returns 38.38% |
| Visualization (bar chart by class) | ✅ Passed | Chart saved to host |
| Multi-step analysis | ✅ Passed | Stats + visualization + insights |
| Tool calling via HTTP IPC | ✅ Passed | fetch_url, save_chart work |
| Error handling (pip warnings) | ✅ Passed | Ignores non-fatal stderr |
| Chart saving to host system | ✅ Passed | Saved to /tmp/adk_charts/ |

## Architecture

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   User Query    │────▶│   CodingAgent    │────▶│ Docker Container│
│                 │     │  (Gemini LLM)    │     │  (Python 3.11)  │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                 │                        │
                                 │                        │  Executes
                                 ▼                        │  generated code
                          ┌──────────────┐                │
                          │ Tool Server  │◀───────────────┘
                          │  (HTTP IPC)  │   Tool calls via HTTP
                          └──────────────┘
```

### How Tool IPC Works

1. CodingAgent starts ToolExecutionServer on host (port 8765)
2. Code is generated with tool stubs that make HTTP POST requests
3. Container reaches host via `host.docker.internal` (macOS/Windows) or bridge gateway (Linux)
4. Tool server executes actual tool functions with proper context
5. Results returned to container via HTTP response
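
For example, the stub generated in step 2 for `fetch_url` might look roughly like this inside the container. The `/run_tool` endpoint name, JSON shape, and environment variable are illustrative; the actual stubs come from `tool_code_generator.py`.

```python
import json
import os
import urllib.request

# Host address differs by platform: host.docker.internal on macOS/Windows,
# the Docker bridge gateway IP on Linux.
TOOL_SERVER = os.environ.get("ADK_TOOL_SERVER", "http://host.docker.internal:8765")


def fetch_url(url: str) -> str:
    """Injected tool stub: proxies the call to the host-side ToolExecutionServer."""
    payload = json.dumps({"tool": "fetch_url", "args": {"url": url}}).encode()
    req = urllib.request.Request(
        f"{TOOL_SERVER}/run_tool",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]
```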

## Key Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Container image | `python:3.11-slim` + runtime pip | Simpler for users, no custom Dockerfile |
| Tool communication | HTTP IPC | Works across the container boundary without exposing the host runtime |
| Import validation | Allowlist-based | Security without blocking legitimate use |
| Chart saving | `save_chart` tool | Transfers data to host filesystem |
| Error handling | Distinguish warnings from errors | pip warnings shouldn't fail execution |

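A minimal sketch of the allowlist-based import validation mentioned above (the actual rules live in `allowlist_validator.py`; the allowlist contents here are illustrative):

```python
import ast

ALLOWED_MODULES = {"json", "math", "statistics", "pandas", "matplotlib"}  # illustrative


def validate_imports(code: str) -> list[str]:
    """Return the imported top-level modules in `code` that are not allowlisted."""
    violations = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] not in ALLOWED_MODULES:
                violations.append(name)
    return violations
```
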
## Sample Agent: Data Analyst

### Tools Available

| Tool | Description |
|------|-------------|
| `fetch_url(url)` | Fetch CSV/JSON/text from URLs |
| `get_sample_datasets()` | List available datasets (Titanic, Iris, Tips) |
| `get_current_time()` | Get current timestamp |
| `save_chart(image_data, filename)` | Save base64 chart to host |
| `list_saved_charts()` | List saved charts |

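On the host side, `save_chart` can be as simple as the sketch below (illustrative; the actual implementation is in `contributing/samples/coding_agent/agent.py`, and the chart directory matches the manual E2E tests above):

```python
import base64
import os

CHART_DIR = "/tmp/adk_charts"  # host directory used in the manual E2E tests


def save_chart(image_data: str, filename: str) -> str:
    """Decode a base64-encoded chart produced in the container and save it on the host."""
    os.makedirs(CHART_DIR, exist_ok=True)
    path = os.path.join(CHART_DIR, os.path.basename(filename))  # avoid path traversal
    with open(path, "wb") as f:
        f.write(base64.b64decode(image_data))
    return path
```
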
### Example Queries

1. "What is the survival rate on the Titanic?"
2. "Create a bar chart showing survival rate by passenger class"
3. "Analyze the iris dataset and create a scatter plot colored by species"
4. "Perform comprehensive analysis: stats, survival rates, visualization, insights"

## Files Changed Summary

```
.github/CODING_AGENT_PLAN.md                                | Plan document
contributing/samples/coding_agent/README.md                 | 290 lines
contributing/samples/coding_agent/__init__.py               | 17 lines
contributing/samples/coding_agent/agent.py                  | 360 lines
src/google/adk/agents/__init__.py                           | +2 exports
src/google/adk/agents/coding_agent.py                       | 610 lines
src/google/adk/agents/coding_agent_config.py                | 225 lines
src/google/adk/code_executors/__init__.py                   | +6 exports
src/google/adk/code_executors/allowlist_validator.py        | 355 lines
src/google/adk/code_executors/coding_agent_code_executor.py | 505 lines
src/google/adk/code_executors/tool_code_generator.py        | 475 lines
src/google/adk/code_executors/tool_execution_server.py      | 365 lines
tests/unittests/agents/test_coding_agent.py                 | 310 lines
tests/unittests/code_executors/test_allowlist_validator.py  | 320 lines
tests/unittests/code_executors/test_tool_code_generator.py  | 320 lines
```

**Total: ~4,200 lines of new code**

## PR Checklist

- [x] Implementation complete
- [x] Unit tests written and passing
- [x] Manual E2E tests passing
- [x] Sample agent created with README
- [x] Code follows ADK style guide (relative imports, `from __future__ import annotations`)
- [x] Marked as `@experimental`
- [ ] Run `./autoformat.sh` before PR
- [ ] Run full test suite: `pytest tests/unittests`
- [ ] Create GitHub issue (see `.github/CODING_AGENT_ISSUE.md`)
- [ ] Submit PR with testing plan

## Future Enhancements (Out of Scope)

- Stateful execution (persist variables across turns)
- Custom container images with pre-installed packages
- VertexAI code execution integration
- Support for JavaScript/TypeScript
- Streaming output during execution