@Sudhendra

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

1. Link to an existing issue (if applicable): #4198

2. Or, if no issue exists, describe the change:

This PR implements the feature request described in issue #4198.

Description

Problem:
ADK's current default agent interaction pattern is "tool selection from a fixed action set." This breaks down for several increasingly common workloads:

  1. Long-context work beyond model context windows: Many real tasks require operating over very large corpora (codebases, logs, datasets, multi-file configs). If the agent must keep all relevant content in the LLM context, it becomes context-window bound and expensive.

  2. Expressiveness and composability limits of pure tool-calling: Tool-calling assumes enumerable actions. Open-ended tasks require composing operations, iterating, caching intermediate artifacts, and implementing one-off transformations without requiring new bespoke tools each time.

  3. Developer experience gap for building "coding agents": Users want agent systems like Claude Code/OpenCode with multi-step coding workflows, sub-agents, and strong "think in code" execution capabilities.

Solution:
Introduce a new experimental agent type: CodingAgent - an agent that generates and executes Python code as its primary action representation.

Key features:

  • ReAct-style execution loop: Generate code → Execute → Observe results → Refine → Final answer (sketched below)
  • Sandboxed execution: Code runs in Docker containers via ContainerCodeExecutor for security
  • Tool integration via HTTP IPC: Generated code can call ADK tools through a ToolExecutionServer running on the host
  • Import validation: AllowlistValidator rejects any import that is not on a configurable allowlist
  • Stateful execution: Optional state persistence across iterations
  • Full telemetry: OpenTelemetry spans for code generation, execution, and LLM calls
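
A minimal sketch of the loop, assuming hypothetical llm and executor interfaces (the actual implementation lives in coding_agent.py):

# Pseudocode sketch of the ReAct-style loop; `llm` and `executor` are
# hypothetical interfaces, not the real CodingAgent internals.
def run_react_loop(llm, executor, task: str, max_iterations: int) -> str:
    observations: list[str] = []
    for _ in range(max_iterations):
        code = llm.generate_code(task, observations)   # Generate code
        result = executor.execute(code)                # Execute in sandbox
        observations.append(result.output)             # Observe results
        if result.is_final_answer:                     # Finish, else refine
            return result.output
    return observations[-1] if observations else ""   # best-effort fallback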

Architecture:

User → CodingAgent (LLM) → Docker Container (Python sandbox)
                               ↓
                         Tool Server (HTTP IPC on host)
                               ↓
                         ADK Tools with ToolContext
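
To make the Tool Server hop concrete, the sketch below shows the kind of in-container stub the generator could emit; the /execute endpoint, payload shape, and host address are assumptions, not the actual wire protocol:

# Illustrative only: a generated stub that proxies a tool call from the
# sandbox to the host-side ToolExecutionServer over HTTP. Endpoint path,
# payload shape, and host/port are assumptions.
import json
import urllib.request

TOOL_SERVER_URL = "http://host.docker.internal:8011"  # assumed host/port

def save_chart(filename: str, image_base64: str) -> dict:
    """Forward the call to the host, where the real tool runs with ToolContext."""
    payload = json.dumps({
        "tool": "save_chart",
        "args": {"filename": filename, "image_base64": image_base64},
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{TOOL_SERVER_URL}/execute",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))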

New files introduced:

  • src/google/adk/agents/coding_agent.py - Main CodingAgent class (usage sketch after this list)
  • src/google/adk/agents/coding_agent_config.py - Pydantic configuration
  • src/google/adk/code_executors/coding_agent_code_executor.py - Executor wrapper with tool injection
  • src/google/adk/code_executors/tool_execution_server.py - FastAPI server for tool IPC
  • src/google/adk/code_executors/tool_code_generator.py - System prompt and stub generation
  • src/google/adk/code_executors/allowlist_validator.py - Import validation
  • contributing/samples/coding_agent/ - Sample data analysis agent with documentation
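
To show how these pieces might be wired together, here is a hedged usage sketch; the constructor and config parameters are assumptions inferred from the file names above, not a confirmed API:

# Hypothetical usage sketch; parameter names are assumptions, not the
# confirmed CodingAgent API.
from google.adk.agents.coding_agent import CodingAgent
from google.adk.agents.coding_agent_config import CodingAgentConfig

def fetch_csv(url: str) -> str:
    """Example plain-function tool: download a CSV and return it as text."""
    import urllib.request
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")

agent = CodingAgent(
    name="data_analyst",
    model="gemini-2.0-flash",                    # any ADK-supported model
    tools=[fetch_csv],                           # functions or BaseTool instances
    config=CodingAgentConfig(max_iterations=5),  # assumed field name
)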

Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Test files added:

  • tests/unittests/agents/test_coding_agent.py - Tests for CodingAgent, CodingAgentConfig, CodingAgentState
  • tests/unittests/code_executors/test_tool_code_generator.py - Tests for prompt generation, tool stubs, runtime header
  • tests/unittests/code_executors/test_allowlist_validator.py - Tests for import validation

Test coverage includes:

  • CodingAgentConfig default values and validation (max_iterations bounds, port bounds)
  • CodingAgentState serialization and history tracking
  • CodingAgent creation with default and custom configurations
  • Code block extraction (tool_code and python blocks, preference order; example test sketched after this list)
  • Error feedback formatting
  • Tool resolution from functions and BaseTool instances
  • Tool stub generation with type hints and docstrings
  • Runtime header generation with trace collection
  • System prompt generation with tool documentation
  • Import allowlist validation
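
For instance, a test of the code block extraction preference order might look like the following sketch (the helper name extract_code_block is an assumption):

# Hypothetical test sketch for the "preference order" coverage item;
# the extraction helper's name and signature are assumptions.
def test_prefers_tool_code_block_over_python_block():
    response = (
        "```python\nprint('fallback')\n```\n"
        "```tool_code\nprint('preferred')\n```\n"
    )
    assert extract_code_block(response) == "print('preferred')"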

pytest commands:

# Run all CodingAgent tests
pytest tests/unittests/agents/test_coding_agent.py -v

# Run tool code generator tests
pytest tests/unittests/code_executors/test_tool_code_generator.py -v

# Run allowlist validator tests
pytest tests/unittests/code_executors/test_allowlist_validator.py -v

# Run all related tests
pytest tests/unittests/agents/test_coding_agent.py tests/unittests/code_executors/test_tool_code_generator.py tests/unittests/code_executors/test_allowlist_validator.py -v

Manual End-to-End (E2E) Tests:

Prerequisites:

  • Docker installed and running
  • GOOGLE_API_KEY environment variable set

Test the sample agent:

# Interactive CLI mode
adk run contributing/samples/coding_agent

# Web UI mode
adk web contributing/samples
# Navigate to http://localhost:8000 and select coding_agent

Example test interactions:

  1. Basic data analysis:

    User: What is the survival rate on the Titanic dataset?
    Expected: Agent fetches the Titanic CSV, analyzes it with pandas, and returns a ~38.4% survival rate (illustrative code after this list)
    
  2. Visualization with chart saving:

    User: Create a bar chart showing survival rate by passenger class on the Titanic
    Expected: Agent creates matplotlib chart, saves via save_chart tool to /tmp/adk_charts/
    
  3. Multi-step analysis:

    User: Analyze the iris dataset and give me key insights
    Expected: Agent iteratively fetches data, runs statistical analysis, potentially creates visualizations
    
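For interaction 1, the agent is expected to generate code along these lines (illustrative only; the dataset URL is an assumption, using the common seaborn-data mirror):

# Illustrative: the kind of code the agent might generate for interaction 1.
# The dataset URL is an assumption.
import pandas as pd

df = pd.read_csv(
    "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/titanic.csv"
)
survival_rate = df["survived"].mean() * 100
print(f"Overall survival rate: {survival_rate:.1f}%")  # ~38.4%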

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

Design decisions:

  1. Experimental decorator: CodingAgent is marked with @experimental to indicate this is a new feature that may evolve.

  2. Default to ContainerCodeExecutor: Security-first approach - code executes in isolated Docker containers by default. Users can supply custom executors if needed.

  3. HTTP IPC over direct execution: Tools run on the host, not in containers. This maintains security isolation while allowing full ToolContext capabilities.

  4. Import allowlist: DEFAULT_SAFE_IMPORTS provides a conservative set of safe modules. Users can extend it for specific use cases, e.g. adding pandas.* for data analysis (see the sketch after this list).

  5. ReAct-style iteration: The agent can observe execution results and iteratively refine its approach, similar to how human developers debug code.
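
A minimal sketch of extending the allowlist for a data-analysis agent, assuming the config field is named authorized_imports (the exact field name is an assumption):

# Sketch of extending the import allowlist; `authorized_imports` is an
# assumed field name on CodingAgentConfig.
from google.adk.code_executors.allowlist_validator import DEFAULT_SAFE_IMPORTS

config = CodingAgentConfig(
    authorized_imports=[*DEFAULT_SAFE_IMPORTS, "pandas", "pandas.*", "numpy"],
)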

Future enhancements enabled by this architecture:

  • REPL/Jupyter-kernel style execution modes for interactive sessions
  • Long-context scaffolds inspired by arXiv:2512.24601
  • Sub-agent coding workflows (planner/tester/refactor sub-agents)
  • Additional sandbox backends beyond Docker

Related inspiration:

  • CodeAct (ICML 2024) and DynaSaur (COLM 2025) - research foundation for code-as-action agents
  • HuggingFace smolagents - primary inspiration for code-thinking agents

Documentation:

  • Sample agent README: contributing/samples/coding_agent/README.md
  • Technical documentation: contributing/samples/coding_agent/CODING_AGENT.md

Footnotes

This new PR replaces the old #4259 and implements the following in addition to CodingAgent.

Summary of new changes

  • Deduplicates DEFAULT_SAFE_IMPORTS by importing the canonical value from
    allowlist_validator.py and extending it via _EXTENDED_SAFE_IMPORTS for
    CodingAgent defaults (pattern sketched below).
  • Resolves the merge conflict in src/google/adk/telemetry/tracing.py by
    syncing with upstream and re-adding CodingAgent tracing helpers
    (trace_code_generation, trace_code_execution, trace_import_validation,
    trace_tool_ipc).
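
The deduplication pattern in the first bullet roughly amounts to the following sketch (names taken from the summary above; the concrete import lists are placeholders):

# Sketch of the dedup pattern in coding_agent_config.py: import the
# canonical allowlist and extend it. Values are placeholders.
from google.adk.code_executors.allowlist_validator import DEFAULT_SAFE_IMPORTS

_DATA_SCIENCE_IMPORTS = ["numpy", "pandas", "scipy", "matplotlib"]
_EXTENDED_SAFE_IMPORTS = [*DEFAULT_SAFE_IMPORTS, *_DATA_SCIENCE_IMPORTS]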

Testing

  • pytest tests/unittests/agents/test_coding_agent.py
  • pytest tests/unittests/code_executors/test_allowlist_validator.py

Sudhendra and others added 8 commits January 25, 2026 17:58
…ngAgent

- Add save_chart tool to save visualizations to host filesystem
- Add list_saved_charts tool to list saved charts
- Add _is_real_error method to distinguish between warnings and errors
- Fix pip warnings being treated as execution errors
- Update system prompt with package installation instructions
- Add base64 to authorized imports for chart encoding
- Update README with new tool documentation
- Create GitHub issue template for CodingAgent feature
…epth

- Add research foundation from CodeAct (ICML 2024) and DynaSaur (COLM 2025)
- Reference HuggingFace smolagents as inspiration (25k+ GitHub stars)
- Expand problem statement with context window bottleneck analysis
- Add detailed alternatives considered section with rationale
- Include future roadmap for stateful execution and alternative sandboxes
- Add concrete user pain points and how CodingAgent solves them
- Refocus motivation on arXiv:2512.24601 long-context external environment framing
- Keep smolagents as primary inspiration for code-thinking agents
- Tighten solution, alternatives, and future directions per ADK template
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…y with upstream

Addresses Gemini code review suggestion (PR google#4259):
- Remove duplicate DEFAULT_SAFE_IMPORTS from coding_agent_config.py
- Import DEFAULT_SAFE_IMPORTS from allowlist_validator.py (canonical source)
- Create _DATA_SCIENCE_IMPORTS for numpy/pandas/scipy/matplotlib packages
- Create _EXTENDED_SAFE_IMPORTS combining both for CodingAgentConfig default

Resolves merge conflict in telemetry/tracing.py:
- Sync with upstream main (new OTEL improvements, proper semconv imports)
- Add CodingAgent-specific tracing functions: trace_code_generation,
  trace_code_execution, trace_import_validation, trace_tool_ipc

Updates test to use _EXTENDED_SAFE_IMPORTS from coding_agent_config.py
@gemini-code-assist
Contributor

Summary of Changes

Hello @Sudhendra, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant new agent type, the CodingAgent, designed to overcome the limitations of traditional tool-calling agents when dealing with long-context workloads, complex task expressiveness, and the need for multi-step coding workflows. By enabling agents to 'think in code' and execute Python in a secure, sandboxed environment, it expands the capabilities of ADK to handle more open-ended and iterative problem-solving scenarios, bridging a gap in developer experience for building advanced AI systems.

Highlights

  • New CodingAgent Type: Introduced a new experimental CodingAgent that generates and executes Python code in a sandboxed environment, addressing limitations of fixed action sets for complex tasks.
  • Sandboxed Code Execution: Code generated by the agent runs securely within Docker containers via ContainerCodeExecutor, ensuring isolation from the host system.
  • Tool Integration via HTTP IPC: Generated Python code can safely call existing ADK tools through a ToolExecutionServer running on the host, using HTTP for inter-process communication.
  • Import Validation and Security: Implemented an AllowlistValidator to ensure that only authorized Python imports are permitted within the generated code, enhancing security.
  • ReAct-style Iteration and Stateful Execution: The agent employs a ReAct-style execution loop (Generate Code → Execute → Observe → Refine) and supports optional state persistence across iterations for multi-step tasks.
  • Comprehensive Telemetry: Integrated OpenTelemetry spans for key operations including code generation, execution, import validation, and tool IPC, providing detailed observability.
  • Data Analysis Sample Agent: A fully functional sample Data Analysis Agent is included, demonstrating the CodingAgent's capabilities in fetching, analyzing, and visualizing data.


@adk-bot added the core [Component] This issue is related to the core interface and implementation label Jan 26, 2026
@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a significant and well-designed new feature: the CodingAgent. The architecture, which uses sandboxed Docker containers for code execution and an HTTP IPC mechanism for tool calls, provides a robust and secure way for agents to "think in code". The implementation is comprehensive, covering configuration, state management, security via import allowlisting, and detailed telemetry. The included sample data analysis agent is an excellent demonstration of the new capabilities.

My review focuses on a few areas for improvement, primarily concerning resource management reliability (the use of __del__) and making some of the generated code and error handling even more robust. Overall, this is a high-quality contribution.

Sudhendra and others added 2 commits January 25, 2026 18:18
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

feat(agents): Add CodingAgent (agents that think in code)
