Conversation

@LiamConnell

Summary

This PR adds a new sample implementing Recursive Language Models (RLM) using ADK and Gemini models.

RLM enables LLMs to handle near-infinite length contexts by programmatically examining, decomposing, and recursively calling themselves through a Python REPL environment. This is an implementation of the concepts from the paper "Enabling Near-Infinite Length Context with Recursive Language Models" adapted to use Google's ADK.

Key Features

  • Recursive LLM Calls: LLMs can spawn sub-LLMs to analyze context chunks, with configurable max depth
  • Sandboxed Python REPL: Safe code execution environment with restricted builtins (a minimal sketch follows this list)
  • Streaming Events: Real-time event streaming for UI integration (rlm.iteration.start, rlm.code.found, rlm.final.answer, etc.)
  • Multi-Turn Persistence: Maintain conversation state across turns using ADK sessions
  • JSONL Logging: Structured logs for debugging and visualization, compatible with the RLM visualizer
  • File Loading: Lazy loading from local filesystem and Google Cloud Storage
  • Web UI: FastAPI-based interface with WebSocket streaming and Tokyo Night theme
  • CLI: Interactive REPL with Rich console output
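
To make the sandboxing idea concrete, here is a minimal sketch of a restricted-builtins REPL. It is illustrative only: the names (SandboxedRepl, SAFE_BUILTINS) and the allowlist are assumptions, not the sample's actual implementation.

import builtins
import contextlib
import io

# Hypothetical allowlist; the sample's real set of permitted builtins may differ.
_ALLOWED = ("abs", "enumerate", "len", "max", "min", "print", "range", "sum")
SAFE_BUILTINS = {name: getattr(builtins, name) for name in _ALLOWED}


class SandboxedRepl:
    """Executes code with restricted builtins; locals persist across turns."""

    def __init__(self):
        self.locals = {}

    def run(self, code: str) -> str:
        env = {"__builtins__": SAFE_BUILTINS, **self.locals}
        stdout = io.StringIO()
        with contextlib.redirect_stdout(stdout):
            exec(code, env)  # only SAFE_BUILTINS are reachable from user code
        self.locals = {k: v for k, v in env.items() if k != "__builtins__"}
        return stdout.getvalue()


repl = SandboxedRepl()
print(repl.run("x = sum(range(5))\nprint(x)"))  # -> 10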

Architecture

The sample demonstrates several ADK patterns:

  • Custom BaseAgent implementation (RLMAgent)
  • Custom BaseCodeExecutor for sandboxed REPL
  • Streaming events via AsyncGenerator[Event, None] (sketched below)
  • Session persistence with DatabaseSessionService
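
A minimal sketch of a custom streaming agent follows, assuming ADK's public Python API (google.adk); EchoAgent is a toy stand-in for RLMAgent's actual iteration loop, not the sample's code.

from typing import AsyncGenerator

from google.adk.agents import BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event
from google.genai import types


class EchoAgent(BaseAgent):
    """Toy custom agent: emits one streamed Event per invocation."""

    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # A real RLMAgent would loop here: call the model, execute REPL
        # code, and yield intermediate events (rlm.iteration.start, ...).
        yield Event(
            author=self.name,
            content=types.Content(
                role="model", parts=[types.Part(text="hello from the agent")]
            ),
        )

Session persistence then comes from running such an agent with DatabaseSessionService (e.g. DatabaseSessionService(db_url="sqlite:///sessions.db")) instead of the in-memory service.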

Test Plan

Unit Tests

The sample includes comprehensive unit tests that can be run without API access:

cd contributing/samples/rlm
uv pip install -e ".[dev]"
python -m pytest tests/ --ignore=tests/test_e2e.py --ignore=tests/test_gcs_integration.py -v

Tests cover:

  • REPL execution and sandboxing (test_repl.py)
  • Code block parsing (test_parsing.py)
  • File loading and lazy evaluation (test_files.py)
  • Multi-turn conversation handling (test_multi_turn.py)
  • Event streaming (test_simple_llm_events.py)
  • Usage tracking (test_usage.py)

E2E Tests (with LLM)

RLM_E2E_TESTS=true python -m pytest tests/test_e2e.py -v

Web UI Tests (with Playwright)

python -m pytest tests/ui/ -v  # Mocked WebSocket
RLM_E2E_TESTS=true python -m pytest tests/e2e/ -v  # Real server

Manual Testing

# Interactive CLI
python -m adk_rlm.cli

# Web interface
python -m adk_rlm.web

# ADK built-in web interface
adk web adk_rlm.agent

@gemini-code-assist

Summary of Changes

Hello @LiamConnell, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new sample that implements Recursive Language Models (RLM) using Google's Agent Development Kit (ADK) and Gemini models. The RLM approach allows large language models to process extensive contexts by breaking down problems, executing code in a controlled environment, and recursively calling themselves. This enables near-infinite context length handling and provides detailed insights into the model's reasoning process through comprehensive event streaming and logging.

Highlights

  • Recursive LLM Calls: LLMs can spawn sub-LLMs to analyze context chunks, with configurable max depth, enabling hierarchical decomposition of complex problems.
  • Sandboxed Python REPL: A safe code execution environment with restricted builtins is provided, allowing LLMs to programmatically interact with data.
  • Streaming Events: Real-time event streaming (e.g., 'rlm.iteration.start', 'rlm.code.found', 'rlm.final.answer') is implemented for UI integration and granular visibility into execution.
  • Multi-Turn Persistence: Conversation state and REPL variables are maintained across turns using ADK sessions, supporting continuous interaction.
  • File System Integration: Lazy loading from local filesystem and Google Cloud Storage is supported, allowing efficient handling of large contexts.
  • Web UI and CLI: A FastAPI-based web interface with WebSocket streaming and a Tokyo Night theme, and an interactive CLI with Rich console output, are provided for user interaction.
  • ADK Pattern Implementation: The sample demonstrates custom BaseAgent (RLMAgent), custom BaseCodeExecutor for the sandboxed REPL, streaming events via AsyncGenerator, and session persistence with DatabaseSessionService.



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new sample for Recursive Language Models (RLM) using ADK and Gemini models. The implementation includes a robust agent, a sandboxed Python REPL, streaming events, multi-turn persistence, and file handling with lazy loading. The accompanying CLI and Web UI provide interactive ways to engage with the RLM. Comprehensive unit and E2E tests are also included, along with deployment scripts for GCP. Overall, the changes are well-structured and demonstrate a strong understanding of the RLM pattern and ADK framework. I've identified a few areas for improvement related to Dockerfile best practices, logging configuration, and encapsulation, which are detailed in the specific comments.

RUN mkdir -p adk_rlm && echo '__version__ = "0.1.0"' > adk_rlm/__init__.py

# Install dependencies (this layer is cached unless pyproject.toml changes)
RUN uv pip install --system -e ".[all]" --index-url https://pypi.org/simple/

medium

Using --system with uv pip install inside a Dockerfile is generally not recommended. In a Docker image, the environment is already isolated, so installing into the system Python (even a slim one) bypasses uv's virtual environment management. It's often better to let uv manage a virtual environment within the container or use pip install directly if uv's virtual environment features aren't strictly needed in the final image layer.

EXPOSE 8080

# Run the web server
CMD ["sh", "-c", "python -m adk_rlm.web --host 0.0.0.0 --port $PORT"]

medium

It's generally better to use the exec form of CMD in Dockerfiles (e.g., CMD ["python", "-m", "adk_rlm.web", "--host", "0.0.0.0", "--port", "$PORT"]). The sh -c form runs the command as a child process of sh, which can lead to issues with signal handling (e.g., SIGTERM not being passed to the Python process) and process IDs. Using the exec form ensures that signals are correctly propagated to your application.

Comment on lines +20 to +27
_logger = _logging.getLogger(__name__)
if not _logger.handlers:
    _handler = _logging.StreamHandler()
    _handler.setFormatter(
        _logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    _logger.addHandler(_handler)
    _logger.setLevel(_logging.WARNING)

medium

Setting a default StreamHandler and logging level (WARNING) directly in __init__.py can interfere with an application's logging configuration. It's generally recommended for libraries to add a NullHandler to prevent "No handlers could be found for logger" messages if the application doesn't configure logging, and then provide a separate configure_logging function for explicit setup. This allows applications to have full control over logging output.
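
One common way to apply this suggestion, as a sketch of the standard library pattern rather than the sample's code:

import logging

# In the library's __init__.py: stay silent unless the app opts in.
logging.getLogger(__name__).addHandler(logging.NullHandler())


def configure_logging(level: int = logging.INFO) -> None:
    """Explicit opt-in helper for applications that want default output."""
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    logger = logging.getLogger(__name__)
    logger.addHandler(handler)
    logger.setLevel(level)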

Comment on lines +58 to +60
for handler in logger.handlers:
    handler.setLevel(level)
    handler.setFormatter(_logging.Formatter(format))

medium

The configure_logging function modifies existing handlers. If this function is called multiple times, it might lead to unintended modifications of handlers that were not meant to be reconfigured. For a complete reconfiguration, it's often safer to remove existing handlers before adding new ones, or ensure that the function is idempotent.
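
A sketch of an idempotent variant (the logger name "adk_rlm" is an assumption based on the package name):

import logging


def configure_logging(level: int, fmt: str) -> None:
    """Reconfigure from scratch: drop old handlers before adding a new one."""
    logger = logging.getLogger("adk_rlm")
    for handler in list(logger.handlers):  # copy the list; we mutate it below
        logger.removeHandler(handler)
        handler.close()
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    logger.setLevel(level)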

)

# Initialize private attributes
self._client = genai.Client(vertexai=True, location="global")

medium

Initializing genai.Client directly in the __init__ method can make testing more difficult and tightly couples the agent to a specific client configuration (e.g., vertexai=True, location="global"). Consider making the genai.Client configurable via dependency injection (passed as an argument) or initializing it lazily with a factory function. This improves testability and flexibility for different deployment environments.
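
A sketch of the dependency-injection idea (simplified: the actual RLMAgent is a pydantic BaseAgent subclass, so attribute handling may differ):

from typing import Optional

from google import genai


class RLMAgent:  # simplified stand-in for the sample's agent
    def __init__(self, client: Optional[genai.Client] = None):
        self._client = client  # injected in tests, built lazily otherwise

    @property
    def client(self) -> genai.Client:
        if self._client is None:
            self._client = genai.Client(vertexai=True, location="global")
        return self._client

Tests can then inject a stub or mock client without touching Vertex AI.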

# Execute code asynchronously while streaming child events in real-time
from google.adk.code_executors.code_execution_utils import CodeExecutionInput

# Reset queue state BEFORE starting the task to avoid race conditions

medium

This import statement is inside a loop. It should be moved to the top of the file to avoid repeated imports, which can have a minor performance impact and is generally considered bad practice for readability and maintainability.


# Also check REPL locals for FINAL_VAR pattern
if final_answer is None:
    final_answer = find_final_answer(response_text, executor._repl)

medium

Accessing executor._repl directly from RLMAgent violates encapsulation, as _repl is a private attribute of RLMCodeExecutor. It would be better if RLMCodeExecutor exposed a public method (e.g., get_final_answer_from_repl_state()) to retrieve this information, or if find_final_answer was designed to work with the executor object's public interface.
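
For example (a sketch; the method name follows the review's suggestion, and FINAL_VAR is the convention mentioned in the diff, not verified against the sample's code):

class RLMCodeExecutor:  # simplified stand-in for the sample's executor
    def __init__(self):
        self._repl_locals = {}  # stands in for the private _repl state

    def get_final_answer_from_repl_state(self):
        """Public accessor so RLMAgent never reaches into private state."""
        return self._repl_locals.get("FINAL_VAR")

The call site in RLMAgent then becomes final_answer = executor.get_final_answer_from_repl_state().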

Comment on lines +339 to +341
if lazy:
    file_collection = self.create_lazy_files(files)
    return {

medium

The if context is None: raise ValueError(...) check and the subsequent ctx = context assignment seem to belong to the completion function's logic rather than FileLoader's build_context method. This indicates a slight mixing of concerns. FileLoader's primary responsibility should be to load and process files, not to validate the overall context input for the RLM system. Consider moving this validation and context merging logic to the completion function or a dedicated context preparation utility.
