himattm commented Jan 23, 2026

Summary

  • Extract shared queue infrastructure into queue_core.py for code reuse between MCP server and CLI
  • Fix critical bug where cancelled sub-agents leave orphaned tasks that block the queue forever
  • Add PID reuse detection to distinguish dead processes from PIDs reused by unrelated processes (Chrome, etc.)
  • Add instance tracking (server_id) to detect orphaned tasks from crashed/restarted processes

Changes

New: queue_core.py

Shared infrastructure used by both task_queue.py (MCP server) and tq.py (CLI):

  • QueuePaths dataclass for consistent path management
  • Database functions: get_db(), init_db(), ensure_db()
  • Process management: is_process_alive(), is_task_queue_process(), kill_process_tree()
  • Queue cleanup: cleanup_queue() with orphan detection
  • Logging utilities
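
The two process checks listed above could look roughly like the sketch below. It is an illustration only, not the code in queue_core.py: it assumes a POSIX system (signal 0 as an existence probe), and the `looks_like_task_queue` helper and its marker strings are hypothetical names introduced for the example.

```python
import os

# Hypothetical markers; the real check may match different strings.
TASK_QUEUE_MARKERS = ("task_queue", "tq.py")


def is_process_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        # Signal 0 only performs the existence/permission check; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def looks_like_task_queue(command_line: str) -> bool:
    """Return True if a command line appears to belong to task_queue or tq."""
    return any(marker in command_line for marker in TASK_QUEUE_MARKERS)
```

Under this sketch, a PID that is alive but whose command line does not match would be treated as reused by an unrelated process, so its queue entries are candidates for cleanup.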

Bug Fix: Orphaned Task Cleanup

Problem: When sub-agents were cancelled, their queued tasks remained in the database indefinitely, blocking all subsequent tasks.

Solution:

  1. is_task_queue_process(pid) - Checks whether a PID still belongs to a task_queue process or has been reused by an unrelated process (Chrome, Safari, etc.)
  2. SERVER_INSTANCE_ID / CLI_INSTANCE_ID - Unique ID per process instance to detect stale tasks even if PID is reused
  3. _active_task_ids tracking - Server tracks which tasks are actively being processed
  4. asyncio.CancelledError handling - Properly cleans up when MCP clients disconnect

Other Improvements

  • Add server_id column to queue table (with migration for existing DBs)
  • Fix wait time calculation to use POLL_INTERVAL_WAITING instead of a hardcoded value
  • Add comprehensive tests for orphan cleanup scenarios
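
For readers unfamiliar with in-place SQLite migrations, one common way to add a column to an existing database is sketched below. It is an assumption about the general approach, not the migration shipped in this PR; the `ensure_server_id_column` helper name and the TEXT column type are hypothetical.

```python
import sqlite3


def ensure_server_id_column(conn: sqlite3.Connection) -> bool:
    """Add queue.server_id if it is missing; return True if a migration ran."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(queue)")}
    if "server_id" in columns:
        return False
    conn.execute("ALTER TABLE queue ADD COLUMN server_id TEXT")
    conn.commit()
    return True
```

Rows that existed before the migration get NULL in the new column, so any cleanup logic needs to decide how to treat tasks with no recorded instance ID.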

Test plan

  • All 61 existing tests pass
  • Manual testing: Start 2 sub-agents with 60s tasks → Cancel them → Queue is empty
  • demo/test_pid_reuse.py verifies PID reuse detection works correctly
  • CLI still works correctly with instance tracking

🤖 Generated with Claude Code

himattm and others added 5 commits January 22, 2026 12:10
- Create queue_core.py with shared database, logging, and cleanup logic
- Refactor task_queue.py and tq.py to use shared module
- Fix Ctrl+C leaving orphaned waiting tasks by splitting registration from wait
- Stream output directly to file with bounded deque for memory efficiency
- Add signal handling integration tests for tq CLI

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add is_task_queue_process() to detect if PID is actually running task_queue
  vs being reused by an unrelated process (Chrome, etc.)
- Add SERVER_INSTANCE_ID and _active_task_ids tracking in MCP server to detect
  orphaned tasks from disconnected clients
- Add CLI_INSTANCE_ID tracking in tq.py for same protection
- Handle asyncio.CancelledError to properly clean up when sub-agents are cancelled
- Add server_id column to queue table with migration for existing databases
- Fix wait time calculation to use POLL_INTERVAL_WAITING instead of hardcoded value

This fixes the bug where cancelled sub-agents would leave orphaned tasks in the
queue that blocked all subsequent tasks from running.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move queue_core import before module-level constants in task_queue.py and tq.py
- Remove unused imports and variables in tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Pin pypa/gh-action-pypi-publish to commit SHA (resolves unpinned action alert)
- Add nosec B602 comments explaining intentional shell=True usage:
  - CLI tool executes user-provided commands (like bash -c or make)
  - Shell features (pipes, redirects, globs) required for build commands
  - Input controlled by users who explicitly invoke the CLI/MCP tool

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@himattm himattm merged commit af89ef8 into main Jan 23, 2026
6 of 7 checks passed