-
Notifications
You must be signed in to change notification settings - Fork 208
feat: add temporal query parameters to MCP tools #625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add temporal query parameters to MCP tools #625
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThis PR adds a new Changes
Sequence Diagram(s)sequenceDiagram
participant Client as MCP Client
participant MCPServer as MCP Server<br/>(packages/mcp)
participant WebAPI as Web API<br/>(packages/web)
participant DB as Database<br/>(Prisma)
participant Git as Git Repository<br/>(simple-git)
Client->>MCPServer: search_commits(repoId, query, since, until, author)
MCPServer->>WebAPI: POST /api/commits<br/>(with params)
WebAPI->>DB: Find repository by ID/name
DB-->>WebAPI: Repository metadata + path
WebAPI->>WebAPI: validateDateRange(since, until)
WebAPI->>WebAPI: toGitDate(since, until)<br/>(format for git log)
WebAPI->>Git: git log<br/>(--since, --until, --author, --grep)
Git-->>WebAPI: Commit objects[]<br/>(hash, date, message, refs, body, author)
WebAPI->>WebAPI: Map to response schema
WebAPI-->>MCPServer: searchCommitsResponse
MCPServer-->>Client: Commit results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (10)
packages/web/src/features/search/searchApi.test.ts (1)
4-31: Unused mocks add unnecessary coupling to external modulesThis file only uses
dateUtils, but still mocks@/lib/serviceError,@/actions, and@/lib/auth. That adds extra surface area that can break these tests if those modules change, without providing value here.You can either:
- Remove these mocks from this file, or
- Add actual
searchApiintegration tests that rely on them, so the mocks are justified.packages/web/src/features/search/dateUtils.test.ts (1)
288-292: ClarifytoGitDatebehavior for unrecognized formatsThe test name and comment say “convert unrecognized format to ISO”, but the assertion expects the original string to be preserved:
// For formats git doesn't natively understand, we convert to ISO const result = toGitDate('some random string'); expect(result).toBe('some random string');This is confusing for future readers. Either:
- Update the expectation to actually check for an ISO conversion, if that’s the intended behavior, or
- Rename the test and fix the comment to state that unknown formats are passed through unchanged to Git.
packages/web/src/app/api/(server)/repos/route.ts (1)
5-20: Temporal filters are correctly forwarded; consider normalizing empty query paramsThe new GET handler cleanly threads
activeAfter/activeBeforethrough togetReposand preserves the existing error handling pattern.One minor robustness tweak:
searchParams.get(...) ?? undefinedwill pass through an empty string when the URL contains?activeAfter=or?activeBefore=. If you prefer to treat “present but empty” the same as “unset”, you could normalize with:const activeAfter = searchParams.get("activeAfter") || undefined; const activeBefore = searchParams.get("activeBefore") || undefined;This avoids feeding accidental empty strings into downstream date parsing.
packages/web/src/app/api/(server)/commits/route.ts (1)
1-34: Solid validation flow; tighten type usage / schema reuseThe POST handler’s flow (JSON → zod
safeParseAsync→schemaValidationError/serviceErrorResponse→searchCommits) looks good and matches the rest of the API.Two minor nits:
SearchCommitsRequestis imported but never used. Either remove the import or use it to type the validated payload (e.g.,const response = await searchCommits(parsed.data as SearchCommitsRequest);) so this route stays coupled to the shared request type.- If you already have a shared Zod schema for
searchCommits(e.g., alongside other search schemas), consider reusing it here instead of defining a new one inline to avoid future drift between layers.packages/mcp/src/schemas.ts (1)
195-214: Consider constrainingmaxCountto avoid pathological requestsThe new
searchCommitsRequestSchema/commitSchemapair matches the documented shape and looks consistent.
maxCountis currentlyz.number().optional(), which allows negative values or extremely large limits. Given this is likely passed through togit log(and could be user‑controlled via MCP), you may want to harden it, e.g.:maxCount: z .number() .int() .positive() .max(500) .optional(),(or whatever upper bound matches your performance expectations).
packages/web/src/features/search/dateUtils.ts (1)
6-26: ClarifyparseTemporalDatecontract and clean up unused stateThe JSDoc and implementation for
parseTemporalDatedon’t currently line up:
- Doc says it “Returns an ISO 8601 date string or null if invalid”, but the function:
- Returns
undefinedwhen!dateStr.- Returns the original
dateStrwhen parsing fails (so not necessarily ISO).- Callers like
toDbDateandvalidateDateRangeimplicitly treat “unparseable” asundefined/Invalid Date, whiletoGitDateis fine with git-native strings.This makes the effective contract subtle and easy to misread from the JSDoc.
Consider tightening this up by either:
- Updating the JSDoc and return type to something like “ISO 8601 when parsed, otherwise the original string, or
undefinedwhen input is falsy”, or- Returning
undefinedfor unparseable inputs and lettingtoGitDateown the “fallback to original for git” behavior.Also,
relativePatternsis currently unused and can be removed or wired into the parsing logic to avoid dead code.Also applies to: 35-41, 99-102, 144-156
packages/web/src/features/search/gitApi.test.ts (1)
11-19: Align error-handling tests with productionsewbehaviorIn tests you mock
sewas a simple passthrough:vi.mock('@/actions', () => ({ sew: (fn: () => any) => fn(), }));and later assert that
searchCommitsrejects for “other Error instances” and non-Error exceptions. In production,searchCommitsis wrapped by the realsewfromactions.ts, which catches thrown errors and returns aServiceErrorinstead of letting the promise reject.That means the “throws” behavior in these tests doesn’t match what callers will see at runtime. To avoid surprises, consider either:
- Asserting on a
ServiceErrorshape for these cases (matching realsew), or- Removing the
sewdependency fromsearchCommitsand letting routes/actions wrap it, if you explicitly wantsearchCommitsitself to throw.Either way, having tests reflect the actual public contract of
searchCommitswill make future changes safer.Also applies to: 287-347
packages/web/src/actions.ts (1)
463-541: HandleServiceErrorresults fromsearchCommitsexplicitly in activity filteringIn the activity-based branch of
getRepos:const activityChecks = await Promise.all(repos.map(async (repo) => { try { const commits = await searchCommits({ ... }); if (Array.isArray(commits) && commits.length > 0) { return repo; } } catch (e) { // ... } return null; }));
searchCommitsis already wrapped insew(ingitApi.ts), so for git/log failures it resolves to aServiceErrorrather than throwing. As a result:
- The
catchblock here will rarely, if ever, run for git-related issues.- Cases like “repo not on disk”, “invalid date range”, or git timeouts will show up as a
ServiceErrorvalue, get treated as “no activity”, and be silently filtered out with no logging.If you want
list_reposwithactiveAfter/activeBeforeto surface date-range validation failures instead of just returning an empty list, and to log unexpected git errors, consider branching explicitly on the union:const result = await searchCommits({ repoId: repo.id, since: activeAfter, until: activeBefore, maxCount: 1, }); if (Array.isArray(result)) { return result.length > 0 ? repo : null; } // `result` is a ServiceError here const message = result.message ?? ''; if (!message.includes('does not exist')) { console.error( `Error checking activity for repo ${repo.id} (${repo.name}):`, result, ); } return null;This keeps “repo not yet cloned” behaving as “no activity” while making invalid ranges and other unexpected failures visible.
Separately, this wiring introduces a circular dependency:
actions.tsimportssearchCommitsfrom./features/search/gitApi, andgitApi.tsimportssewfrom@/actions. It works today becausesewis only used at call time, but it’d be cleaner long term to extractsewto a small shared module (or inject it) to avoid subtle initialization issues in future refactors.packages/web/src/features/search/gitApi.ts (1)
1-22:searchCommitsimplementation looks solid; consider small cleanupsOverall, this module cleanly encapsulates git-based commit search:
- Validates the repo path and date range up front.
- Uses
toGitDateto preserve git-native formats when possible.- Configures
simpleGitwith a sensible timeout and concurrency.- Provides clear, user-facing messages for common git failures (invalid repo, ambiguous args, timeout), with a generic fallback.
A couple of minor nits you might consider:
- You check
existsSync(repoPath)both before callingcreateGitClientForPathand inside that helper. One of those checks could be dropped unless you specifically want the second check to catch races and produce a different message.- In the catch block you return
unexpectedError(...)for known cases but throw for unknown ones, relying onsewto convert those into genericServiceErrors. That’s fine, but it means callers always seeServiceError(never a thrown error) in production, so other code (likegetRepos) should treat the result asCommit[] | ServiceErrorrather than depending on exceptions.These are non-blocking; the core behavior and error mapping look good.
Also applies to: 58-150
packages/mcp/src/index.ts (1)
233-347:list_reposclient‑side filtering/pagination looks good; confirmlistReposreturns the full filtered set.The flow of: call
listRepos({ activeAfter, activeBefore }), then apply name filtering, sort, computetotalCount, and paginate on the MCP side cleanly fixes the “take before filter” bug, and the new empty/out‑of‑range page messages (including the temporal‑filtering note) are clear.This does rely on
listReposreturning the complete set of repositories for the given temporal filters (i.e., not already paginated). If the underlying API still pages results,totalCount,totalPages, and the “Try pageNumber between 1 and X” guidance will all be based only on that first backend page.If
listReposis still paginated underneath, consider either:
- Fetching all pages when temporal filters are used, or
- Threading backend pagination parameters and totals through to this tool instead of recomputing them locally.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
packages/backend/src/constants.ts(2 hunks)packages/mcp/CHANGELOG.md(1 hunks)packages/mcp/README.md(4 hunks)packages/mcp/src/client.ts(3 hunks)packages/mcp/src/index.ts(6 hunks)packages/mcp/src/schemas.ts(1 hunks)packages/mcp/src/types.ts(2 hunks)packages/shared/src/constants.ts(1 hunks)packages/web/src/actions.ts(2 hunks)packages/web/src/app/api/(server)/commits/route.ts(1 hunks)packages/web/src/app/api/(server)/repos/route.ts(1 hunks)packages/web/src/features/search/dateUtils.test.ts(1 hunks)packages/web/src/features/search/dateUtils.ts(1 hunks)packages/web/src/features/search/gitApi.test.ts(1 hunks)packages/web/src/features/search/gitApi.ts(1 hunks)packages/web/src/features/search/schemas.ts(1 hunks)packages/web/src/features/search/searchApi.test.ts(1 hunks)packages/web/src/features/search/searchApi.ts(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Test Web
packages/shared/src/constants.ts
[error] 37-37: TypeError: The "path" argument must be of type string. Received undefined
🔇 Additional comments (9)
packages/web/src/features/search/searchApi.test.ts (1)
33-323: Temporal coverage in this suite looks solidThe tests exercise ISO dates, relative phrases, missing parameters, repository filtering, and inverted ranges with fixed system time, which should give good confidence that temporal parameters are wired correctly into query construction and range validation (via
dateUtils).packages/web/src/features/search/dateUtils.test.ts (1)
9-378: Thorough temporal parsing and range validation testsThe suite covers a wide range of ISO/relative formats, boundary conditions (month/year/leap year, midnight/end-of-day), and integration scenarios across
parseTemporalDate,validateDateRange,toDbDate, andtoGitDate. That level of coverage should make regressions in date handling unlikely.packages/backend/src/constants.ts (1)
11-11: Re-exporting cache-dir constants from shared module looks appropriateRe-exporting
REPOS_CACHE_DIRandINDEX_CACHE_DIRfrom@sourcebot/sharedkeeps the backend public surface unchanged while centralizing the actual definitions in one place, which should reduce divergence between services.packages/mcp/src/types.ts (1)
13-15: Commit-search types are correctly wired into the public APIThe new
SearchCommitsRequest,SearchCommitsResponse, andCommitexports cleanly mirror the schemas and extend the MCP surface without breaking existing callers. Just keep this file in sync with the corresponding web types per the header note.Also applies to: 36-38
packages/mcp/README.md (1)
169-205: Temporal docs align well with the new APIsThe additions for
search_code(gitRevision,since,until,includeDeletedFiles),list_repos(activeAfter/activeBefore), and the newsearch_commitssection clearly describe index‑time vs commit‑time semantics and supported date formats. These explanations match the corresponding schemas and server logic and should make the new temporal behavior much easier to understand for MCP clients.Also applies to: 221-253, 243-253
packages/web/src/actions.ts (1)
315-321: API key creation guard for non-owner users looks correctThe new check against
EXPERIMENT_DISABLE_API_KEY_CREATION_FOR_NON_ADMIN_USERSwith an earlyFORBIDDENServiceErrorand clear log message is consistent with the rest of the file’s patterns and should be safe to ship.packages/mcp/CHANGELOG.md (1)
10-29: Changelog entries clearly capture the new temporal MCP capabilitiesThe “Unreleased” section accurately reflects the added tools/parameters, temporal format support, error handling, and the list_repos pagination fix, and clearly distinguishes index time vs commit time. This is consistent with the PR’s behavior and keeps consumers informed.
packages/mcp/src/client.ts (1)
2-4: MCP client wiring for temporal repos andsearchCommitsis consistent and safeThe updated
listRepossignature (optionalparamswithactiveAfter/activeBefore) and the newsearchCommitsclient:
- Follow the same fetch +
isServiceError+ schema-parse pattern as the existing methods.- Remain backwards compatible for existing
listRepos()callers.- Correctly propagate org domain and optional API key headers.
No issues from the client side.
Also applies to: 24-47, 67-83
packages/mcp/src/index.ts (1)
52-67: Temporal params forsearch_codeare plumbed correctly; just confirm index‑time semantics.The new
gitRevision,since,until, andincludeDeletedFilesfields are optional, preserve existing query composition, and are passed through tosearch()without breaking current callers. Please double‑check thatsearch({ since, until })actually applies the index‑time filters described here; if it instead uses commit time, consider aligning the implementation or updating the descriptions so it doesn’t conflict conceptually withsearch_commitsandlist_repos.Also applies to: 80-83, 103-106
36518e4 to
22bf69e
Compare
|
@coderabbitai please review again |
|
@waynesun09, I'll review the changes again for you. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (12)
packages/mcp/README.md (1)
220-252: Verify that all documented relative date formats are actually supportedThe Date Format Examples list
"last month"in addition to"30 days ago","1 week ago","last week","yesterday". Please confirmparseTemporalDate(and any schema refinements) really accept"last month"; if not, either add support or trim the docs to only list supported forms.packages/shared/src/index.server.ts (1)
29-32: Server cache directory constants re-export looks goodRe-exporting
REPOS_CACHE_DIRandINDEX_CACHE_DIRhere keeps server-only constants centralized. Just ensure these exports are only imported from server-side code paths to avoid accidental bundling into client bundles.CHANGELOG.md (1)
12-14: Temporal filtering changelog entry is accurate; consider optionally mentioning commit searchThe new “temporal filtering” bullet correctly captures branch/revision and index date filtering plus relative date support. If you want full traceability from the root changelog, you might also briefly call out the new commit-time search surface (
searchCommits), but that’s optional since MCP-level changes are already documented separately.packages/shared/src/constants.server.ts (1)
1-7: Env-aware cache directory constants are reasonableUsing
env.DATA_CACHE_DIRwith/tmp/sourcebot/...fallbacks is a pragmatic approach and the guard avoids load-time failures when the env var is unset. If you ever want to DRY this up, you could factor out a small helper for the base cache dir, but it’s not necessary.packages/web/src/features/search/searchApi.ts (3)
27-62: Search/streamSearch integration of permission, temporal, and branch filters looks coherentUsing
getAccessibleRepoNamesForUserplusgetTemporallyFilteredRepos, then intersecting viacombineRepoFilters, gives a clear repoSearchScope that respects both permissions and index-time constraints. ApplyingapplyBranchFilteron theQueryIRwhengitRevisionis present is a sensible way to express branch selection in a Zoekt-friendly IR.From a behavior standpoint, this matches the documented semantics:
- temporal filters operate at the repo (indexedAt) level, not commit time, and
- branch/revision filtering is orthogonal via
gitRevision.No correctness issues spotted here.
105-194: Temporal repo filter helpers are correct; small refinements are optional
getTemporallyFilteredReposandcombineRepoFiltersdo the right thing:
- They only apply temporal filtering when
sinceoruntilare provided.- They use
toDbDateto normalize dates and gracefully skip bounds that resolve toundefined.- They intersect permission-based and temporal repo sets when both are present, returning
undefinedwhen no filters apply (preserving prior unfiltered behavior).Two optional refinements you might consider (not blockers):
- If schema/dateUtils don’t already enforce it, ensure
since/untilare validated (includingsince <= until) before reaching this helper, so we never silently treat an invalid date string as “no temporal filtering”.- If you care about minimizing DB round trips, a future refactor could merge permission and temporal conditions into a single
findManycall instead of two separate queries.Overall, the helper logic is sound.
196-214: Branch filter wrapper aligns with a composable QueryIR design
applyBranchFilterwrapping the existingQueryIRinside anandnode with abranchconstraint is a clean, composable approach. It should play nicely even if the incoming IR already contains other filters.If there are any special cases for
gitRevisionvalues (e.g., a wildcard such as"*"meaning “all branches”), double-check that callers either normalize those before reaching this function or handle them separately, since this helper currently always applies an exact branch match.packages/web/src/features/search/gitApi.test.ts (1)
98-172: Consider centralizing and restoring dateUtils spies for test robustnessYou call
vi.spyOn(dateUtils, 'validateDateRange')/toGitDatein multiple tests and suites while only usingvi.clearAllMocks()in the top-levelbeforeEach. To avoid any subtle interactions or “already spied” issues as Vitest/Jest evolve, it’d be a bit more robust to either:
- Create the spies once in a
beforeAll/outerbeforeEachand just adjustmockReturnValue/mockImplementationper test, or- Add
vi.restoreAllMocks()in anafterEachso eachspyOnstarts from the real implementation.Not urgent, but it can make future test changes safer.
Also applies to: 239-271, 317-383
packages/mcp/src/schemas.ts (1)
200-219: Avoid future drift by sharing commit search shapes between web and MCP
searchCommitsRequestSchema,commitSchema, andsearchCommitsResponseSchemamatch the web types and backendCommitinterface today, but they’re now defined in three places. When convenient, consider moving these schemas/types into a small shared package (or exporting from a single canonical module) and importing them here and inweb/gitApi.tsto reduce the risk of subtle mismatches over time.packages/web/src/features/search/gitApi.ts (1)
61-78: Unify commit search types with shared schemas to reduce duplication
SearchCommitsRequestandCommithere duplicate the shapes already expressed inweb/src/features/search/types.tsandmcp/src/schemas.ts. To keep things consistent and automatically pick up schema changes (likemaxCountbounds), consider:
- Importing
SearchCommitsRequestandCommitfrom the shared Zod-based types, or- Exporting a single canonical TS type for these and reusing it in all three locations.
This is mainly a maintainability tweak; behavior is fine as-is.
Also applies to: 200-219
packages/mcp/src/index.ts (2)
202-202: Align schema default with documentation.The description states "default: 50" but the schema doesn't enforce this default with
.default(50). IfmaxCountis omitted, it will beundefined, requiring the backend to handle the default.Consider adding
.default(50)to the schema for consistency:- maxCount: z.number().describe(`Maximum number of commits to return (default: 50).`).optional(), + maxCount: z.number().int().positive().default(50).describe(`Maximum number of commits to return (default: 50).`),This ensures the schema validates as an integer, is positive, and matches the documented default behavior.
298-303: Consider including indexedAt in output when temporal filtering is applied.When users filter repositories by
activeAfteroractiveBefore, they might want to see when each repository was last indexed to understand the filtering results. Currently, the output only includesidandurl.Consider conditionally including the
lastIndexedorindexedAtfield when temporal parameters are provided:const content: TextContent[] = paginated.map(repo => { let text = `id: ${repo.repoName}\nurl: ${repo.webUrl}`; if ((activeAfter || activeBefore) && repo.indexedAt) { text += `\nlast_indexed: ${repo.indexedAt}`; } return { type: "text", text, } });This would help users verify that the temporal filtering is working as expected.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
CHANGELOG.md(1 hunks)packages/mcp/CHANGELOG.md(1 hunks)packages/mcp/README.md(4 hunks)packages/mcp/src/client.ts(3 hunks)packages/mcp/src/index.ts(5 hunks)packages/mcp/src/schemas.ts(4 hunks)packages/mcp/src/types.ts(2 hunks)packages/shared/src/constants.server.ts(1 hunks)packages/shared/src/index.server.ts(1 hunks)packages/web/src/actions.ts(1 hunks)packages/web/src/app/api/(server)/commits/route.ts(1 hunks)packages/web/src/app/api/(server)/repos/route.ts(1 hunks)packages/web/src/features/search/dateUtils.test.ts(1 hunks)packages/web/src/features/search/dateUtils.ts(1 hunks)packages/web/src/features/search/gitApi.test.ts(1 hunks)packages/web/src/features/search/gitApi.ts(1 hunks)packages/web/src/features/search/searchApi.ts(4 hunks)packages/web/src/features/search/types.ts(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- packages/mcp/src/types.ts
- packages/mcp/src/client.ts
- packages/web/src/features/search/dateUtils.ts
- packages/web/src/actions.ts
🧰 Additional context used
🧬 Code graph analysis (7)
packages/web/src/app/api/(server)/repos/route.ts (3)
packages/web/src/lib/types.ts (1)
GetReposResponse(32-32)packages/web/src/actions.ts (1)
getRepos(467-511)packages/web/src/app/api/(client)/client.ts (1)
getRepos(53-62)
packages/shared/src/constants.server.ts (1)
packages/shared/src/index.server.ts (3)
REPOS_CACHE_DIR(30-30)env(34-34)INDEX_CACHE_DIR(31-31)
packages/web/src/features/search/gitApi.test.ts (1)
packages/web/src/features/search/gitApi.ts (1)
searchCommits(98-197)
packages/web/src/features/search/searchApi.ts (3)
packages/web/src/withAuthV2.ts (1)
withOptionalAuthV2(43-67)packages/web/src/features/search/parser.ts (1)
parseQuerySyntaxIntoIR(55-108)packages/web/src/features/search/dateUtils.ts (1)
toDbDate(143-155)
packages/web/src/features/search/dateUtils.test.ts (1)
packages/web/src/features/search/dateUtils.ts (4)
parseTemporalDate(21-101)validateDateRange(110-134)toDbDate(143-155)toGitDate(164-186)
packages/mcp/src/index.ts (3)
packages/mcp/src/client.ts (2)
searchCommits(67-83)listRepos(24-47)packages/web/src/features/search/gitApi.ts (1)
searchCommits(98-197)packages/web/src/lib/utils.ts (1)
isServiceError(430-436)
packages/web/src/features/search/gitApi.ts (7)
packages/mcp/src/types.ts (2)
ServiceError(33-33)SearchCommitsRequest(35-35)packages/web/src/lib/serviceError.ts (1)
unexpectedError(75-81)packages/web/src/features/search/types.ts (1)
SearchCommitsRequest(178-178)packages/web/src/actions.ts (1)
sew(44-61)packages/web/src/withAuthV2.ts (1)
withOptionalAuthV2(43-67)packages/shared/src/constants.server.ts (1)
REPOS_CACHE_DIR(6-6)packages/web/src/features/search/dateUtils.ts (2)
validateDateRange(110-134)toGitDate(164-186)
🔇 Additional comments (15)
packages/mcp/README.md (1)
169-205: Temporal filtering docs forsearch_code/list_reposlook consistentThe distinction between index time (
search_code,list_repos) and commit time (search_commits) is clearly documented and matches the intended behavior. No issues from a tooling/UX perspective.packages/mcp/CHANGELOG.md (1)
10-25: Changelog entry is clear and matches the implementation scopeThe Unreleased “Added” bullets accurately summarize the new temporal features, commit search tool, validation, timeouts, and tests. This will be very helpful for downstream consumers of
@sourcebot/mcp.packages/web/src/app/api/(server)/repos/route.ts (1)
5-16: GET handler wiring foractiveAfter/activeBeforelooks correctReading temporal filters from
request.nextUrl.searchParamsand passing them through togetRepospreserves existing behavior when absent and enables index-time filtering when present. AssuminggetRepos/toDbDatehandle malformed values appropriately (either via validation or safe fallback), this route looks good.packages/web/src/app/api/(server)/commits/route.ts (1)
1-24: Commit search POST route is straightforward and aligned with the schemaThe handler cleanly validates via
searchCommitsRequestSchema.safeParseAsync, delegates tosearchCommits, and consistently converts both validation and service errors intoserviceErrorResponse. This matches the documentedsearch_commitsbehavior.packages/web/src/features/search/types.ts (2)
86-95: Index / branch-filter options look consistent and well-documentedThe added
gitRevision,since,until, andisBranchFilteringEnabledfields are coherent with the PR’s temporal story and clearly documented as index-time filters here. No changes needed from this file.Also applies to: 104-110
170-178: Commit search request schema is well-shapedThe
searchCommitsRequestSchema(including themaxCountbounds) matches the backend/mcp usage and gives a solid contract for the newsearch_commitstool. Looks good as-is.packages/web/src/features/search/gitApi.test.ts (1)
1-68: Thorough coverage of commit search behaviorThe mocking setup plus the success/error/integration and repo-resolution cases exercise essentially every branch in
searchCommits, including disk checks, date handling, git options, and error mapping. This gives strong confidence in the new API surface.Also applies to: 273-574
packages/web/src/features/search/dateUtils.test.ts (1)
1-379: Deterministic and comprehensive date utility testsThe combination of fake timers with broad coverage of ISO, relative phrases, edge cases, and integration scenarios gives very solid protection around
parseTemporalDate/validateDateRange/toDbDate/toGitDate. No changes needed here.packages/mcp/src/schemas.ts (1)
24-33: Search options / response schema stay in sync with web typesThe added temporal fields and
isBranchFilteringEnabledflag correctly mirrorpackages/web/src/features/search/types.ts, honoring the “keep in sync” note and keeping MCP contracts aligned with the web API.Also applies to: 120-149
packages/web/src/features/search/gitApi.ts (2)
12-21: Repo resolution and git client setup look correct
createGitClientForPathandresolveRepoIdwiresimple-gitand Prisma together cleanly: numeric IDs short-circuit, string names are scoped byorgId, and the not-found path returns a clearunexpectedErrorwith guidance to uselist_repos. No functional issues here.Also applies to: 32-59
80-197: Commit search flow and error handling are solidThe
searchCommitsimplementation cleanly composessew+withOptionalAuthV2, validates repo existence, usesvalidateDateRange/toGitDatebefore constructinggit.logoptions, and maps common git failures (not a git repository,ambiguous argument,timeout) to actionableServiceErrors while letting unexpected issues bubble tosew. Together with the tests, this looks robust.packages/mcp/src/index.ts (4)
52-63: LGTM! Clear distinction between index time and commit time filtering.The addition of
gitRevision,since, anduntilparameters is well-documented. The explicit note "(NOT by commit time)" in the descriptions ofsinceanduntilhelps prevent confusion with thesearch_commitstool.
76-78: LGTM! Parameters correctly threaded through to search function.The new temporal parameters are properly destructured from the input and passed to the
searchfunction, maintaining consistency with the existing parameter handling pattern.Also applies to: 94-96
204-212: LGTM! Clean parameter pass-through to searchCommits.The handler correctly passes all parameters directly to the
searchCommitsfunction, with appropriate error handling in the following lines.
269-269: LGTM! Temporal parameters correctly passed to backend.The
activeAfterandactiveBeforeparameters are properly passed to thelistReposclient function for server-side filtering by index time.
22bf69e to
552a4e9
Compare
|
@coderabbitai please review again |
|
@waynesun09, I'll review the changes for you. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
packages/mcp/README.md (1)
169-205: Temporal docs forsearch_code/list_reposare clear; consider clarifying semanticsThe distinction that these filters operate on index time (not commit time) and accept ISO or relative formats is well documented. If you want to avoid ambiguity, consider briefly stating whether bounds are inclusive and which timezone the server uses to interpret relative phrases (e.g., “30 days ago” evaluated in server-local time vs UTC).
packages/web/src/features/search/dateUtils.ts (3)
21-75:parseTemporalDatecovers key formats; minor robustness improvements possibleThe ISO-first parsing and explicit handling of “yesterday” /
N <unit>s agolook good and match the documented relative formats. To tighten things up:
- Use an explicit radix in
parseInt(matchRelative[1], 10)to avoid historicalparseIntfootguns.- Consider documenting that “last week/month/year” and
N <unit>s agoare evaluated relative to the current timestamp (not normalized to start-of-week/month/year), so behavior is clear to callers.
110-155: Date range validation and DB conversion behave reasonably; consider surfacing invalid inputs
validateDateRangeandtoDbDatecorrectly useparseTemporalDateand gracefully handle unparsable values by returningnull/undefined. One tradeoff is that a user-supplied but invalidsince/untilsilently degrades to “no bound” instead of producing a clear error. If you want stricter UX for index-time filters (e.g.,search_code,list_repos), consider havingvalidateDateRangereturn an error string when either side is unparsable, and wiring that into the HTTP/MCP response for those call sites (similar tosearch_commits).
164-185:toGitDatepreserves git-native formats; keep patterns in sync with parserThe whitelist in
gitNativeFormatscorrectly preserves relative and ISO forms that Git already understands and falls back toparseTemporalDateotherwise. To avoid future drift, consider centralizing these regexes (or a shared enum of supported relative tokens) withparseTemporalDateso adding a new relative form only requires updating one place.packages/web/src/features/search/gitApi.test.ts (2)
5-49: Mock setup cleanly isolatessearchCommits; watch for future module-surface changesThe targeted mocks for
simple-git,fs,@sourcebot/shared,@/lib/serviceError,@/actions,@/withAuthV2, and@/lib/utilsgive you good control over dependencies and error paths. If@sourcebot/shared’s surface grows andgitApistarts importing additional exports from it, this full-module mock will need updating—at that point, consider usingvi.importActualand overriding onlyREPOS_CACHE_DIRto reduce maintenance.
71-96: Repository-path assertions are OS-specific; consider usingpath.joinin expectationsThe tests asserting
existsSyncis called with'/mock/cache/dir/456'(and similar) assume POSIX-style separators, butgitApiusespath.join, which will produce\-separated paths on Windows. To keep tests OS-agnostic, you could either:
- Import
pathand assert againstpath.join('/mock/cache/dir', '456'), or- Use
expect.stringContaining('/mock/cache/dir')+ repo ID, if exact separators aren’t important.Also applies to: 456-472
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
CHANGELOG.md(1 hunks)packages/mcp/CHANGELOG.md(1 hunks)packages/mcp/README.md(4 hunks)packages/mcp/src/client.ts(3 hunks)packages/mcp/src/index.ts(6 hunks)packages/mcp/src/schemas.ts(4 hunks)packages/mcp/src/types.ts(2 hunks)packages/shared/src/constants.server.ts(1 hunks)packages/shared/src/index.server.ts(1 hunks)packages/web/src/actions.ts(1 hunks)packages/web/src/app/api/(server)/commits/route.ts(1 hunks)packages/web/src/app/api/(server)/repos/route.ts(1 hunks)packages/web/src/features/search/dateUtils.test.ts(1 hunks)packages/web/src/features/search/dateUtils.ts(1 hunks)packages/web/src/features/search/gitApi.test.ts(1 hunks)packages/web/src/features/search/gitApi.ts(1 hunks)packages/web/src/features/search/searchApi.ts(4 hunks)packages/web/src/features/search/types.ts(3 hunks)
✅ Files skipped from review due to trivial changes (1)
- packages/web/src/features/search/searchApi.ts
🚧 Files skipped from review as they are similar to previous changes (7)
- packages/web/src/actions.ts
- packages/shared/src/constants.server.ts
- packages/web/src/features/search/dateUtils.test.ts
- packages/mcp/src/types.ts
- packages/web/src/app/api/(server)/repos/route.ts
- packages/web/src/app/api/(server)/commits/route.ts
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (3)
packages/mcp/src/schemas.ts (1)
packages/web/src/features/search/types.ts (1)
searchCommitsRequestSchema(170-177)
packages/mcp/src/client.ts (5)
packages/mcp/src/types.ts (4)
ListRepositoriesResponse(26-26)ServiceError(33-33)SearchCommitsRequest(35-35)SearchCommitsResponse(36-36)packages/mcp/src/env.ts (1)
env(8-25)packages/web/src/features/search/types.ts (1)
SearchCommitsRequest(178-178)packages/mcp/src/utils.ts (1)
isServiceError(4-10)packages/mcp/src/schemas.ts (1)
searchCommitsResponseSchema(219-219)
packages/web/src/features/search/gitApi.test.ts (1)
packages/web/src/features/search/gitApi.ts (1)
searchCommits(98-197)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (24)
packages/shared/src/index.server.ts (1)
29-32: Re-exporting server cache dir constants via shared index looks goodThe new
REPOS_CACHE_DIR/INDEX_CACHE_DIRre-exports are consistent with the rest of this module and give consumers a single canonical import path; no issues spotted.packages/mcp/README.md (1)
220-252:search_commitsand date-format documentation align well with the implementationThe
search_commitsdescription (commit-time filtering, repoId as numeric ID or full repo name, cloning requirement) and the Date Format Examples block match the behavior in the git API and date utilities, and clearly differentiate index-time vs commit-time semantics.packages/mcp/CHANGELOG.md (1)
10-24: Changelog entries comprehensively capture the new temporal toolingThe Unreleased “Added” section clearly lists temporal parameters,
search_commits, relative date support, git timeouts, and the associated tests/docs, which should make this feature set easy to track in release notes.packages/web/src/features/search/gitApi.test.ts (1)
98-383: Test coverage forsearchCommitsoptions, errors, and repo resolution is excellentThe suite thoroughly exercises date-range validation,
toGitDatewiring, git log options (since/until/author/query/maxCount), success paths (non-empty and empty), detailed error mapping (not-a-git-repo, ambiguous argument, timeout, generic errors), disk presence, git client config, and numeric vs string repo identifiers. This should give high confidence in the new temporal commit-search behavior.Also applies to: 418-574
packages/web/src/features/search/types.ts (3)
92-94: LGTM! Clear documentation of indexed-time vs. commit-time filtering.The comments effectively distinguish indexed date filtering from commit-time filtering, which will help users choose the right tool for their needs.
109-109: LGTM!The optional boolean field is appropriately typed and documented.
170-177: LGTM! Schema validation is robust.The maxCount validation properly constrains the value (int, positive, max 500), and the dual repoId support (number | string) aligns with the MCP schemas.
packages/mcp/src/client.ts (3)
2-3: LGTM!The imports are correctly updated to support the new searchCommits functionality.
24-31: LGTM! Clean parameter handling.The optional params object is properly typed and query parameters are safely constructed using URLSearchParams.
67-83: LGTM! Consistent implementation pattern.The searchCommits function follows the same structure and error handling as the other client functions, ensuring consistency across the API surface.
packages/mcp/src/schemas.ts (4)
1-1: LGTM!The updated sync reference correctly points to the types.ts file.
30-32: LGTM! Consistent with web layer schemas.The searchOptionsSchema extensions match the definitions in packages/web/src/features/search/types.ts, maintaining consistency across the codebase.
148-148: LGTM!The isBranchFilteringEnabled field is consistent with the web layer schema.
200-219: LGTM! Well-structured commit schemas.The searchCommitsRequestSchema is consistent with the web layer, and the commitSchema appropriately captures all essential commit metadata fields.
packages/mcp/src/index.ts (5)
8-8: LGTM!The searchCommits import is correctly added to support the new tool.
52-97: LGTM! Clear parameter documentation and proper forwarding.The new temporal parameters are well-documented, correctly emphasizing that since/until filter by indexed date (not commit time), and are properly passed through to the underlying search function.
181-225: LGTM! Comprehensive implementation addressing previous feedback.The search_commits tool properly supports dual repoId types (number | string) as discussed in previous reviews, clearly distinguishes commit-time filtering from index-time filtering, and includes helpful guidance about repository cloning requirements.
227-236: LGTM! Excellent clarification of temporal filtering semantics.The documentation clearly distinguishes indexed-time filtering (indexedAt) from commit-time filtering and directs users to the appropriate tool for each use case.
237-329: LGTM! Thoughtful conditional metadata inclusion.The temporal filtering parameters are properly integrated, and the lastIndexed field is only shown when temporal filters are active, keeping the output concise for non-temporal queries.
packages/web/src/features/search/gitApi.ts (5)
12-21: LGTM! Appropriate git client configuration.The 30-second timeout and concurrent process limit (6) provide good resource management and prevent hanging operations.
32-59: LGTM! Secure repository resolution with org scoping.The function properly handles dual identifier types and importantly scopes the lookup by orgId, ensuring users can only access repositories in their organization.
61-78: LGTM! Well-defined interfaces.The interfaces accurately represent the commit search contract and align with the schemas defined elsewhere in the codebase.
98-134: LGTM! Comprehensive validation and secure path handling.The function properly implements auth guards (withOptionalAuthV2), validates inputs, and constructs repository paths securely. The error messages provide helpful guidance about repository cloning status.
135-197: LGTM! Robust error handling with actionable messages.The git log options are correctly constructed, and the error handling covers common failure modes (invalid repo, ambiguous dates, timeout) with specific, helpful error messages. The type assertion at line 160 is reasonable given simple-git's output structure.
552a4e9 to
310e74a
Compare
|
Hey @waynesun09 sorry for the delay - I will get to this tomorrow |
310e74a to
4304edd
Compare
|
@brendan-kellam no worries, I've rebased to latest main branch |
Add temporal filtering capabilities for searches by git branch/revision and repository index dates (since/until). Integrates with the refactored QueryIR-based search architecture. - Add gitRevision, since, until parameters to SearchOptions - Implement temporal repo filtering by indexedAt field - Add branch filtering via QueryIR wrapper - Add search_commits MCP tool for commit-based searches - Update list_repos with activeAfter/activeBefore filtering - Add 88 new tests (all passing) Signed-off-by: Wayne Sun <gsun@redhat.com>
4304edd to
ba4c8a5
Compare
|
@brendan-kellam any free time to check, thanks? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/web/src/actions.ts`:
- Around line 463-495: In getRepos, validate each individual
activeAfter/activeBefore by calling toDbDate and checking its return before
building temporalWhere: if toDbDate(activeAfter) or toDbDate(activeBefore)
returns undefined, throw a 400-level error with a clear message (e.g., "invalid
activeAfter date") instead of letting the query silently ignore the filter; keep
the existing validateDateRange usage for the pairwise check and perform these
per-date validations immediately after importing toDbDate/validateDateRange and
before assigning temporalWhere.indexedAt.
🧹 Nitpick comments (1)
packages/web/src/features/search/searchApi.ts (1)
144-150: Consider static import forvalidateDateRange.The
validateDateRangefunction is dynamically imported whiletoDbDatefrom the same module is statically imported at the top of the file. For consistency and slight performance improvement (avoiding async import overhead), consider addingvalidateDateRangeto the static import on line 10.♻️ Suggested simplification
-import { toDbDate } from './dateUtils'; +import { toDbDate, validateDateRange } from './dateUtils';Then in
getTemporallyFilteredRepos:// Validate date range if both dates are provided if (since && until) { - const { validateDateRange } = await import('./dateUtils'); const dateRangeError = validateDateRange(since, until);
brendan-kellam
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@waynesun09 my apologies for taking so long to get to this review. I think the addition of the /api/commits api and the search_commits tool makes a lot of sense, and is valuable. I do however think that the temporal filtering for code search and listing repos being keyed on the indexedAt date is less valuable. IMO the date we want to be looking at is the pushedAt date, which would track the date at which a repository last had a commit made to it, across all branches.
In a effort to get this PR in asap, I've addressed the feedback I've left in this review and removed the temporal filtering for code search and listing repos. In a follow-up PR, I'll add support for the pushedAt date and add back the filtering for code search and listing repos. Thank you for getting this effort started
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
packages/shared/src/utils.ts (1)
54-60: Same unreachable code pattern as inloadConfig.After the retry loop,
lastErrorwill always be set if we reach this point, making line 60 unreachable.Proposed fix
// If we've exhausted all retries, throw the last ENOENT error - if (lastError) { - throw lastError; - } - - throw new Error('Failed to load file after all retry attempts'); + throw lastError!; }packages/mcp/src/index.ts (2)
52-80: EscapegitRevisionbefore injecting into the query.Unescaped revisions can break the zoekt query if they contain regex/meta characters (e.g.,
~,^,+). Use the same escaping approach as repo IDs.🛠️ Suggested fix
- if (gitRevision) { - query += ` ( rev:${gitRevision} )`; - } + if (gitRevision) { + query += ` ( rev:${escapeStringRegexp(gitRevision)} )`; + }
208-216:list_reposMCP tool still lacks temporal filters.The PR objectives mention
activeAfter/activeBefore, but the MCP tool schema still only acceptsquery,pageNumber, andlimit. Without exposing these parameters (and forwarding them inlistRepos), MCP clients can’t use the new temporal filtering.
🤖 Fix all issues with AI agents
In `@packages/mcp/src/client.ts`:
- Around line 58-67: searchCommits currently sends a hard-coded header
'X-Org-Domain': '~' which is inconsistent with other endpoints and can mis-scope
multi-org requests; remove that header from the headers object in the export
const searchCommits function (or replace it with a configurable value like
env.SOURCEBOT_ORG_DOMAIN if your backend requires it), ensuring you still
preserve the existing conditional inclusion of env.SOURCEBOT_API_KEY and
Content-Type header when building the fetch request.
In `@packages/shared/src/env.server.ts`:
- Around line 97-104: The final unconditional throw new Error('Failed to load
config after all retry attempts') is unreachable because lastError will always
be set and thrown above; remove the redundant throw or replace the two-step
logic with a single clear outcome (e.g., if (lastError) throw lastError; else
throw new Error('Failed to load config after all retry attempts')) so only one
reachable throw remains—update the retry-exhaustion block around lastError to
eliminate the unreachable throw.
In `@packages/shared/src/utils.ts`:
- Around line 97-103: Wrap the construction of new URL(repo.cloneUrl) in
defensive handling: validate or try/catch around new URL in the function in
packages/shared/src/utils.ts so a malformed repo.cloneUrl does not throw; when
repo.external_codeHostType === 'genericGitHost' and the URL parse fails, treat
it as non-file (or return a safe fallback) and ensure the function returns a
consistent shape (e.g., not throwing) instead of allowing a TypeError from new
URL to propagate. Use the cloneUrl variable and the existing conditional
(repo.external_codeHostType === 'genericGitHost' && cloneUrl.protocol ===
'file:') as the location to add this handling.
🧹 Nitpick comments (2)
packages/shared/src/env.server.ts (1)
64-70: Consider adding a timeout for remote config fetches.The
fetchcall for remote config files has no timeout, which could cause the application to hang indefinitely if the remote server is unresponsive.Proposed fix using AbortSignal.timeout
if (isRemotePath(configPath)) { - const response = await fetch(configPath); + const response = await fetch(configPath, { + signal: AbortSignal.timeout(30000), // 30 second timeout + }); if (!response.ok) { throw new Error(`Failed to fetch config file ${configPath}: ${response.statusText}`); } return response.text();packages/shared/src/utils.ts (1)
91-93: Update the TODO comment — this IS the shared package.The comment states the function is "duplicated from the backend's utils.ts" and suggests moving it to a shared package, but this file is already in the shared package. The comment appears to be outdated from when the code was copied over.
Proposed fix
-// `@todo`: this is duplicated from the `getRepoPath` function in the -// backend's `utils.ts` file. Eventually we should move this to a shared -// package. +// Resolves the filesystem path for a repository. +// For local file:// repositories, returns the pathname directly (read-only). +// For remote repositories, returns the path under DATA_CACHE_DIR/repos/<id>. export const getRepoPath = (repo: Repo): { path: string, isReadOnly: boolean } => {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/mcp/CHANGELOG.md`:
- Around line 10-12: Fix the minor typo in the CHANGELOG entry under the "Added"
section where the `search_commits` tool description reads "a repos commit
history" — change it to "a repo's commit history" so the possessive is correct
and the entry references the `search_commits` tool and its description
precisely.
♻️ Duplicate comments (1)
packages/shared/src/utils.ts (1)
90-107: Unhandled exception risk fromnew URL(repo.cloneUrl)remains unresolved.This issue was flagged in a previous review.
new URL(repo.cloneUrl)at line 93 will throw aTypeErrorif the clone URL is malformed or uses SSH/SCP-style format (e.g.,git@github.com:org/repo.git). Based on learnings, SSH/SCP-style Git URLs are known to cause this issue and require special handling.The previously proposed fix wrapping this in a try/catch should be applied to prevent runtime failures.
🧹 Nitpick comments (1)
packages/mcp/src/schemas.ts (1)
206-214: Consider using camelCase for field naming consistency.The response schema uses
author_nameandauthor_email(snake_case), while the rest of the codebase predominantly uses camelCase (e.g.,byteOffset,lineNumber,contentStart,matchRanges). This may be intentional if it mirrors the git log output format directly, but it creates an inconsistency in the API surface.♻️ Optional: Normalize to camelCase
export const searchCommitsResponseSchema = z.array(z.object({ hash: z.string(), date: z.string(), message: z.string(), refs: z.string(), body: z.string(), - author_name: z.string(), - author_email: z.string(), + authorName: z.string(), + authorEmail: z.string(), }));If the snake_case naming is intentional to match the git log format or external contract, please disregard this suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
search_commitstool to search a repos commit history.gitRevisionparameter to thesearch_codetool to allow for searching on different branches.Fixes #511