
Conversation

@naaa760
Contributor

@naaa760 naaa760 commented Dec 15, 2025

What’s included

  • Repository analysis agent inspects repo structure, recent PRs, and CONTRIBUTING to propose diff-aware Watchflow rules (YAML + PR plan).
  • GitHub client gains helpers to read repo metadata, create branches, commit files, and open PRs via installation or user tokens.
  • FastAPI endpoints:
    • POST /api/v1/rules/recommend → analyze repo, return rules YAML + PR plan
    • POST /api/v1/rules/recommend/proceed-with-pr → create branch, write .watchflow/rules.yaml, open PR
  • Docs updated with “Repository Analysis → One-Click PR.”
  • Tests for URL normalization and proceed-with-PR happy path + auth guard.

How to test

  • Unit: pytest tests/unit/agents/test_repository_analysis_models.py tests/unit/api/test_proceed_with_pr.py -q
  • Manual: call /api/v1/rules/recommend with repository_url or repository_full_name, then call /api/v1/rules/recommend/proceed-with-pr with the returned rules_yaml plus installation_id or user_token.
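
A rough sketch of that manual flow using httpx. The base URL, repository name, and installation ID below are placeholders, and the assumption that the recommend response exposes the YAML under a top-level rules_yaml key follows the description above; the real response shape may differ:

import httpx

BASE_URL = "http://localhost:8000"  # placeholder: wherever the Watchflow API is running

# Step 1: ask the analysis agent for rule recommendations.
recommend = httpx.post(
    f"{BASE_URL}/api/v1/rules/recommend",
    json={"repository_url": "https://github.com/acme/example-repo"},  # or "repository_full_name"
    timeout=120,
)
recommend.raise_for_status()
rules_yaml = recommend.json()["rules_yaml"]  # assumed top-level key, per the description above

# Step 2: open the one-click PR with the returned YAML.
proceed = httpx.post(
    f"{BASE_URL}/api/v1/rules/recommend/proceed-with-pr",
    json={
        "repository_full_name": "acme/example-repo",
        "rules_yaml": rules_yaml,
        "installation_id": 1234567,  # or "user_token": "<token>"
    },
    timeout=120,
)
proceed.raise_for_status()
print(proceed.json())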

@naaa760 naaa760 requested a review from dkargatzis as a code owner December 15, 2025 07:32
@gemini-code-assist

Summary of Changes

Hello @naaa760, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a powerful new feature that automates the process of generating and deploying Watchflow rules. By analyzing a given GitHub repository's structure and history, the system can now recommend relevant rules and, with a single action, create a pull request to integrate these rules directly into the repository. This significantly reduces the manual effort required to set up and maintain code quality and process enforcement, making it easier for teams to adopt and benefit from Watchflow.

Highlights

  • Automated Watchflow Rule Generation: Introduced a new capability for automated repository analysis that inspects repository structure, recent pull requests, and contributing guidelines to propose diff-aware Watchflow rules.
  • One-Click Pull Request Creation: Added functionality to automatically create a new branch, commit the recommended Watchflow rules (as a .watchflow/rules.yaml file), and open a pull request with a pre-filled body, streamlining the adoption of rules.
  • New API Endpoints: Implemented two new FastAPI endpoints: POST /api/v1/rules/recommend for analyzing a repository and returning rule recommendations, and POST /api/v1/rules/recommend/proceed-with-pr to trigger the automated PR creation process.
  • Enhanced GitHub Client: The GitHub client (src/integrations/github/api.py) has been significantly extended with helper functions to read repository metadata, create Git references (branches), commit files, and open pull requests, supporting both GitHub App installations and user tokens.
  • Repository Analysis Agent Refactoring: The RepositoryAnalysisAgent has been refactored to simplify its internal workflow, moving from a LangGraph-orchestrated state machine to a more procedural execution flow for clarity and direct state mutation. It now uses a deterministic approach for generating default rule recommendations.
  • Documentation and Testing: Updated the docs/features.md to reflect the new 'Repository Analysis → One-Click PR' feature and added new unit tests for URL normalization in repository analysis models and for the proceed-with-pr API endpoint.

@codecov-commenter

codecov-commenter commented Dec 15, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 51.40845% with 138 lines in your changes missing coverage. Please review.

❌ Your patch status has failed because the patch coverage (51.4%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (33.3%) is below the target coverage (80.0%). You can increase the head coverage or adjust the target coverage.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@           Coverage Diff           @@
##            main     #31     +/-   ##
=======================================
+ Coverage   32.0%   33.3%   +1.2%     
=======================================
  Files         85      85             
  Lines       5001    5048     +47     
=======================================
+ Hits        1604    1684     +80     
+ Misses      3397    3364     -33     

Continue to review full report in Codecov by Sentry.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0b55233...6d68ed5. Read the comment docs.



@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant new feature for automated repository analysis and one-click pull request creation for Watchflow rules. The changes are extensive, refactoring the RepositoryAnalysisAgent to a simpler procedural flow, enhancing data models with Pydantic for better validation, and extending the GitHub client to support the new functionality with both installation and user tokens. The new API endpoints are well-structured and include tests. I've provided a few suggestions to improve logging for better debuggability, simplify data models by removing redundant validators, and align more closely with the GitHub API documentation for status code checks. Overall, this is a solid contribution that adds valuable capabilities.

Comment on lines 52 to 59
except Exception as exc:  # noqa: BLE001
    latency_ms = int((time.perf_counter() - started_at) * 1000)
    return AgentResult(
        success=False,
-       message=f"Repository analysis failed: {str(e)}",
+       message=f"Repository analysis failed: {exc}",
        data={},
-       metadata={
-           "execution_time_ms": execution_time * 1000,
-           "error_type": type(e).__name__,
-       },
+       metadata={"execution_time_ms": latency_ms},
    )

high

This except block catches a broad Exception, which can make debugging difficult. The noqa comment indicates this is intentional, but it's crucial to log the full traceback to diagnose issues. Currently, there's no logger instance in this file to do so. I recommend re-introducing the logger (e.g., import logging; logger = logging.getLogger(__name__) at the module level) and using logger.error with exc_info=True to capture the full context of the failure. This will significantly improve debuggability.
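
A minimal, self-contained sketch of that pattern; the helper below is illustrative only and not code from this PR:

import logging

logger = logging.getLogger(__name__)


def run_analysis_safely(work):
    """Run `work()` and log the full traceback if it raises, then return None."""
    try:
        return work()
    except Exception as exc:  # noqa: BLE001
        # exc_info=True attaches the traceback to the log record, which is what
        # keeps a broad `except Exception` debuggable.
        logger.error("Repository analysis failed: %s", exc, exc_info=True)
        return None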

Comment on lines +113 to +117
@model_validator(mode="after")
def populate_full_name(self) -> "RepositoryAnalysisRequest":
    if not self.repository_full_name and self.repository_url:
        self.repository_full_name = parse_github_repo_identifier(self.repository_url)
    return self

medium

This @model_validator seems redundant. The @field_validator for repository_full_name on line 100 already handles populating this field from repository_url if it's not provided. This makes the logic a bit harder to follow. You can simplify the model by removing this validator.
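
For reference, a hedged sketch of the single-validator shape being suggested here; the parser and field definitions are illustrative stand-ins, not the PR's actual code:

from pydantic import BaseModel, ValidationInfo, field_validator


def parse_github_repo_identifier(url: str) -> str:
    """Illustrative stand-in: 'https://github.com/owner/repo' -> 'owner/repo'."""
    return "/".join(url.rstrip("/").split("/")[-2:])


class RepositoryAnalysisRequest(BaseModel):
    repository_url: str | None = None
    repository_full_name: str | None = None

    @field_validator("repository_full_name", mode="before")
    @classmethod
    def populate_full_name(cls, value, info: ValidationInfo):
        # repository_url is declared first, so it is already available in
        # info.data here; deriving the full name in this one validator makes
        # the extra @model_validator unnecessary.
        if not value and info.data.get("repository_url"):
            return parse_github_repo_identifier(info.data["repository_url"])
        return value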

Comment on lines +169 to +173
@model_validator(mode="after")
def populate_full_name(self) -> "ProceedWithPullRequestRequest":
    if not self.repository_full_name and self.repository_url:
        self.repository_full_name = parse_github_repo_identifier(self.repository_url)
    return self

medium

Similar to RepositoryAnalysisRequest, this @model_validator is redundant. The @field_validator for repository_full_name on line 161 already populates the field from repository_url. Removing this validator will make the model simpler and easier to maintain.

Comment on lines +29 to +48
async def analyze_repository_structure(state: RepositoryAnalysisState) -> None:
    """Collect repository metadata and structure signals."""
    repo = state.repository_full_name
    installation_id = state.installation_id

    repo_data = await github_client.get_repository(repo, installation_id=installation_id)
    workflows = await github_client.list_directory_any_auth(
        repo_full_name=repo, path=".github/workflows", installation_id=installation_id
    )
    contributors = await github_client.get_repository_contributors(repo, installation_id) if installation_id else []

    state.repository_features = RepositoryFeatures(
        has_contributing=False,
        has_codeowners=bool(await github_client.get_file_content(repo, ".github/CODEOWNERS", installation_id)),
        has_workflows=bool(workflows),
        workflow_count=len(workflows or []),
        language=(repo_data or {}).get("language"),
        contributor_count=len(contributors),
        pr_count=0,
    )

medium

This function and others in this file make external API calls but lack logging. The previous implementation had logging for each step, which is very useful for debugging and monitoring the agent's progress. I recommend adding structured logging back into these node functions to provide visibility into the analysis process, especially for successes, failures, and key findings (e.g., number of workflows found). This would greatly improve observability.
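
One lightweight way to get that visibility without sprinkling log calls through every node is a small decorator; this is only an illustrative pattern (the name logged_node is hypothetical), not code from the PR:

import functools
import logging

logger = logging.getLogger(__name__)


def logged_node(func):
    """Log entry, completion, and failure of an async analysis node."""

    @functools.wraps(func)
    async def wrapper(state):
        repo = getattr(state, "repository_full_name", "<unknown>")
        logger.info("Running %s for %s", func.__name__, repo)
        try:
            result = await func(state)
        except Exception:
            logger.error("%s failed for %s", func.__name__, repo, exc_info=True)
            raise
        logger.info("%s completed for %s", func.__name__, repo)
        return result

    return wrapper

Key findings (workflow counts, contributor counts, and so on) would still be logged inside each node, as the previous implementation did.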

payload = {"ref": f"refs/heads/{ref.lstrip('refs/heads/')}", "sha": sha}
session = await self._get_session()
async with session.post(url, headers=headers, json=payload) as response:
return response.status in (200, 201)

medium

The success check response.status in (200, 201) includes 200 OK. However, the GitHub API documentation for creating a git ref (POST /repos/{owner}/{repo}/git/refs) only specifies 201 Created as a success status code. To align with the documentation and avoid ambiguity, it's best to only check for 201.

Suggested change
-    return response.status in (200, 201)
+    return response.status == 201

payload = {"title": title, "head": head, "base": base, "body": body}
session = await self._get_session()
async with session.post(url, headers=headers, json=payload) as response:
if response.status in (200, 201):

medium

The success check response.status in (200, 201) includes 200 OK. The GitHub API documentation for creating a pull request (POST /repos/{owner}/{repo}/pulls) specifies 201 Created as the success status. To align with the documentation, it would be clearer to only check for 201.

Suggested change
-    if response.status in (200, 201):
+    if response.status == 201:

@dkargatzis dkargatzis merged commit 57b13e1 into warestack:main Dec 18, 2025
4 checks passed