-
Notifications
You must be signed in to change notification settings - Fork 15
fix: resolve repository analysis and PR creation issues #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: resolve repository analysis and PR creation issues #35
Conversation
naaa760
commented
Dec 27, 2025
- Fix YAML indentation in rule generation causing same rules for all repos
- Add user_token field to RepositoryAnalysisRequest model
- Implement authentication-aware caching to prevent cache contamination
- Enhance PR URL validation to prevent 404 redirects
- Simplify YAML generation by removing textwrap dependency
- Ensure all pre-commit checks pass
- Fix YAML indentation in rule generation causing same rules for all repos - Add user_token field to RepositoryAnalysisRequest model - Implement authentication-aware caching to prevent cache contamination - Enhance PR URL validation to prevent 404 redirects - Simplify YAML generation by removing textwrap dependency - Ensure all pre-commit checks pass
Summary of ChangesHello @naaa760, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses several critical issues impacting repository analysis and pull request creation. It rectifies YAML formatting errors that led to incorrect rule application, enhances the "RepositoryAnalysisRequest" model to support authenticated operations, and improves overall system robustness through better caching and URL validation. The changes aim to ensure more accurate and reliable processing of repository data and PR-related tasks. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (50.0%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage. @@ Coverage Diff @@
## main #35 +/- ##
=====================================
Coverage 32.8% 32.8%
=====================================
Files 85 85
Lines 5162 5162
=====================================
Hits 1698 1698
Misses 3464 3464 Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively resolves a critical issue with YAML indentation during rule generation, which was causing incorrect behavior. The removal of the textwrap dependency and the direct use of correctly formatted YAML strings simplifies the code. The addition of the user_token field is a good step towards more flexible authentication.
My review includes two main points:
- A high-severity recommendation to use Pydantic's
SecretStrfor the newuser_tokento enhance security, and a note that this new field is not yet integrated into the workflow. - A medium-severity suggestion to refactor the rule generation logic to use Python dictionaries instead of hardcoded YAML strings, which would improve maintainability and robustness.
Overall, these are solid fixes. Addressing the feedback will further improve the security and quality of the code.
| repository_url: str | None = Field(default=None, description="GitHub repository URL") | ||
| repository_full_name: str | None = Field(default=None, description="Full repository name (owner/repo)") | ||
| installation_id: int | None = Field(default=None, description="GitHub App installation ID") | ||
| user_token: str | None = Field(default=None, description="User token for GitHub operations (optional)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For security, sensitive values like tokens should be stored in Pydantic's SecretStr type to prevent them from being exposed in logs or other string representations of the model. You will need to add SecretStr to your pydantic imports.
Additionally, this new user_token field is not currently being used in the agent's workflow. For it to be effective, it needs to be passed down through RepositoryAnalysisState to the various github_client API calls. This involves:
- Adding
user_tokentoRepositoryAnalysisState. - Initializing it in
RepositoryAnalysisAgent.execute. - Passing it to
github_clientmethods in the analysis nodes.
When accessing the token value from the SecretStr field, you'll need to use .get_secret_value().
| user_token: str | None = Field(default=None, description="User token for GitHub operations (optional)") | |
| user_token: SecretStr | None = Field(default=None, description="User token for GitHub operations (optional)") |
| yaml_rule="""description: "Ensure PRs include context" | ||
| enabled: true | ||
| severity: low | ||
| severity: low | ||
| event_types: | ||
| - pull_request | ||
| parameters: | ||
| min_description_length: 50 | ||
| """ | ||
| ).strip(), | ||
| parameters: | ||
| min_description_length: 50 | ||
| """.strip(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the YAML string is now correctly formatted, generating YAML by hardcoding multiline strings can be fragile and hard to maintain. A more robust approach is to define rules as Python dictionaries and then serialize them to YAML. This eliminates the risk of manual indentation or syntax errors.
This would also simplify the overall flow, as you are currently creating a YAML string, which is then parsed back into a dictionary in _render_rules_yaml and validate_recommendations.
Consider this alternative approach:
import yaml
rule_dict = {
"description": "Ensure PRs include context",
"enabled": True,
"severity": "low",
"event_types": ["pull_request"],
"parameters": {
"min_description_length": 50,
},
}
recommendations.append(
RuleRecommendation(
yaml_rule=yaml.dump(rule_dict, sort_keys=False),
confidence=desc_confidence,
reasoning=desc_reasoning,
strategy_used="static",
)
)This would likely require adjusting RuleRecommendation or how it's used, but it would make the code more robust and easier to manage in the long term.