[ADMIN] Adding an MDAnalysis AI tools policy. #5210
Conversation
Adds a markdown document that defines MDAnalysis' current stance on AI tools for contributions.
Codecov Report: ✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##           develop    #5210   +/-  ##
========================================
  Coverage    92.72%   92.72%
========================================
  Files          180      180
  Lines        22475    22475
  Branches      3190     3190
========================================
  Hits         20841    20841
  Misses        1177     1177
  Partials      457      457
RMeli left a comment:
Thanks @IAlibay for getting this started!
First pass. I'll post comments about content in the coming days.
Co-authored-by: Rocco Meli <r.meli@bluemail.ch>
orbeckst left a comment:
I'd like to also state that we're open to discussions and expect the policy to be periodically reviewed.
I also have one smaller edit.
Thank you very much for drafting this, @IAlibay!
AI_POLICY.md (outdated excerpt)

> AI assistance is deemed acceptable. However, if generated code exceeds minimal, sporadic amounts (e.g. repeated or large multi-line blocks), it would be considered fully AI-generated and, as defined in section #1, is not acceptable.
>
> As per section #0, where possible please state that you are using AI assistance via an IDE.
Remove "where possible" as this puts the burden on us. Instead, it should be a contributors responsibility to know when they are using AI features.
Suggested change:
- As per section #0, where possible please state that you are using AI assistance via an IDE.
+ As per section #0, please state when you are using AI assistance via an IDE.
With IDEs, my main worry is that it's just easy to forget - like I'm a vim guy, but sometimes I use vscode, and then the autocomplete happens and I forget about it.
It's still on the user to know. I don't think the policy is the place to say "we know, you're human, so we are putting qualifiers on our requirements". We should say what we want to happen.
A checklist will make this easier.
I think this may be something to discuss on Friday. For now, I have pushed a different variant of that text which will hopefully offer a middle ground.
tylerjereddy left a comment:
FWIW, the sklearn PR template currently has this:
> AI usage disclosure
>
> I used AI assistance for:
> - Code generation (e.g., when writing an implementation or fixing a bug)
> - Test/benchmark generation
> - Documentation (including examples)
> - Research and understanding
I really like this approach. It gives people more of a chance to acknowledge AI usage in different aspects, where they might otherwise just say "no AI" in response to a single question. Potentially saves us some time in review.
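For concreteness, here is a minimal sketch of how such a disclosure block might look in a PR template. The `.github/PULL_REQUEST_TEMPLATE.md` path and the checkbox wording are illustrative assumptions, not a copy of sklearn's actual template:

```markdown
<!-- .github/PULL_REQUEST_TEMPLATE.md (hypothetical excerpt) -->
## AI usage disclosure

<!-- Check all that apply; leave everything unchecked if no AI assistance was used. -->
I used AI assistance for:

- [ ] Code generation (e.g., when writing an implementation or fixing a bug)
- [ ] Test/benchmark generation
- [ ] Documentation (including examples)
- [ ] Research and understanding
```

GitHub renders `- [ ]` items in PR descriptions as interactive checkboxes, so contributors can tick the relevant categories without editing any prose.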
P.S. I'm going to merge in grammatical corrections, but not content just yet - that way everyone gets a chance to comment on the same thing.
Correcting some typos found during review.
In addition to the suggested disclosure categories from #5210 (review), I'd also add a category on tooling/CI. Even if this is more a coredev thing to touch, we should be clear ourselves.
I'll be honest, this makes for a really large disclosure list in the PR template. Is it really needed? It's not like we allow for much nuance in the current draft of the policy.
That's a fair comment. Given that we'll have a larger discussion soon, I can certainly see not including something like the disclosures in the first round and starting with a strict policy. Longer term I am convinced that we will have to adopt a more nuanced view, simply because for routine tasks, genAI use is just so much more efficient when wielded by a knowledgeable person. I would not want to hamstring our own coredevs.

Perhaps it helps if I offer my current (and evolving) views on these individual points. For all of these points I expect someone using AI to actually understand the issue; they should have been able to do it themselves. We reserve the "humans are right" prerogative for discussions and decisions. With these caveats, I don't see a problem with AI use for research/understanding. For boiler-plate docs I don't see an issue, as long as we're not generating whole novels of original content, and as long as submitters actually proof-read what they submit. I'd also argue that updating boiler-plate code in workflows and tooling can be done more efficiently with AI help; all of these are areas where we are stretched quite thin.

Test and benchmark generation is a bit more involved. We have to ask what this policy should do: is it supposed to shield coredevs from having to deal with low-quality, low-effort code contributions, or is it supposed to maintain the license integrity of these parts of the code? The answer to this question matters for how we approach it.
@MDAnalysis/coredevs we decided to merge an initial version of the AI policy by Tuesday 2026-02-03 (next week) — see today's business meeting notes for details. The initial version is a baseline and will evolve in the future. Please review and comment, and if possible approve (with the understanding that we will come back and update, based on discussion on this PR and elsewhere).
(@IAlibay volunteered to be in charge of making sure that the PR gets merged. 🙏)
orbeckst left a comment:
I would like to see a couple of changes incorporated (version + date, statement that the document will change, contact details) before approving.
Other suggestions should be uncontroversial (spelling) or are really just suggestions.
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
I think all your comments have been addressed, @orbeckst.
orbeckst left a comment:
Thank you, @IAlibay — v1.0 lgtm!
RMeli left a comment:
LGTM. A few minor comments, but it can be merged IMO. Thanks for all the work!
AI_POLICY.md (excerpt)

> ### 5. Human reviewers are required
>
> All code merged into MDAnalysis repositories must be reviewed by a human reviewer. Instructions / suggestions from human reviewers always take precedence over those of non-human reviewers.
Should we suggest/ask not to add Copilot as a reviewer, or is that something we would be OK having?
I don't have a strong view on this; personally, folks can do what they want. I just don't want to be second-guessed by AI.
Co-authored-by: Rocco Meli <r.meli@bluemail.ch>
Co-authored-by: Rocco Meli <r.meli@bluemail.ch>
A policy for AI tools.
LLM / AI generated code disclosure
A likely LLM-powered tool in Google Docs was used for spell checking and pointing out grammatical issues.
PR Checklist
- `package/CHANGELOG` file updated?
- Is your name in `package/AUTHORS`? (If it is not, add it!)

Developers Certificate of Origin
I certify that I can submit this code contribution as described in the Developer Certificate of Origin, under the MDAnalysis LICENSE.
📚 Documentation preview 📚: https://mdanalysis--5210.org.readthedocs.build/en/5210/