[docs] Add AGENTS.md - AI agent coding guide#2922

Open
vaibhavk1992 wants to merge 3 commits into apache:main from vaibhavk1992:docs/add-agents-md-guide

@vaibhavk1992

This guide extracts and documents coding conventions, patterns, and standards from the existing Apache Fluss codebase to assist AI coding agents. All rules and examples are derived from actual source code, the Checkstyle configuration, and build files.

  • 11 comprehensive sections covering critical rules, API patterns, testing, dependencies, configuration, and build/CI
  • 100+ concrete code examples with DO/DON'T comparisons
  • Direct file references to canonical examples in the codebase
  • Fully compliant with Apache generative AI guidelines

Also updated .gitignore to exclude CLAUDE.md (personal development notes)

Purpose

Linked issue: #2921

Brief change log

Tests

API and Format

Documentation

@qzyu999

qzyu999 commented Mar 24, 2026

Hi @vaibhavk1992, thanks for the initial PR! Just sharing some quick observations. I took a look and it looks quite solid with lots of useful information for coding agents. For comparison I examined Airflow's AGENTS.md (https://github.com/apache/airflow/blob/main/AGENTS.md).

I noticed that in Airflow's AGENTS.md they have a section that tells the AI how to properly run git commands in addition to identifying as AI-generated: https://github.com/apache/airflow/blob/main/AGENTS.md#creating-pull-requests

There's also a section of explicit boundaries for the agent that may have some overlap for Fluss: https://github.com/apache/airflow/blob/main/AGENTS.md#boundaries

I noticed that the AGENTS.md files here and in Airflow both focus primarily on code contribution rather than deployment. I've read that AGENTS.md could serve a dual purpose: guiding contributors, and helping developers use AI to quickly set up and deploy a new repository from day one.

However, I believe the common guidance is that AGENTS.md should be relatively brief (<500 lines?), and this document already exceeds that. I am not sure how best to balance this.

Thanks again for your work on this.

Edit: A workaround for a lengthy AGENTS.md is to leave breadcrumbs that point the agent to further documentation in other subfolders (e.g., /docs/contributing_code.md). Additionally (I think it was mentioned elsewhere), AGENTS.md files can be nested in submodules.

Edit 2: I just saw here that module-level AGENTS.md would be in phase 3: https://cwiki.apache.org/confluence/display/FLUSS/FIP-34%3A+Making+Fluss+an+AI-Native+Project

vaibhav kumar and others added 2 commits March 26, 2026 16:09
Restore trailing newline at end of .gitignore file to match
apache/fluss upstream. Previous commit accidentally removed it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vaibhavk1992
Author

@qzyu999 Thanks for reviewing it.
I agree the file should be smaller; there were some redundant examples and code chunks. I have compressed it to under 500 lines while keeping all the checks, and added the missing Airflow context.


### Package Structure

See CLAUDE.md for full module/package organization. Key modules: `fluss-common`, `fluss-rpc`, `fluss-client`, `fluss-server`.

I don't see a CLAUDE.md in the repo. Airflow does have a CLAUDE.md, but its contents simply point to AGENTS.md: https://github.com/apache/airflow/blob/main/CLAUDE.md. I am thinking we could do the same.


## 10. Module Boundaries

**Module structure:** See CLAUDE.md for full module organization

Same as my comment above: no CLAUDE.md is visible in the repo.


Detailed explanation of changes and motivation.

Co-Authored-By: Claude <ai-assistant@anthropic.com>

@qzyu999 qzyu999 Mar 26, 2026

I believe we should make this doc agent-agnostic. Airflow's AGENTS.md uses <Agent Name and Version> and links to a separate post about Gen-AI assisted contributions: https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions. Perhaps we need our own Fluss-specific post?

Edit: I just saw here that the AI-assisted contribution guidelines are phase 2: https://cwiki.apache.org/confluence/display/FLUSS/FIP-34%3A+Making+Fluss+an+AI-Native+Project


**Component tags:** `[client]`, `[server]`, `[rpc]`, `[flink]`, `[spark]`, `[docs]`, `[build]`, `[test]`

**AI-generated code identification:** ALWAYS include `Co-Authored-By: Claude <ai-assistant@anthropic.com>` in commit messages for AI-generated changes.

@qzyu999 qzyu999 Mar 26, 2026

Same as above, referring to being agent-agnostic with <Agent Name and Version>.

@qzyu999

qzyu999 commented Mar 26, 2026

Hi @vaibhavk1992, awesome job on the compression! I noted some places where it refers specifically to some form of "Claude", whereas I think we should try to be more agent-agnostic. I pointed to some examples in the Airflow AGENTS.md, and to how they add a CLAUDE.md whose contents simply point to AGENTS.md. Perhaps we can do the same.

Edit: I just saw here https://lists.apache.org/thread/xm35s36fsqt8dyhbkkvq05nwm7l48rp2 that this was mentioned explicitly:

Following the ASF Generative Tooling Guidance [1], add an AI disclosure
section to our PR template:

- An AI disclosure checkbox
- A Generated-by: <Tool Name and Version> tag

The link also mentions the same CLAUDE.md approach I described from Airflow's repo, and cites Airflow as one of the leading examples going this route. You could model it more closely on Airflow's AI disclosure and see what others have to say.
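
To illustrate the agent-agnostic tag quoted above, here is a minimal sketch of checking a commit message for a `Generated-by:` trailer. Only the trailer name comes from the quoted thread; the helper name `find_generated_by`, the parsing rules, and the `ExampleAgent 1.0` value are hypothetical and are not a Fluss or ASF convention.

```python
import re
from typing import Optional

# Assumption: the trailer sits on its own line as "Generated-by: <Tool Name and Version>".
# The exact format is not specified in this thread.
TRAILER = re.compile(r"^Generated-by:[ \t]*(\S.*)$", re.MULTILINE)


def find_generated_by(message: str) -> Optional[str]:
    """Return the tool name/version from a Generated-by trailer, or None."""
    match = TRAILER.search(message)
    return match.group(1).strip() if match else None


# Hypothetical commit messages, for illustration only.
agnostic = "[docs] Add AGENTS.md\n\nGenerated-by: ExampleAgent 1.0\n"
tool_specific = (
    "[docs] Add AGENTS.md\n\n"
    "Co-Authored-By: Claude <ai-assistant@anthropic.com>\n"
)

print(find_generated_by(agnostic))
print(find_generated_by(tool_specific))
```

A check like this would accept any tool name, which is the point of the agent-agnostic wording; the tool-specific `Co-Authored-By` line alone would not satisfy it.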

@vaibhavk1992
Author

@qzyu999 I use CLAUDE.md for my local setup. To make it agnostic I created an AGENTS.md. As per the FIP and this task, only an AGENTS.md is needed. I can remove the references to CLAUDE.md wherever they appear. Please confirm.

@qzyu999

qzyu999 commented Mar 26, 2026

> @qzyu999 I use CLAUDE.md for my local setup. To make it agnostic I created an AGENTS.md. As per the FIP and this task, only an AGENTS.md is needed. I can remove the references to CLAUDE.md wherever they appear. Please confirm.

Hi @vaibhavk1992, the FIP actually says to keep CLAUDE.md but simply symlink it to AGENTS.md (via ln -s AGENTS.md CLAUDE.md): https://cwiki.apache.org/confluence/display/FLUSS/FIP-34%3A+Making+Fluss+an+AI-Native+Project. I use Claude indirectly through a router, so I am not too familiar with this problem.

Edit: As I mentioned above, the repo should have its own CLAUDE.md, but it would serve only as the symlink so that we don't duplicate text. You can see Airflow's example here: https://github.com/apache/airflow/blob/main/CLAUDE.md
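
For reference, a minimal sketch of what the symlink approach amounts to, run in a throwaway directory rather than the real repo. It assumes a POSIX-like filesystem where symlinks are permitted; the placeholder file content is made up for illustration.

```python
import os
import tempfile
from pathlib import Path

# Sketch only: a temporary directory stands in for the repository root.
with tempfile.TemporaryDirectory() as repo_root:
    agents = Path(repo_root) / "AGENTS.md"
    claude = Path(repo_root) / "CLAUDE.md"
    agents.write_text("# AGENTS.md\n\nPlaceholder content.\n", encoding="utf-8")

    # Relative target, equivalent to running `ln -s AGENTS.md CLAUDE.md`
    # from inside repo_root.
    os.symlink("AGENTS.md", claude)

    is_link = claude.is_symlink()
    link_target = os.readlink(claude)
    # Reading through the link returns the AGENTS.md text, so nothing is duplicated.
    same_content = (
        claude.read_text(encoding="utf-8") == agents.read_text(encoding="utf-8")
    )

print(is_link, link_target, same_content)
```

Using a relative target keeps the link valid wherever the repository is cloned, which an absolute path would not.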
