194 changes: 194 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,194 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# AGENTS.md

Apache HBase is a distributed, scalable big data store built on HDFS and cloud
object storage.

## Repo Structure

This is a multi-module Maven project. Modules live in arbitrarily nested
folders; enumerate them by searching for `pom.xml` files (excluding `target/`
directories). The root `pom.xml` defines the full reactor and build order.
Note that some directories from removed or merged modules (e.g.,
`hbase-hadoop2-compat/`, `hbase-protocol/`, `hbase-rsgroup/`) may still exist
as empty shells with only `target/` remnants. If a directory has no `pom.xml`,
it is not part of the active build.
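
As a rough illustration of that enumeration, the Python sketch below walks a
checkout and lists every directory that contains a `pom.xml`, pruning any
`target/` directory along the way. The function name `find_modules` and the
choice to pass the repository root as an argument are assumptions made for
this example; they are not part of the HBase tooling.

```python
import os


def find_modules(repo_root):
    """Return sorted relative paths of directories that contain a pom.xml.

    Directories named 'target' are pruned so build output is never scanned.
    (Hypothetical helper, not part of the HBase tooling.)
    """
    modules = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune build-output directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d != "target"]
        if "pom.xml" in filenames:
            modules.append(os.path.relpath(dirpath, repo_root))
    return sorted(modules)
```

The root module is reported as `.`. A directory that holds only `target/`
remnants never appears in the result, which matches the rule that a directory
without a `pom.xml` is not part of the active build.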

### Client and Server

The fundamental divide in this codebase is client-side vs. server-side, with
several modules shared between them.

- `hbase-client` -- The client library. Builds RPC requests, handles retries,
manages connections. This is the public API that external consumers depend on.
- `hbase-server` -- RegionServer and Master implementations. Processes RPCs,
manages regions, stores data. The largest module by far.
- Shared modules like `hbase-common`, `hbase-protocol-shaded`, and
`hbase-metrics-api` are dependencies of both sides.

When orienting on unfamiliar code, first determine which side of this divide
you are on.

### Module Roles

**Core data path:**
`hbase-client` -> `hbase-server` (via protobuf RPCs defined in
`hbase-protocol-shaded`)

**Gateways** (alternative client entry points):
`hbase-rest` (HTTP/JSON), `hbase-thrift` (Thrift RPC)

**Coprocessors** are HBase's server-side extension framework. They allow custom
code to run inside RegionServer and Master processes, with the same privileges
as the host process. The base `Coprocessor` interface lives in `hbase-client`;
observer and endpoint interfaces (`RegionObserver`, `MasterObserver`, etc.) live
in `hbase-server`. Endpoint implementations live in `hbase-endpoint`. The
built-in `AccessController` coprocessor enforces ACLs; `VisibilityController`
enforces cell-level visibility labels. Third-party coprocessors are loaded via
configuration or table schema.

**Server subsystems** (separated from hbase-server for modularity):
`hbase-balancer`, `hbase-procedure`, `hbase-replication`, `hbase-asyncfs`,
`hbase-zookeeper`, `hbase-http`

**Shared libraries:**
`hbase-common`, `hbase-metrics` + `hbase-metrics-api`, `hbase-logging`,
`hbase-hadoop-compat`

**Extensions:**
`hbase-extensions` (currently `hbase-openssl` for native TLS support)

**Storage codecs:**
`hbase-compression/*` (pluggable algorithms), `hbase-external-blockcache`

**Packaging and shading:**
`hbase-shaded/*`, `hbase-assembly*`, `hbase-resource-bundle`

**Tooling:**
`hbase-shell` (JRuby REPL), `hbase-hbtop`, `hbase-mapreduce`, `hbase-backup`,
`hbase-diagnostics`

**Build infrastructure** (ignore for code tasks):
`hbase-build-configuration`, `hbase-checkstyle`, `hbase-annotations`,
`hbase-archetypes/*`, `hbase-dev-generate-classpath`

**Testing:**
`hbase-testing-util`, `hbase-it`, `hbase-examples`

### Navigating with @InterfaceAudience

Classes are annotated with `@InterfaceAudience` to indicate their intended
consumer:

- `Public` -- Stable client API. External consumers depend on these.
- `LimitedPrivate` -- Internal API shared across modules, scoped to a named
audience (e.g., `COPROC`, `CONFIG`, `REPLICATION`, `AUTHENTICATION`). The
audience name tells you who is expected to call this code.
- `Private` -- Module-internal. Not API.

These annotations are the fastest way to determine whether a class is part of
the external surface or internal plumbing.
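
One way to apply this when triaging files is to scan the source text for the
annotation. The Python sketch below is only a heuristic: the helper name
`audience_of` is hypothetical, the regular expression is an assumption about
how the annotation is typically written, and it does not parse Java, so an
annotation mentioned inside a comment would also match.

```python
import re

# Assumed spelling of the annotation, e.g. "@InterfaceAudience.Public" or
# "@InterfaceAudience.LimitedPrivate(<audience name>)".
_AUDIENCE_RE = re.compile(
    r"@InterfaceAudience\.(Public|LimitedPrivate|Private)\b(?:\(([^)]*)\))?"
)


def audience_of(java_source):
    """Return (level, scope) for the first annotation found, or None.

    'scope' is the raw text between the parentheses of LimitedPrivate,
    or None when the annotation takes no argument.
    """
    match = _AUDIENCE_RE.search(java_source)
    if match is None:
        return None
    level, scope = match.group(1), match.group(2)
    return level, (scope.strip() if scope else None)
```

Because only the first match is returned, a file with annotated nested classes
reports the outermost annotation that appears first in the text.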

### Key Entry Points

When investigating a behavior, start from where it enters the system:

- **Client RPCs**: `RSRpcServices` (RegionServer) and `MasterRpcServices`
(Master) handle all client-initiated RPCs. Trace from the method matching
the RPC name.
- **REST gateway**: resource classes in `hbase-rest` map HTTP verbs to
operations.
- **Thrift gateway**: handler classes in `hbase-thrift` map Thrift methods.
- **Coprocessor hooks**: observer interfaces (`RegionObserver`,
`MasterObserver`, etc.) define extension points. Implementations are loaded
via configuration or table schema.
- **Procedures**: `hbase-procedure` defines the framework; concrete procedures
(table create, region split, etc.) live in `hbase-server`.
- **Configuration**: properties are defined in `hbase-default.xml` (in
`hbase-common`) and overridden by operators in `hbase-site.xml`.
- **Wire format**: `.proto` files in `hbase-protocol-shaded` define every RPC
request/response and all persisted data structures. (Older branches had a
separate `hbase-protocol` module; it has been removed on master.)
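
For the configuration entry point, the override relationship between
`hbase-default.xml` and `hbase-site.xml` can be sketched as a simple overlay.
The Python below assumes the usual Hadoop-style layout
(`<configuration>` containing `<property>` elements with `<name>` and
`<value>` children). The function names are hypothetical, and the sketch
deliberately ignores features of the real loader such as variable
substitution and `<final>` properties.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a name -> value dict."""
    props = {}
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Overlay site values on top of defaults; entries from the site file win."""
    merged = read_properties(default_xml)
    merged.update(read_properties(site_xml))
    return merged
```

Comparing the merged result with the defaults alone is a quick way to see
which properties an operator has changed.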

### Split Packages

The same Java package often appears in multiple modules (e.g., the
`coprocessor` package exists in `hbase-client`, `hbase-server`,
`hbase-endpoint`, and `hbase-examples`). Each module contributes different
classes to the package. When searching for a class, check which module it
lives in -- the module determines the execution context.
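
To see which modules contribute to a given package, a script can read the
`package` declaration of each source file and group the results by module.
The Python sketch below assumes the standard Maven layout
(`<module>/src/main/java`). The function names and the `module_dirs` argument
are hypothetical and exist only for this illustration.

```python
import os
import re
from collections import defaultdict

# Matches a Java package declaration at the start of a line.
_PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)


def modules_by_package(repo_root, module_dirs):
    """Map each Java package name to the set of modules that declare it."""
    result = defaultdict(set)
    for module in module_dirs:
        src_root = os.path.join(repo_root, module, "src", "main", "java")
        # os.walk yields nothing for a missing directory, so absent
        # modules are skipped silently.
        for dirpath, _dirnames, filenames in os.walk(src_root):
            for name in filenames:
                if not name.endswith(".java"):
                    continue
                with open(os.path.join(dirpath, name), encoding="utf-8") as fh:
                    match = _PACKAGE_RE.search(fh.read())
                if match:
                    result[match.group(1)].add(module)
    return result


def split_packages(repo_root, module_dirs):
    """Return only the packages that appear in more than one module."""
    mapping = modules_by_package(repo_root, module_dirs)
    return {pkg: sorted(mods) for pkg, mods in mapping.items() if len(mods) > 1}
```

Feeding it the module list from a `pom.xml` search gives a quick inventory of
split packages before searching for a specific class.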

### Related Repositories

[hbase-thirdparty](https://github.com/apache/hbase-thirdparty) is a companion
project that patches and shades key dependencies (protobuf, netty, gson, etc.)
so that HBase's internal use of these libraries does not conflict with
versions on the application classpath. The `hbase-shaded-*` artifacts from
that repo appear as dependencies throughout this project's `pom.xml`. Changes
to shaded dependency versions or patches happen in that repo, not here.

### Developer Tooling

`dev-support/` contains CI configuration, release automation, code analysis
scripts, and other maintainer tools. PR-level CI has migrated to GitHub
Actions (`.github/workflows/`), but nightly and branch-level CI still runs
via configurations in `dev-support/`. That directory also holds release
scripts, docker-based test environments, and various developer utilities.
See `dev-support/README.md` for a full index.

`conf/` holds default configuration templates (`hbase-site.xml`,
`hbase-env.sh`, `log4j2.properties`). `bin/` holds shell scripts for cluster
lifecycle and operations.

`dev-support/design-docs/` collects design documents and proposals for major
features. These capture the rationale behind complex subsystems and are useful
for understanding why the code is structured the way it is.

### Conventions

- Tests mirror source paths: `src/test/java` parallels `src/main/java`
- Generated code (protobuf, etc.) lives in `target/` and is not checked in
- Configuration properties use `hbase.` prefix
- The shell is JRuby wrapping the Java client API
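
The first convention can be turned into a small path mapping. The Python
sketch below swaps `src/main/java` for `src/test/java` and prefixes the class
name with `Test`. The `Test` prefix is an assumption based on names such as
`TestHelloHBase.java`; the returned path is only a candidate and the file may
not exist. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def candidate_test_path(source_path):
    """Map .../src/main/java/pkg/Foo.java to .../src/test/java/pkg/TestFoo.java.

    Returns None when the path is not a .java file under src/main/java.
    """
    parts = list(PurePosixPath(source_path).parts)
    if not parts or not parts[-1].endswith(".java"):
        return None
    # Locate the 'src/main/java' segment and swap 'main' for 'test'.
    for i in range(len(parts) - 3):
        if parts[i:i + 3] == ["src", "main", "java"]:
            parts[i + 1] = "test"
            parts[-1] = "Test" + parts[-1]
            return str(PurePosixPath(*parts))
    return None
```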

## Documentation

The project website (https://hbase.apache.org) is maintained in this repo under
`hbase-website/`. User-facing and administrator-facing documentation covering
configuration, security, architecture, schema design, operations, APIs, and
more lives in `hbase-website/app/pages/_docs/docs/_mdx/`. The table of
contents and page ordering are defined in the `meta.json` files within that
tree.

The site also serves https://hbase.apache.org/llms-full.txt, which
concatenates all documentation pages into a single text file suitable for
LLM context ingestion.

## Security Model

The project's security model is documented at
`hbase-website/app/pages/_landing/security-model/content.md`
(published at https://hbase.apache.org/security-model).
Read that document for the full security model including trust boundaries,
what constitutes a valid vulnerability, and what does not.

When performing security analysis of this codebase, use the navigation
structure above to determine the role of the code under review, then apply
the security model to interpret findings in context.
84 changes: 84 additions & 0 deletions dev-support/README.md
@@ -0,0 +1,84 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# dev-support

Developer and maintainer tooling for the HBase project. This directory
contains CI configuration, release automation, code analysis, and various
utility scripts.

## CI

PR-level CI runs via GitHub Actions (see `../.github/workflows/`). Nightly
builds, branch validation, and precommit checks still use Jenkins
configurations in this directory:

- `Jenkinsfile`, `Jenkinsfile_GitHub` -- Pipeline definitions
- `hbase_nightly_yetus.sh`, `jenkins_precommit_github_yetus.sh` -- Yetus-based
precommit and nightly check scripts
- `hbase-personality.sh` -- Yetus personality plugin that customizes checks for
HBase
- `jenkinsEnv.sh`, `jenkins-scripts/` -- Shared Jenkins environment setup
- `HOW_TO_YETUS_LOCAL.md` -- Guide for running Yetus checks locally

## Release Automation

- `create-release/` -- Docker-based release candidate builder (tags, builds,
signs, publishes). Entry point is `do-release-docker.sh`.
- `make_rc.sh` -- Older release candidate script (superseded by
`create-release/`)
- `hbase-vote.sh` -- Generates release vote email content
- `git-jira-release-audit/` -- Audits git history against JIRA fixVersion
fields to find discrepancies between what was committed and what JIRA says
shipped

## Code Quality and Analysis

- `checkcompatibility.py` -- Checks API/ABI compatibility between versions
- `checkstyle_report.py` -- Generates checkstyle reports
- `spotbugs-exclude.xml` -- SpotBugs exclusion rules
- `code-coverage/` -- Scripts for generating code coverage reports
- `flaky-tests/` -- Flaky test detection, reporting, and dashboards
- `license-header` -- Apache License header template

## Docker and Test Environments

- `docker/` -- Dockerfile for CI build environment
- `hbase_docker/`, `hbase_docker.sh` -- Docker-based local test cluster
- `adhoc_run_tests/` -- Scripts for running test suites outside CI
- `integration-test/` -- Integration test support

## Utility Scripts

- `smart-apply-patch.sh`, `make_patch.sh` -- Patch creation and application
- `rebase_all_git_branches.sh` -- Rebases all local tracking branches
- `zombie-detector.sh` -- Detects leaked processes from test runs
- `gather_machine_environment.sh` -- Captures build machine info for debugging
- `gh_hide_old_comments.sh` -- Hides outdated bot comments on GitHub PRs

## IDE Configuration

- `hbase_eclipse_formatter.xml` -- Eclipse code formatter settings
- `eclipse.importorder` -- Eclipse import ordering
- `HBase Code Template.xml` -- IntelliJ code template

## Design Documents

`design-docs/` collects design documents and proposals for major features.
These capture the rationale behind complex subsystems and are useful for
understanding why the code is structured the way it is.
4 changes: 2 additions & 2 deletions dev-support/code-coverage/README.md
@@ -31,7 +31,7 @@ the related modules.

Here is how you can generate the code coverage report:

```sh dev/code-coverage/run-coverage.sh```
```sh dev-support/code-coverage/run-coverage.sh```

## Publishing coverage results to SonarQube

@@ -45,5 +45,5 @@ The project name is an optional parameter.

Here is an example command for running and publishing the coverage data:

`./dev/code-coverage/run-coverage.sh -l ProjectCredentials
`./dev-support/code-coverage/run-coverage.sh -l ProjectCredentials
-u https://exampleserver.com -k Project_Key -n Project_Name`
2 changes: 1 addition & 1 deletion dev-support/hbase_docker/README.md
@@ -42,5 +42,5 @@ this image will start the HMaster and launch the HBase shell when run.
bash` to start a container without a running HMaster. Within this environment,
HBase is built in `/root/hbase-bin`.

> NOTE: When running on mac m1 platforms, the docker file requires setting platfrom flag explicitly.
> NOTE: When running on mac m1 platforms, the docker file requires setting platform flag explicitly.
> You may use the same instructions above, running from the "./m1" sub-dir.
6 changes: 3 additions & 3 deletions dev-support/release-vm/README.md
@@ -23,9 +23,9 @@ This is a vagrant project that provides a virtual machine environment suitable
for running an Apache HBase release.

Requires:
* [VirtualBox](http://virtualbox.org)
* [Vagrant](http://virtualbox.org)
* The private portion of your signing key avilable in the local GPG agent
* [VirtualBox](https://www.virtualbox.org/)
* [Vagrant](https://www.vagrantup.com/)
* The private portion of your signing key available in the local GPG agent
* The private portion of your GitHub authentication key available in either the local GPG agent or
local SSH agent

10 changes: 5 additions & 5 deletions hbase-archetypes/README.md
@@ -45,7 +45,7 @@ exemplar Maven project containing:
For example, the components of the hbase-client-project consist of (a) sample
code `./src/main/.../HelloHBase.java` and `./src/test/.../TestHelloHBase.java`,
(b) a `pom.xml` file establishing dependency upon hbase-client and test-scope
dependency upon hbase-testing-util, and (c) a `log4j.properties` resource file.
dependency upon hbase-testing-util, and (c) a `log4j2.properties` resource file.

#### How archetypes are created during the hbase install process
During the `mvn install` process, all standalone exemplar projects in the
@@ -108,19 +108,19 @@ to choose from for generation of a new Maven project.

## Footnotes:
<b id="f1">1</b> -- [Maven Archetype
](http://maven.apache.org/archetype/index.html) ("About" page).
](https://maven.apache.org/archetype/index.html) ("About" page).
-- [↩](#a1)

<b id="f2">2</b> -- [Maven Archetype Catalog
](http://repo1.maven.org/maven2/archetype-catalog.xml) (4MB+ xml file).
](https://repo1.maven.org/maven2/archetype-catalog.xml) (4MB+ xml file).
-- [↩](#a2)

<b id="f3">3</b> -- [Maven Central Repository](http://search.maven.org/)
<b id="f3">3</b> -- [Maven Central Repository](https://search.maven.org/)
(search engine).
-- [↩](#a3)

<b id="f4">4</b> -- [Maven Archetype Plugin - archetype:generate
](http://maven.apache.org/archetype/maven-archetype-plugin/generate-mojo.html).
](https://maven.apache.org/archetype/maven-archetype-plugin/generate-mojo.html).
-- [↩](#a4)

<b id="f5">5</b> -- Prior to archetype creation, each exemplar project's
25 changes: 13 additions & 12 deletions hbase-endpoint/README.txt
@@ -1,13 +1,14 @@
ON PROTOBUFS
This maven module has protobuf definition files ('.protos') used by hbase
Coprocessor Endpoints that ship with hbase core (including tests). Coprocessor
Endpoints are meant to be standalone, independent code not reliant on hbase
internals. They define their Service using protobuf. The protobuf version
they use can be distinct from that used by HBase internally since HBase started
shading its protobuf references. Endpoints have no access to the shaded protobuf
hbase uses. They do have access to the content of hbase-protocol -- the
.protos found in this module -- but avoid using as much of this as you can as it is
liable to change.
This module contains coprocessor endpoint implementations that ship with
HBase core. Coprocessor endpoints are standalone RPC services deployed on
region servers (or master) via the coprocessor framework.

Generation of java files from protobuf .proto files included here is done as
part of the build.
Included endpoints:
- AggregateImplementation -- server-side aggregation (sum, min, max, avg, etc.)
- Export -- server-side table export to HDFS

Client helpers for invoking these endpoints are in the
org.apache.hadoop.hbase.client.coprocessor package (AggregationClient,
AsyncAggregationClient).

The protobuf service definitions used by these endpoints live in
hbase-protocol-shaded, not in this module.
4 changes: 2 additions & 2 deletions hbase-examples/README.txt
@@ -38,7 +38,7 @@ Example code.
<server-principal> should only be specified when the client connects to a secure cluster. Its default value is "hbase".
4. Here is a lazy example that just pulls in all hbase dependency jars and that goes against default location on localhost.
It should work with a standalone hbase instance started by doing ./bin/start-hbase.sh:
{java -cp ./hbase-examples/target/hbase-examples-2.0.0-SNAPSHOT.jar:`./bin/hbase classpath` org.apache.hadoop.hbase.thrift.DemoClient localhost 9090}
{java -cp ./hbase-examples/target/hbase-examples-4.0.0-alpha-1-SNAPSHOT.jar:`./bin/hbase classpath` org.apache.hadoop.hbase.thrift.DemoClient localhost 9090}

* Ruby: hbase-examples/src/main/ruby/DemoClient.rb
1. Modify the import path in the file to point to {$THRIFT_HOME}/lib/rb/lib.
@@ -64,7 +64,7 @@ Example code.

* CPP: hbase-examples/src/main/cpp/DemoClient.cpp
1. Make sure you have Thrift C++ libraries; modify Makefile if necessary.
The recent (0.14.1 as of this writing) version of Thrift can be downloaded from http://thrift.apache.org/download/.
The recent (0.14.1 as of this writing) version of Thrift can be downloaded from https://thrift.apache.org/download/.
2. Execute {make}.
3. Execute {./DemoClient <host> <port>}.
