From 80c71ac69ca966d8ec478f5d708e76869a19a74b Mon Sep 17 00:00:00 2001 From: Daniel Szoke Date: Fri, 22 May 2026 13:59:48 +0200 Subject: [PATCH 1/3] meta(commit): Add commit message helpers Add a `commit-msg` `pre-commit` hook that expands GitHub issue footers into markdown links and appends a matching Linear footer when GitHub comments include a linked Linear issue. Running this as a hook ensures footer expansion happens for both agent-written and manually-written commit messages. Add a commit agent skill that fetches Sentry commit guidelines before creating or amending commits and documents the footer format expected by the hook. I have been using a similar [commit skill](https://github.com/szokeasaurusrex/pi-agent/blob/85d169d0022f31c46cf7821e2f176bd739b1b2a2/skills/commit/SKILL.md) locally; this checks it into this repo with modifications for the new `pre-commit` hook. ### Examples A footer for [getsentry/sentry-rust#1](https://github.com/getsentry/sentry-rust/issues/1): ``` References #1 ``` becomes: ``` References [#1](https://github.com/getsentry/sentry-rust/issues/1) ``` A footer for [getsentry/sentry-rust#1130](https://github.com/getsentry/sentry-rust/issues/1130), which has a Linear linkback to [RUST-216](https://linear.app/getsentry/issue/RUST-216): ``` References #1130 ``` becomes: ``` References [#1130](https://github.com/getsentry/sentry-rust/issues/1130) References [RUST-216](https://linear.app/getsentry/issue/RUST-216) ``` --- .agents/skills/commit/SKILL.md | 55 ++++ .../commit/scripts/fetch-commit-guidelines.sh | 5 + .pre-commit-config.yaml | 8 + scripts/commit-msg-expand-issues.py | 244 ++++++++++++++++++ 4 files changed, 312 insertions(+) create mode 100644 .agents/skills/commit/SKILL.md create mode 100755 .agents/skills/commit/scripts/fetch-commit-guidelines.sh create mode 100644 .pre-commit-config.yaml create mode 100755 scripts/commit-msg-expand-issues.py diff --git a/.agents/skills/commit/SKILL.md b/.agents/skills/commit/SKILL.md new file 
mode 100644 index 000000000..2e9da3ee1 --- /dev/null +++ b/.agents/skills/commit/SKILL.md @@ -0,0 +1,55 @@ +--- +name: commit +description: Use this skill when asked to create or amend a commit. +--- + +# Commit + +Use this skill whenever creating or amending a commit. + +## 1) Fetch and follow official commit guidelines + +Run: + +```bash +scripts/fetch-commit-guidelines.sh +``` + +Use that output as the source of truth for commit format/rules. + +**Exception:** Do not **manually wrap lines** or **enforce maximum line length**; ignore any instructions to the contrary. + +## 2) Write the commit body for maintainers + +Commit messages are reused as PR descriptions. Therefore, write commit messages keeping in mind that the primary audiences are human code reviewers and future maintainers. Optimize for skimmability while retaining sufficient context around changes, but do not repeat context that is easily inferred from the changes themselves, linked issues, or background information that maintainers with at least a basic familiarity with the codebase would possess. + +Some tips: +- include brief context for why the change is needed +- include why this approach was chosen (when relevant) +- include links to relevant sources/issues/docs when useful +- be concise, human, and specific +- assume reviewers will skim the linked issue; do not restate it in depth + +Commit messages use Markdown formatting. For example, use backticks for technical literals, inline links for URLs, and lists where useful. + +When committing, you should use heredoc format to preserve newlines and other formatting. + +## 3) Append Commit Footer + +If a commit is related to a GitHub issue, this must be noted in a footer. + +These footers must be placed on their own lines. The footer looks like the following: + +``` +[keyword] #[issue-id] +``` + +When the issue is in a different repo, use `[keyword] [repo]#[issue-id]` or, if the repo belongs to a different owner, `[keyword] [owner]/[repo]#[issue-id]`. 
+ +The keywords "Closes", "Fixes" and "Resolves" indicate that the commit fully addresses the issue. Merging a pull request containing such a commit will close the referenced issue. + +The keywords "References", "Related to", and "Contributes to" may be used to indicate a relation to the issue, when the issue is not fully addressed by the commit. The issue will not be auto-closed upon merge. + +One commit may contain zero or more footers; make sure all related issues you are aware of have a corresponding footer. + +A pre-commit hook will take care of linking Linear issues, where applicable. Do not manually add these links, or use any format other than what is described here. You need to follow this precise format so that the pre-commit hook can work properly. diff --git a/.agents/skills/commit/scripts/fetch-commit-guidelines.sh b/.agents/skills/commit/scripts/fetch-commit-guidelines.sh new file mode 100755 index 000000000..ccc2026f8 --- /dev/null +++ b/.agents/skills/commit/scripts/fetch-commit-guidelines.sh @@ -0,0 +1,5 @@ +#!/usr/bin/env bash +set -euo pipefail + +URL="https://develop.sentry.dev/engineering-practices/commit-messages.md" +curl -fsSL "$URL" diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml new file mode 100644 index 000000000..784969acd --- /dev/null +++ b/.pre-commit-config.yaml @@ -0,0 +1,8 @@ +repos: + - repo: local + hooks: + - id: expand-github-linear-footer + name: Expand GitHub/Linear commit footer + entry: scripts/commit-msg-expand-issues.py + language: script + stages: [commit-msg] diff --git a/scripts/commit-msg-expand-issues.py b/scripts/commit-msg-expand-issues.py new file mode 100755 index 000000000..abcc0db2d --- /dev/null +++ b/scripts/commit-msg-expand-issues.py @@ -0,0 +1,244 @@ +#!/usr/bin/env python3 +"""Expand GitHub issue commit footers and add Linear footers when available.""" + +from __future__ import annotations + +import json +import re +import subprocess +import sys +from dataclasses import dataclass +from 
pathlib import Path +from typing import Any + +FOOTER_RE = re.compile( + r"^(?P<prefix>\s*)(?P<keyword>\w+)\s+" + r"(?P<display>(?:(?P<owner>[A-Za-z0-9_.-]+)/)?(?:(?P<repo>[A-Za-z0-9_.-]+))?#(?P<issue>[1-9][0-9]*))" + r"(?P<suffix>\s*)$" +) +LINEAR_LINKBACK_AUTHORS = {"linear", "linear-code"} +LINEAR_LINKBACK_MARKERS = ("linear-linkback", "linear linkback") + +LINEAR_URL_RE = re.compile( + r"(?P<url>https://linear\.app/[^\s<>)\]\"']*/issue/(?P<id>[^/\s<>)\]\"']+)[^\s<>)\]\"']*)" +) + + +@dataclass(frozen=True) +class Match: + line_index: int + prefix: str + keyword: str + display: str + owner: str | None + repo: str | None + issue: str + suffix: str + + +@dataclass(frozen=True) +class IssueInfo: + url: str + linear_id: str | None = None + linear_url: str | None = None + + +def warn(message: str) -> None: + print(f"commit-msg-expand-issues: warning: {message}", file=sys.stderr) + + +def run_gh(args: list[str]) -> tuple[dict[str, Any] | None, str | None]: + try: + result = subprocess.run( + ["gh", *args], + check=False, + capture_output=True, + encoding="utf-8", + ) + except FileNotFoundError: + return None, "gh was not found" + except OSError as exc: + return None, f"failed to run gh: {exc}" + + if result.returncode != 0: + detail = (result.stderr or result.stdout).strip() + return None, detail or f"gh exited with status {result.returncode}" + + try: + return json.loads(result.stdout), None + except json.JSONDecodeError as exc: + return None, f"failed to parse gh output: {exc}" + + +def run_gh_text(args: list[str]) -> tuple[str | None, str | None]: + try: + result = subprocess.run( + ["gh", *args], + check=False, + capture_output=True, + encoding="utf-8", + ) + except FileNotFoundError: + return None, "gh was not found" + except OSError as exc: + return None, f"failed to run gh: {exc}" + + if result.returncode != 0: + detail = (result.stderr or result.stdout).strip() + return None, detail or f"gh exited with status {result.returncode}" + return result.stdout.strip(), None + + +def current_repo() -> tuple[str, str] | None: + 
name_with_owner, error = run_gh_text( + ["repo", "view", "--json", "nameWithOwner", "-q", ".nameWithOwner"] + ) + if error is not None: + warn(f"could not resolve current repository: {error}") + return None + + if not name_with_owner or "/" not in name_with_owner: + warn("could not resolve current repository: unexpected gh output") + return None + + owner, repo = name_with_owner.split("/", 1) + return owner, repo + + +def is_linear_linkback_comment(comment: dict[str, Any]) -> bool: + author = comment.get("author") + if not isinstance(author, dict) or author.get("login") not in LINEAR_LINKBACK_AUTHORS: + return False + + body = comment.get("body") + if not isinstance(body, str): + return False + + normalized_body = body.lower() + return any(marker in normalized_body for marker in LINEAR_LINKBACK_MARKERS) + + +def find_linear_link(issue: dict[str, Any]) -> tuple[str, str] | None: + comments = issue.get("comments") or [] + if not isinstance(comments, list): + return None + + for comment in comments: + if not isinstance(comment, dict) or not is_linear_linkback_comment(comment): + continue + + body = comment["body"] + url_match = LINEAR_URL_RE.search(body) + if not url_match: + continue + + return url_match.group("id"), url_match.group("url") + return None + + +def fetch_issue(owner_repo: str, issue_number: str) -> IssueInfo | None: + result, error = run_gh( + ["issue", "view", issue_number, "-R", owner_repo, "--json", "number,url,comments"] + ) + if error is not None: + warn(f"could not fetch {owner_repo}#{issue_number}: {error}") + return None + if not isinstance(result, dict) or not isinstance(result.get("url"), str): + warn(f"could not fetch {owner_repo}#{issue_number}: unexpected gh output") + return None + + linear = find_linear_link(result) + if linear is None: + return IssueInfo(url=result["url"]) + linear_id, linear_url = linear + return IssueInfo(url=result["url"], linear_id=linear_id, linear_url=linear_url) + + +def collect_matches(lines: list[str]) -> 
list[Match]: + matches: list[Match] = [] + for index, line in enumerate(lines): + stripped_newline = line.removesuffix("\n") + match = FOOTER_RE.match(stripped_newline) + if match is None: + continue + matches.append( + Match( + line_index=index, + prefix=match.group("prefix"), + keyword=match.group("keyword"), + display=match.group("display"), + owner=match.group("owner"), + repo=match.group("repo"), + issue=match.group("issue"), + suffix=match.group("suffix"), + ) + ) + return matches + + +def resolve_owner_repo(match: Match, current_owner: str, current_repo_name: str) -> str: + if match.owner is not None and match.repo is not None: + return f"{match.owner}/{match.repo}" + if match.repo is not None: + return f"{current_owner}/{match.repo}" + return f"{current_owner}/{current_repo_name}" + + +def process_message(path: Path) -> None: + try: + lines = path.read_text(encoding="utf-8").splitlines(keepends=True) + except OSError as exc: + warn(f"could not read commit message: {exc}") + return + + matches = collect_matches(lines) + if not matches: + return + + repo = current_repo() + if repo is None: + return + current_owner, current_repo_name = repo + + issue_cache: dict[tuple[str, str], IssueInfo | None] = {} + replacements: dict[int, str] = {} + + for match in matches: + owner_repo = resolve_owner_repo(match, current_owner, current_repo_name) + key = (owner_repo, match.issue) + if key not in issue_cache: + issue_cache[key] = fetch_issue(owner_repo, match.issue) + + issue = issue_cache[key] + if issue is None: + continue + + replacement = f"{match.prefix}{match.keyword} [{match.display}]({issue.url}){match.suffix}\n" + if issue.linear_id is not None and issue.linear_url is not None: + next_line = lines[match.line_index + 1] if match.line_index + 1 < len(lines) else "" + linear_line = f"{match.prefix}{match.keyword} [{issue.linear_id}]({issue.linear_url})\n" + if next_line != linear_line: + replacement += linear_line + replacements[match.line_index] = replacement + + 
if not replacements: + return + + new_lines = [replacements.get(index, line) for index, line in enumerate(lines)] + try: + path.write_text("".join(new_lines), encoding="utf-8") + except OSError as exc: + warn(f"could not write commit message: {exc}") + + +def main(argv: list[str]) -> int: + if len(argv) != 2: + warn("expected exactly one commit message file path") + return 0 + + process_message(Path(argv[1])) + return 0 + + +if __name__ == "__main__": + raise SystemExit(main(sys.argv)) From 9ba3dd370c81c6366007f2e73f9a3fc88d9118db Mon Sep 17 00:00:00 2001 From: Daniel Szoke Date: Fri, 22 May 2026 14:08:04 +0200 Subject: [PATCH 2/3] feat(client-reports): Client report protocol Added WIP client report protocol. Resolves [#1001](https://github.com/getsentry/sentry-rust/issues/1001) Resolves [RUST-153](https://linear.app/getsentry/issue/RUST-153/add-client-report-protocol-envelope-item-support-in-sentry-types) --- .../src/protocol/client_report/list.rs | 66 +++++++++++++++++++ .../src/protocol/client_report/mod.rs | 60 +++++++++++++++++ sentry-types/src/protocol/mod.rs | 1 + sentry-types/src/protocol/v7.rs | 1 + 4 files changed, 128 insertions(+) create mode 100644 sentry-types/src/protocol/client_report/list.rs create mode 100644 sentry-types/src/protocol/client_report/mod.rs diff --git a/sentry-types/src/protocol/client_report/list.rs b/sentry-types/src/protocol/client_report/list.rs new file mode 100644 index 000000000..a8ca3112b --- /dev/null +++ b/sentry-types/src/protocol/client_report/list.rs @@ -0,0 +1,66 @@ +//! Module with code for representing the underlying list of client reports. + +use std::collections::HashMap; + +use serde::ser::SerializeSeq as _; +use serde::{Serialize, Serializer}; + +use super::{DataCategory, DiscardReason}; + +#[derive(Debug)] +pub(super) struct ClientReportList(HashMap<ReasonCategory, u64>); + +#[derive(Debug, Serialize)] +struct ClientReportItem { + #[serde(flatten)] + reason_category: ReasonCategory, + quantity: u64, +} + +/// A reason/category pair. 
Used to key the discarded events. +#[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] +struct ReasonCategory { + reason: DiscardReason, + category: DataCategory, +} + +impl ClientReportList { + /// Insert an item into the list. + /// + /// Records `quantity` discarded events in the given data `category` for the given discard + /// `reason`. If there is already a record for that (`category`, `reason`) pair, we increment + /// the quantity of the existing pair, accordingly. + pub(super) fn add(&mut self, category: DataCategory, reason: DiscardReason, quantity: u64) { + let reason_category = ReasonCategory { category, reason }; + let val = self.0.entry(reason_category).or_default(); + *val = val.saturating_add(quantity); + } + + fn iter(&self) -> impl Iterator<Item = ClientReportItem> + '_ { + self.0 + .iter() + .map(|(&reason_category, &quantity)| ClientReportItem { + reason_category, + quantity, + }) + } + + fn len(&self) -> usize { + self.0.len() + } +} + +impl Serialize for ClientReportList { + fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> + where + S: Serializer, + { + let seq = serializer.serialize_seq(Some(self.len()))?; + + self.iter() + .try_fold(seq, |mut seq, item| { + seq.serialize_element(&item).map(|()| seq) + })? + .end() + } +} diff --git a/sentry-types/src/protocol/client_report/mod.rs b/sentry-types/src/protocol/client_report/mod.rs new file mode 100644 index 000000000..737e66386 --- /dev/null +++ b/sentry-types/src/protocol/client_report/mod.rs @@ -0,0 +1,60 @@ +//! Module containing types related to [Client Reports]. +//! +//! [Client Reports]: https://develop.sentry.dev/sdk/telemetry/client-reports/ + +use std::time::SystemTime; + +use serde::Serialize; + +use self::list::ClientReportList; +use crate::utils; + +mod list; + +/// A [client report]. 
+/// +/// [client report]: https://develop.sentry.dev/sdk/telemetry/client-reports/ +#[derive(Debug, Serialize)] +pub struct ClientReport { + #[serde(with = "utils::ts_seconds_float")] + timestamp: SystemTime, + discarded_events: ClientReportList, +} + +impl ClientReport { + /// Insert an item into the `discarded_events` list. + /// + /// Records `quantity` discarded events in the given data `category` for the given discard + /// `reason`. If there is already a record for that (`category`, `reason`) pair, we increment + /// the quantity of the existing pair, accordingly. + pub fn add_discarded_event( + &mut self, + category: DataCategory, + reason: DiscardReason, + quantity: u64, + ) { + self.discarded_events.add(category, reason, quantity); + } +} + +/// The reason why a telemetry item was discarded. +/// +/// Valid discard reasons are listed in the [develop docs]; this enum may only define a subset of +/// these data categories, but we will add further categories as we begin using them in the SDK. +/// +/// [develop docs]: https://develop.sentry.dev/sdk/telemetry/client-reports/#discard-reasons-1 +#[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] +#[serde(rename_all = "snake_case")] +#[non_exhaustive] +pub enum DiscardReason {} + +/// The category of data which was dropped. +/// +/// Valid categories are listed in the [develop docs]; this enum may only define a subset of these +/// valid data categories, but we will add further categories as we begin using them in the SDK. 
+/// +/// [develop docs]: https://develop.sentry.dev/sdk/foundations/transport/rate-limiting/#definitions +#[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] +#[serde(rename_all = "snake_case")] +#[non_exhaustive] +pub enum DataCategory {} diff --git a/sentry-types/src/protocol/mod.rs b/sentry-types/src/protocol/mod.rs index ec80a4db6..7639466c1 100644 --- a/sentry-types/src/protocol/mod.rs +++ b/sentry-types/src/protocol/mod.rs @@ -14,6 +14,7 @@ pub const LATEST: u16 = 7; pub use v7 as latest; mod attachment; +mod client_report; mod envelope; mod monitor; mod session; diff --git a/sentry-types/src/protocol/v7.rs b/sentry-types/src/protocol/v7.rs index c952ee2be..4d1099a52 100644 --- a/sentry-types/src/protocol/v7.rs +++ b/sentry-types/src/protocol/v7.rs @@ -25,6 +25,7 @@ pub use uuid::Uuid; use crate::utils::{display_from_str_opt, ts_rfc3339_opt, ts_seconds_float}; pub use super::attachment::*; +pub use super::client_report::ClientReport; pub use super::envelope::*; pub use super::monitor::*; pub use super::session::*; From 4b691e4c58280e62f2cf4d0fec16765873d7fa73 Mon Sep 17 00:00:00 2001 From: Daniel Szoke Date: Tue, 26 May 2026 18:47:49 +0200 Subject: [PATCH 3/3] fixup! feat(client-reports): Client report protocol --- sentry-types/src/macros.rs | 65 +++++++++++++++++++ .../src/protocol/client_report/mod.rs | 42 ++++++------ 2 files changed, 87 insertions(+), 20 deletions(-) diff --git a/sentry-types/src/macros.rs b/sentry-types/src/macros.rs index d59c4825d..1c0e1c209 100644 --- a/sentry-types/src/macros.rs +++ b/sentry-types/src/macros.rs @@ -246,3 +246,68 @@ mod hex_tests { ); } } + +/// A macro which can wrap any number of enum definitions to make them "indexed." +/// +/// Specifically, the macro adds an implementation to each enum, which contains the following: +/// - `const VARIANT_COUNT`: the total number of variants in the enum. +/// - `const fn as_index(&self)`: the unique zero-based index of the enum variant. 
+/// Both of these items have the same visibility as the enum itself. +/// +/// This is super useful, for example, if you want to store something for each variant. Rather +/// than using a `HashMap`, it is possible to allocate a fixed-length array of length +/// `VARIANT_COUNT`, indexed by `as_index`. +macro_rules! indexed_enum { + () => {}; + + { + $(#[$meta:meta])* + $vis:vis enum $name:ident { + $( + $(#[$variant_meta:meta])* + $variant:ident + ),* $(,)? + } + $($rest:tt)* + } => { + $(#[$meta])* + $vis enum $name { + $( + $(#[$variant_meta])* + $variant, + )* + } + + impl $name { + /// The number of variants in this enum. + $vis const VARIANT_COUNT: usize = indexed_enum!(@count $($variant),*); + + /// Returns this variant's unique zero-based index. + /// + /// The index satisfies `0 <= self.as_index() < Self::VARIANT_COUNT`. + $vis const fn as_index(&self) -> usize { + indexed_enum!(@match_stmt *self; [] 0usize; $($variant),*) + } + } + + indexed_enum! { + $($rest)* + } + }; + + (@match_stmt $value:expr; [$($arms:tt)*] $idx:expr;) => { + match $value { + $($arms)* + } + }; + + (@match_stmt $value:expr; [$($arms:tt)*] $idx:expr; $variant:ident $(, $rest:ident)*) => { + indexed_enum!(@match_stmt $value; [$($arms)* Self::$variant => $idx,] $idx + 1usize; $($rest),*) + }; + + (@count) => { 0usize }; + + (@count $variant:ident $(, $rest:ident)*) => { + 1usize + indexed_enum!(@count $($rest),*) + }; +} diff --git a/sentry-types/src/protocol/client_report/mod.rs b/sentry-types/src/protocol/client_report/mod.rs index 737e66386..50b54821f 100644 --- a/sentry-types/src/protocol/client_report/mod.rs +++ b/sentry-types/src/protocol/client_report/mod.rs @@ -37,24 +37,26 @@ impl ClientReport { } } -/// The reason why a telemetry item was discarded. -/// -/// Valid discard reasons are listed in the [develop docs]; this enum may only define a subset of -/// these data categories, but we will add further categories as we begin using them in the SDK. 
-/// -/// [develop docs]: https://develop.sentry.dev/sdk/telemetry/client-reports/#discard-reasons-1 -#[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] -#[serde(rename_all = "snake_case")] -#[non_exhaustive] -pub enum DiscardReason {} +indexed_enum! { + /// The reason why a telemetry item was discarded. + /// + /// Valid discard reasons are listed in the [develop docs]; this enum may only define a subset of + /// these discard reasons, but we will add further reasons as we begin using them in the SDK. + /// + /// [develop docs]: https://develop.sentry.dev/sdk/telemetry/client-reports/#discard-reasons-1 + #[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] + #[serde(rename_all = "snake_case")] + #[non_exhaustive] + pub enum DiscardReason {} -/// The category of data which was dropped. -/// -/// Valid categories are listed in the [develop docs]; this enum may only define a subset of these -/// valid data categories, but we will add further categories as we begin using them in the SDK. -/// -/// [develop docs]: https://develop.sentry.dev/sdk/foundations/transport/rate-limiting/#definitions -#[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] -#[serde(rename_all = "snake_case")] -#[non_exhaustive] -pub enum DataCategory {} + /// The category of data which was dropped. + /// + /// Valid categories are listed in the [develop docs]; this enum may only define a subset of these + /// valid data categories, but we will add further categories as we begin using them in the SDK. + /// + /// [develop docs]: https://develop.sentry.dev/sdk/foundations/transport/rate-limiting/#definitions + #[derive(Debug, Serialize, PartialEq, Eq, Hash, Clone, Copy)] + #[serde(rename_all = "snake_case")] + #[non_exhaustive] + pub enum DataCategory {} +}
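
Note on `indexed_enum!` usage: the macro docs in PATCH 3/3 say that `VARIANT_COUNT` and `as_index` allow a fixed-length array to stand in for a `HashMap`. The following is only a sketch of that idea, not code from this series. `Category`, its variants, and `Counters` are hypothetical names, and the `impl Category` block is written out by hand to mirror what the macro is documented to generate, so that the snippet is self-contained.

```rust
// Hypothetical enum; the real `DataCategory` in this series has no variants yet.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Error,
    Transaction,
    Session,
}

// Hand-written stand-in for what `indexed_enum!` is documented to generate.
impl Category {
    /// The number of variants in this enum.
    pub const VARIANT_COUNT: usize = 3;

    /// Returns this variant's unique zero-based index.
    pub const fn as_index(&self) -> usize {
        match *self {
            Self::Error => 0,
            Self::Transaction => 1,
            Self::Session => 2,
        }
    }
}

/// Per-variant counters kept in a fixed-length array rather than a `HashMap`.
#[derive(Debug)]
pub struct Counters([u64; Category::VARIANT_COUNT]);

impl Counters {
    /// Creates a counter set with every slot at zero.
    pub fn new() -> Self {
        Self([0; Category::VARIANT_COUNT])
    }

    /// Adds `quantity` to the slot for `category`, saturating on overflow.
    pub fn add(&mut self, category: Category, quantity: u64) {
        let slot = &mut self.0[category.as_index()];
        *slot = slot.saturating_add(quantity);
    }

    /// Reads the current count for `category`.
    pub fn get(&self, category: Category) -> u64 {
        self.0[category.as_index()]
    }
}
```

Because `as_index` is always below `VARIANT_COUNT`, the array indexing cannot go out of bounds, and no hashing or allocation is needed. The saturating add mirrors `ClientReportList::add` from PATCH 2/3.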