feat(Sources): Generate Synthetic Git Repo #17
Tonisal-byte wants to merge 16 commits into microsoft:main from
Pull request overview
This PR refactors component source preparation to support generating a synthetic Git history by preserving upstream .git directories when overlays are applied, and by deriving synthetic commits from project-repo commits annotated with Affects: <component-name>.
Changes:
- Added functional options to `FetchComponent`/`GetComponent` to optionally preserve upstream `.git` directories.
- Refactored overlay application into ordered collection + application, followed by optional synthetic history generation.
- Introduced `synthistory.go` (and tests) to find `Affects:` commits and create synthetic commits in the upstream repo.
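
As a rough illustration of the functional-options shape described in the list above, here is a minimal sketch. Only `FetchComponent` and `FetchComponentOption` are names taken from this PR; `fetchOptions`, `WithPreserveGitDir`, and `resolveFetchOptions` are hypothetical names chosen for the example, and the real signatures may differ.

```go
package main

import "fmt"

// fetchOptions holds the resolved settings for one fetch.
// Hypothetical type: the PR's actual field names are not shown on this page.
type fetchOptions struct {
	preserveGitDir bool
}

// FetchComponentOption mutates fetchOptions; callers pass zero or more of them.
type FetchComponentOption func(*fetchOptions)

// WithPreserveGitDir is a hypothetical option that asks the provider to keep
// the upstream .git directory instead of deleting it after the fetch.
func WithPreserveGitDir() FetchComponentOption {
	return func(o *fetchOptions) { o.preserveGitDir = true }
}

// resolveFetchOptions applies the options in order on top of the defaults.
func resolveFetchOptions(opts ...FetchComponentOption) fetchOptions {
	var resolved fetchOptions
	for _, opt := range opts {
		if opt != nil {
			opt(&resolved)
		}
	}
	return resolved
}

// FetchComponent stands in for the real method and only reports what it would do.
func FetchComponent(name string, opts ...FetchComponentOption) string {
	if resolveFetchOptions(opts...).preserveGitDir {
		return name + ": fetched, .git kept"
	}
	return name + ": fetched, .git removed"
}

func main() {
	fmt.Println(FetchComponent("curl"))
	fmt.Println(FetchComponent("curl", WithPreserveGitDir()))
}
```

Because the options are variadic, existing call sites that pass none keep compiling, which is presumably why providers such as the RPM one can accept and ignore them.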
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| internal/providers/sourceproviders/sourceproviders_test/sourcemanager_mocks.go | Updates gomock stubs for the new variadic FetchComponent options. |
| internal/providers/sourceproviders/sourcemanager.go | Adds FetchComponentOption + wires options through SourceManager and providers. |
| internal/providers/sourceproviders/rpmcontentsprovider.go | Adapts RPM provider to the new GetComponent(...opts) signature (opts ignored). |
| internal/providers/sourceproviders/fedorasourceprovider.go | Implements .git preservation behavior based on resolved fetch options. |
| internal/projectconfig/configfile.go | Exposes config file path/dir via accessors needed for repo discovery. |
| internal/app/azldev/core/sources/synthistory.go | Adds synthetic history discovery/creation logic (Affects parsing, repo dirty check, commit creation). |
| internal/app/azldev/core/sources/synthistory_test.go | Adds unit tests for affects discovery, dirty detection, and synthetic commit creation. |
| internal/app/azldev/core/sources/sourceprep.go | Refactors overlay application flow and integrates synthetic history generation. |
| internal/app/azldev/core/sources/sourceprep_test.go | Updates tests/mocks for new FetchComponent(...opts) call shape. |
| internal/app/azldev/core/componentbuilder/componentbuilder_test.go | Updates builder tests for new variadic FetchComponent signature. |
| go.mod | Adds go-git dependencies and bumps indirect deps (otel/proto/protobuf/etc.). |
| go.sum | Records updated module sums for added/bumped dependencies. |
todo: Also, after this is merged, we'll need to work on getting the release value correct. For %autorelease (and %autochangelog) pkgs it should 'just work', but for pkgs that still set the release number manually, this will have to increment it while building the dist-git repo commits (and add changelog entries). We'll also need to settle on how to handle intermediate/uncommitted package release values (i.e. if using the

thought: we'll need some way to integrate this functionality with packages where the
or maybe
Pull request overview
This PR refactors component source preparation to support generating a synthetic git history that layers `Affects: <component-name>` commits from the project repo on top of upstream sources, and adds a `--no-git` escape hatch for workflows that run outside a git checkout.
Changes:
- Introduces synthetic history utilities (`FindAffectsCommits`, `CommitSyntheticHistory`) and integrates them into source preparation.
- Extends `SourceManager` / upstream providers to optionally preserve the upstream `.git` directory via functional options.
- Adds `--no-git` to build/diff/prepare-sources commands and updates scenario artifacts + generated CLI docs.
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| scenario/internal/buildtest/buildtest.go | Scenario helper updated to pass --no-git during component build. |
| scenario/snapshots/TestMCPServerMode_1.snap.json | Snapshot updated for the new no-git flag. |
| internal/providers/sourceproviders/sourceproviders_test/sourcemanager_mocks.go | Mock updated for variadic FetchComponentOption. |
| internal/providers/sourceproviders/sourcemanager.go | Adds FetchComponentOption (e.g., preserve .git) and threads options through fetch paths. |
| internal/providers/sourceproviders/rpmcontentsprovider.go | Accepts fetch options (ignored) to satisfy interface. |
| internal/providers/sourceproviders/fedorasourceprovider.go | Preserves upstream .git directory when requested. |
| internal/projectconfig/configfile.go | Exposes config file source path/dir accessors used by synthetic history logic. |
| internal/app/azldev/core/sources/synthistory.go | New synthetic history implementation and project-repo discovery helpers. |
| internal/app/azldev/core/sources/synthistory_test.go | Tests for synthetic history helpers. |
| internal/app/azldev/core/sources/sourceprep_test.go | Updates mocks/signatures to match new interfaces. |
| internal/app/azldev/core/sources/sourceprep.go | Refactors overlay flow and integrates synthetic history generation + --no-git option. |
| internal/app/azldev/core/componentbuilder/componentbuilder_test.go | Updates mock FetchComponent signature usage. |
| internal/app/azldev/cmds/component/preparesources.go | Adds --no-git flag wiring into the source preparer. |
| internal/app/azldev/cmds/component/diffsources.go | Adds --no-git flag wiring into the source preparer. |
| internal/app/azldev/cmds/component/build.go | Adds --no-git flag wiring into the source preparer. |
| go.mod / go.sum | Adds go-git/go-billy and updates transitive dependencies. |
| docs/user/reference/cli/azldev_component_prepare-sources.md | Generated CLI docs updated for --no-git. |
| docs/user/reference/cli/azldev_component_diff-sources.md | Generated CLI docs updated for --no-git. |
| docs/user/reference/cli/azldev_component_build.md | Generated CLI docs updated for --no-git. |

Focusing this PR on just generating dist-git repos correctly. Will create a follow-up PR that addresses the scenarios in which we would like to include dev changes with

    Author: &object.Signature{
        Name:  "azldev",
        Email: "azldev@microsoft.com",
        When:  time.Now().UTC(),

initSourcesRepo uses time.Now().UTC() for the initial commit timestamp. Because git commit hashes include timestamps, this makes the synthetic repo history non-deterministic across runs, which can undermine reproducibility and any downstream logic that keys off commit IDs. Consider using a deterministic timestamp (e.g., Unix epoch, or a timestamp derived from the upstream commit / first overlay commit) for the initial synthetic commit.

Suggested change:

    - When: time.Now().UTC(),
    + // Use a deterministic timestamp so the initial synthetic commit hash is reproducible.
    + When: time.Unix(0, 0).UTC(),

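
To make the reproducibility concern concrete, the sketch below computes a git commit object ID by hand (SHA-1 over a `commit <size>\0` header plus the commit body) using only the standard library. It is an illustration of why the timestamp feeds into the hash, not the code path this PR uses (the PR relies on go-git). The helper names, the commit message, and the use of the well-known empty-tree hash are assumptions made for the example.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"time"
)

// emptyTree is git's well-known SHA-1 for an empty tree, used here as a fixed tree ID.
const emptyTree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

// commitHash returns the SHA-1 object ID of a parentless commit with the given
// fields. Author and committer share one signature, and the zone is fixed to UTC.
func commitHash(tree, name, email, message string, when time.Time) string {
	sig := fmt.Sprintf("%s <%s> %d +0000", name, email, when.Unix())
	body := fmt.Sprintf("tree %s\nauthor %s\ncommitter %s\n\n%s\n", tree, sig, sig, message)
	object := fmt.Sprintf("commit %d\x00%s", len(body), body)
	return fmt.Sprintf("%x", sha1.Sum([]byte(object)))
}

// hashAtUnix is a hypothetical helper: same commit content, varying only the timestamp.
func hashAtUnix(seconds int64) string {
	when := time.Unix(seconds, 0).UTC()
	return commitHash(emptyTree, "azldev", "azldev@microsoft.com", "Initial sources", when)
}

func main() {
	// A fixed timestamp yields the same ID on every run.
	fmt.Println("epoch:", hashAtUnix(0))
	fmt.Println("epoch:", hashAtUnix(0))
	// A wall-clock timestamp changes the ID whenever the second changes.
	fmt.Println("now:  ", hashAtUnix(time.Now().Unix()))
}
```

Any fixed value works for determinism; deriving the timestamp from the upstream commit, as the comment suggests, would additionally keep the synthetic history ordered sensibly after the upstream commits.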

      sourceManager, err := sourceproviders.NewSourceManager(env, distro)
      if err != nil {
          return nil, fmt.Errorf("failed to create source manager:\n%w", err)
      }

    - preparer, err := sources.NewPreparer(sourceManager, env.FS(), env, env)
    + var preparerOpts []sources.PreparerOption
    + if options.NoGitRepo {
    +     preparerOpts = append(preparerOpts, sources.WithNoGitRepo())
    + }
    +
    + preparer, err := sources.NewPreparer(sourceManager, env.FS(), env, env, preparerOpts...)
      if err != nil {

--no-git / NoGitRepo is wired into sources.NewPreparer(...), but DiffSources never calls trySyntheticHistory (it only fetches sources and applies overlays). As a result, this flag currently has no behavioral effect for component diff-sources, which is confusing and makes the CLI surface area larger than necessary. Either remove the flag from diff-sources, or have diff-sources exercise the same synthetic-history path (with appropriate .git diff exclusions).

    // FindAffectsCommits walks the git log from HEAD and returns metadata for all commits
    // whose message contains "Affects: <componentName>". Results are sorted chronologically
    // (oldest first).
    func FindAffectsCommits(repo *gogit.Repository, componentName string) ([]CommitMetadata, error) {

This PR introduces a new user workflow where project commits must include an Affects: <component-name> marker to be included in synthetic history. There doesn’t appear to be user-guide documentation explaining this convention (what it does, exact matching rules, and examples). Please add a short section under docs/user/ (likely a how-to or explanation page related to overlays/source preparation) describing how to use the Affects: marker and how it interacts with synthetic history generation.

        var matches []CommitMetadata

        re := regexp.MustCompile(affectsRegexPattern + regexp.QuoteMeta(componentName) + `\b`)

FindAffectsCommits will match Affects: <component> as a prefix when the component name is followed by a non-word character (e.g. searching for curl matches a commit with Affects: curl-minimal because \b treats - as a boundary). This can cause commits intended for one component to be incorrectly applied to another. Consider requiring an exact component-name match (e.g., end-of-line or whitespace delimiter) rather than a word-boundary match.

Suggested change:

    - re := regexp.MustCompile(affectsRegexPattern + regexp.QuoteMeta(componentName) + `\b`)
    + re := regexp.MustCompile(affectsRegexPattern + regexp.QuoteMeta(componentName) + `(?:$|\s|[,;])`)

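
The difference between the two patterns can be checked with a small standalone program. The value of `affectsRegexPattern` is not shown on this page, so the sketch assumes a stand-in prefix of `Affects:\s*`; `looseMatcher` and `strictMatcher` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// affectsPrefix is an assumed stand-in for affectsRegexPattern.
const affectsPrefix = `Affects:\s*`

// looseMatcher mirrors the current code: the name must end at a word boundary.
func looseMatcher(component string) *regexp.Regexp {
	return regexp.MustCompile(affectsPrefix + regexp.QuoteMeta(component) + `\b`)
}

// strictMatcher mirrors the suggestion: the name must be followed by end of
// text, whitespace, or a list separator.
func strictMatcher(component string) *regexp.Regexp {
	return regexp.MustCompile(affectsPrefix + regexp.QuoteMeta(component) + `(?:$|\s|[,;])`)
}

func main() {
	messages := []string{
		"Affects: curl",
		"Affects: curl-minimal",
		"Affects: curl, wget",
		"Fix CVE\n\nAffects: curl\n",
	}
	loose, strict := looseMatcher("curl"), strictMatcher("curl")
	for _, msg := range messages {
		fmt.Printf("%-32q loose=%-5t strict=%t\n", msg, loose.MatchString(msg), strict.MatchString(msg))
	}
}
```

In Go's `regexp` syntax `\b` is an ASCII word boundary, and `-` is not a word character, which is why the loose pattern accepts `curl-minimal` when searching for `curl`.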

    // FindAffectsCommits walks the git log from HEAD and returns metadata for all commits
    // whose message contains "Affects: <componentName>". Results are sorted chronologically
    // (oldest first).
    func FindAffectsCommits(repo *gogit.Repository, componentName string) ([]CommitMetadata, error) {
        head, err := repo.Head()
        if err != nil {
            return nil, fmt.Errorf("failed to get HEAD reference:\n%w", err)
        }

        commitIter, err := repo.Log(&gogit.LogOptions{From: head.Hash()})
        if err != nil {
            return nil, fmt.Errorf("failed to iterate commit log:\n%w", err)
        }

FindAffectsCommits walks the full project git history every time sources are prepared for a component. In multi-component builds, this becomes O(components × project-commits) work and can get expensive on large configuration repos. Consider caching the parsed "Affects" commit list per project repo (or pre-indexing once per run) and reusing it across components.
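
One possible shape for the pre-indexing idea is sketched below: scan the history once, bucket commits by the component named on each `Affects:` line, and answer per-component queries with a map lookup. `Commit`, `AffectsIndex`, and `BuildAffectsIndex` are hypothetical names, `Commit` is a simplified stand-in for the PR's `CommitMetadata`, and the sketch assumes one component per `Affects:` line and a newest-first input order.

```go
package main

import (
	"fmt"
	"regexp"
)

// Commit is a simplified, hypothetical stand-in for CommitMetadata.
type Commit struct {
	Hash    string
	Message string
}

// affectsLine captures the component named on an "Affects:" line.
// Assumption: one whitespace-free component name per line.
var affectsLine = regexp.MustCompile(`(?m)^Affects:[ \t]*(\S+)[ \t]*$`)

// AffectsIndex maps a component name to its commits, oldest first.
type AffectsIndex map[string][]Commit

// BuildAffectsIndex scans the history once. The input is assumed to be
// newest first, so it is walked in reverse to produce oldest-first slices.
func BuildAffectsIndex(history []Commit) AffectsIndex {
	index := make(AffectsIndex)
	for i := len(history) - 1; i >= 0; i-- {
		commit := history[i]
		seen := make(map[string]bool) // avoid duplicates within one commit
		for _, match := range affectsLine.FindAllStringSubmatch(commit.Message, -1) {
			name := match[1]
			if !seen[name] {
				seen[name] = true
				index[name] = append(index[name], commit)
			}
		}
	}
	return index
}

func main() {
	history := []Commit{
		{Hash: "c3", Message: "Bump release\n\nAffects: curl"},
		{Hash: "c2", Message: "Add patch\n\nAffects: curl-minimal"},
		{Hash: "c1", Message: "Initial overlays\n\nAffects: curl\nAffects: wget"},
	}
	index := BuildAffectsIndex(history)
	for _, name := range []string{"curl", "curl-minimal", "wget", "openssl"} {
		fmt.Println(name, len(index[name]))
	}
}
```

With an index like this, the cost becomes one pass over the project history per run plus a lookup per component, rather than one full walk per component. Capturing the whole name also sidesteps the prefix-matching problem, since `curl` and `curl-minimal` land in different buckets.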

    // applyOverlaysToSources writes the macros file and then applies all overlays and
    // records synthetic git history.
    func (p *sourcePreparerImpl) applyOverlaysToSources(

The comment on applyOverlaysToSources says it "records synthetic git history", but this helper currently only writes the macros file and applies overlays; synthetic history is generated separately in PrepareSources via trySyntheticHistory. Update the comment (or move the history generation here) so callers don’t assume history is being created.
These changes enable the automatic generation of a synthetic Git repository that merges a component's upstream Git history with additional commits layered from the Azure Linux configuration repository. A commit in the configuration repository that pertains to a component must carry an explicit `Affects: <component-name>` marker in its message to be included in the generated commit history for that component.

This pull request refactors and enhances the source preparation logic for components, focusing on overlay application and synthetic git history generation. The changes improve the modularity, clarity, and reliability of overlay handling, and introduce better support for preserving git history when overlays are applied. Additionally, the pull request updates dependencies and adapts tests to the new interfaces.

Core logic refactoring and enhancements:
- Refactored `sourceprep.go` by splitting it into smaller, focused methods: overlays are now collected and applied in a defined order, and synthetic git history is generated in a dedicated step. Overlay application is now decoupled from git history generation, ensuring overlays are always applied, even if no git repository is present.
- Preserves the upstream `.git` directory when overlays are applied, enabling synthetic history generation for release numbering and delta builds.
- Removed the `postProcessSources` method, replacing it with modular helpers for overlay collection, application, and spec path resolution.

Test and interface updates:
- Updated tests and mocks for the `FetchComponent` interface, which now accepts variadic options to support features like `.git` directory preservation.