Commit 22850ef

ci: reuse toolchain payloads in e2e
1 parent 184059b commit 22850ef

6 files changed

Lines changed: 339 additions & 20 deletions

Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
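# CI Toolchain Cache Optimization Analysis

> Date: 2026-06-01
> Status: draft
> Scope: repeated downloads of the toolchain registry and xpkg payloads in Linux CI/e2e, the isolation strategy for temporary `MCPP_HOME` directories, and a toolchain cache architecture that can be extended later.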
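
## 0. Conclusion

The current CI slowness is not simply an actions/cache hit-rate problem. The Linux workflow already caches `~/.mcpp` and `~/.xlings`, but some e2e scripts create a new `MCPP_HOME` to isolate configuration. If such a temporary home does not inherit the already-installed xpkg payloads, the `mcpp build` / first-run auto-install inside the test downloads the same large toolchain into the temporary directory, which is then deleted when the test ends.

The most typical trigger chain: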

```text
ci-linux
  -> persistent ~/.mcpp already has / will install gcc, musl-gcc
  -> tests/e2e/run_all.sh
    -> 29_toolchain_partial_versions.sh creates $TMP/h2
      -> first-run auto-install of gcc@15.1.0-musl into $TMP/h2
      -> trap deletes $TMP/h2
    -> 31_transitive_deps.sh then creates $TMP/mcpp-home
      -> installs/downloads gcc@15.1.0-musl again
  -> the later "Toolchain: musl-gcc" step may install it once more in the persistent ~/.mcpp
```
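
The reasonable fix is not to remove test isolation but to split isolation into two layers:

1. Config state isolation: each test can still have its own `config.toml`, lock/cache, and project state.
2. Toolchain payload reuse: the large, read-only `registry/data/xpkgs` is inherited from the persistent sandbox or the xlings cache.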

## 1. Evidence from the current state

### 1.1 The workflow already caches the persistent sandbox

- `.github/workflows/ci-linux.yml`: the `Cache mcpp sandbox` step caches `~/.mcpp` to keep payloads such as musl-gcc, binutils, glibc, linux-headers, patchelf, and ninja.
- The same workflow also caches `~/.xlings` to keep the packages that xlings itself installed.
- The E2E step sets `MCPP_HOME=/home/runner/.mcpp` and `MCPP_E2E_TOOLCHAIN_MIRROR=GLOBAL`.

So the right direction for CI is to let temporary homes inherit these two caches rather than cold-starting inside the temporary home.
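
### 1.2 Temporary homes in e2e have two meanings

Some scripts create a temporary `MCPP_HOME` to verify "empty config" or "fresh sandbox" behavior, for example:

- `14_toolchain_fallback.sh`: verifies the hard error when there is no toolchain and `MCPP_NO_AUTO_INSTALL=1`.
- `26_toolchain_management.sh`: explicitly verifies `toolchain install/list/default/remove`.
- `29_toolchain_partial_versions.sh`: the first section verifies partial version/default resolution; the second verifies first-run auto-install.

These tests must not copy the global `config.toml` directly, or the behavior under test would be masked. They can usually inherit `registry/data/xpkgs`, though, because whether a payload already exists should not change the semantics of "a default toolchain is set when the config is empty".

Other scripts create a temporary `MCPP_HOME` only to isolate BMIs, git/cache, or test artifacts, for example:

- `31_transitive_deps.sh`: the goal is to verify transitive-dependency include_dirs, not toolchain installation.
- LLVM / import std / BMI cache tests: the goal is compile behavior or cache behavior, not download behavior.

These tests should inherit the payload and the necessary configuration by default.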
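
### 1.3 The `_inherit_toolchain.sh` model is incomplete

The old helper gives priority to `$HOME/.mcpp/registry/data/xpkgs` and inherits only from there, whereas the capability detection in `run_all.sh` accepts both of these locations:

- `$HOME/.xlings/data/xpkgs/xim-x-musl-gcc/...`
- `$MCPP_HOME/registry/data/xpkgs/xim-x-musl-gcc/...`

As a result, capability detection can conclude that musl is available while the temporary `MCPP_HOME` helper has not inherited the `.xlings` payload, so the script may still take the download path when it actually builds.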
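
One way to keep capability detection and the inheritance helper from drifting apart is to have both consult the same list of payload roots. The sketch below illustrates that idea under assumptions: `xpkg_roots` and `has_payload` are hypothetical names, not functions taken from `run_all.sh` or `_inherit_toolchain.sh`, and the relative path to probe is left to the caller.

```bash
# Hypothetical helpers: a single list of payload roots shared by the
# capability check and the inheritance step, so the two cannot disagree.
xpkg_roots() {
    printf '%s\n' \
        "$HOME/.mcpp/registry/data/xpkgs" \
        "$HOME/.xlings/data/xpkgs"
}

# Succeeds if <relative path> exists under any known payload root.
# The brace group runs in a subshell (it is part of a pipeline), so
# `exit` only sets the status that has_payload returns.
has_payload() {
    xpkg_roots | {
        while IFS= read -r root; do
            if [ -e "$root/$1" ]; then
                exit 0
            fi
        done
        exit 1
    }
}
```

A runner could call `has_payload "<pkg>/<version>"` before deciding whether a test is runnable, and the helper could iterate over `xpkg_roots` when linking, so that "detected" always implies "inheritable".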
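
### 1.4 The slowness masked the real functional failure in 31

The failure chain in Linux PR CI shows that `31_transitive_deps.sh` first downloaded the 808 MB payload of `xim:musl-gcc@15.1.0` into the temporary home and only then failed at:

```text
child/ch/src/ch.cppm:2:10: fatal error: gc/gc.h: No such file or directory
```

So slow CI is not the only problem. Even with download reuse in place, 31 would still fail, because the propagation model for dependency include_dirs is also incomplete: dependency resolution appends a dep's include_dirs only to the root manifest, not to the consumer package that actually requested the dependency. In `top -> ch -> gc`, the package that needs `<gc/gc.h>` is `ch`, not the root `top`.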

## 2. Optimization strategy in this PR

### 2.1 Payload inheritance changes from "whole-directory symlink" to "per-package merge"

`tests/e2e/_inherit_toolchain.sh` now inherits xpkg payloads from the following sources:

- `$HOME/.mcpp/registry/data/xpkgs`
- `$HOME/.xlings/data/xpkgs`
- `$USERPROFILE/.xlings/data/xpkgs` under Windows/Git Bash

The destination inside the temporary home is:

```text
$MCPP_HOME/registry/data/xpkgs
```

A per-package merge is more robust than a whole-directory symlink because CI can have two payload sources at once: toolchains installed by mcpp itself live in `~/.mcpp`, and packages installed by the xlings bootstrap live in `~/.xlings`. Linking entry by entry merges both sides into the temporary home, whereas symlinking one root directory first would prevent the other source from contributing.
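
A minimal sketch of such a per-package merge is shown below. It is not the code in `_inherit_toolchain.sh`: `merge_xpkgs` is a hypothetical name, and linking at the `<package>/<version>` level with "first source wins" is an assumption made for illustration.

```bash
# Sketch only: `merge_xpkgs` is a hypothetical name, not the real helper.
# Usage: merge_xpkgs <dest> <source-root>...   (source roots must be absolute,
# because the symlinks store the source path verbatim)
merge_xpkgs() {
    local dest="$1" src pkg_dir ver_dir link
    shift
    mkdir -p "$dest"
    for src in "$@"; do
        [ -d "$src" ] || continue                  # a missing source is fine
        for pkg_dir in "$src"/*/; do
            [ -d "$pkg_dir" ] || continue          # glob matched nothing
            # A real directory per package lets several sources add versions.
            mkdir -p "$dest/$(basename "$pkg_dir")"
            for ver_dir in "$pkg_dir"*/; do
                [ -d "$ver_dir" ] || continue
                link="$dest/$(basename "$pkg_dir")/$(basename "$ver_dir")"
                # Earlier sources win; existing entries are never replaced.
                if [ ! -e "$link" ] && [ ! -L "$link" ]; then
                    ln -s "${ver_dir%/}" "$link"
                fi
            done
        done
    done
    return 0
}
```

With this shape, a call such as `merge_xpkgs "$MCPP_HOME/registry/data/xpkgs" "$HOME/.mcpp/registry/data/xpkgs" "$HOME/.xlings/data/xpkgs"` would let the second source fill in packages or versions that the first one lacks, which a single root-level symlink cannot do.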

### 2.2 First-run tests inherit only the payload, not the config

The second home in `29_toolchain_partial_versions.sh` still has:

```text
no config/default state
no inherited subos
```

It does, however, inherit xpkg payloads. This still verifies the user-facing semantics of first-run auto-install:

- The generated project has no `[toolchain]`.
- The first `mcpp build` prints First run.
- The default choice is `gcc@15.1.0-musl`.
- The default is persisted, and the second build no longer prints First run.

The difference is that the install stage can find the existing payload and no longer downloads the large archive into the temporary directory.
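
### 2.3 The transitive-dependency test no longer pays for a toolchain cold start

`31_transitive_deps.sh` aims to verify that:

- top declares only child.
- child declares grandchild itself.
- grandchild's `[build].include_dirs` reaches the child compile command.

This test should not download a toolchain. It now inherits payload-only and skips immediately when no reusable musl xpkg payload exists. The musl install/build path is covered by the workflow's dedicated toolchain step.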

### 2.4 Linux CI warms musl-gcc once

Before running `tests/e2e/run_all.sh`, the Linux e2e step executes:

```bash
"$MCPP" toolchain install gcc 15.1.0-musl
```

This serves two purposes:

1. Temporary-home tests such as 29 and 31 reuse the persistent payload through the helper.
2. The later `"Toolchain: musl-gcc — build mcpp (--target)"` step reuses the same installation.

With a cold cache, musl-gcc is downloaded at most once; with a warm cache, this command should quickly hit the local payload.

### 2.5 include_dirs propagate along dependency edges

Dependency resolution in `src/cli.cppm` now treats include_dirs as an edge property:

```text
consumer package -> dependency package
```

Each unique dependency is still resolved/scanned only once, but every consumer receives that dependency's public include dirs. As a result:

- When root depends directly on a header-providing package, root compile units can see the headers.
- When child depends on grandchild, child compile units can see grandchild headers.
- When the same dependency is reused by several packages, every edge receives the include dirs, and propagation is not skipped just because the resolved map already has a hit.

Compared with "append everything to root's global flags", this is closer to the long-term direction of package-owned build metadata, and it avoids the case where a transitive dependency's headers are visible only on root and not on the real consumer.
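
The per-edge rule can be illustrated with a toy model in shell. This is not the C++ in `src/cli.cppm`; `propagate_includes` and the two input file formats are hypothetical, chosen only to show that include dirs follow edges rather than unique packages.

```bash
# Toy model of per-edge propagation (hypothetical name and file formats).
#   $1: edges file, one "consumer dependency" pair per line
#   $2: includes file, one "package include_dir" pair per line
# Prints "consumer include_dir" for every edge, without duplicates.
propagate_includes() {
    while read -r consumer dep; do
        [ -n "$dep" ] || continue
        # Every edge looks up the dependency's public dirs again, even if
        # another consumer already triggered the same lookup.
        while read -r pkg dir; do
            if [ "$pkg" = "$dep" ]; then
                printf '%s %s\n' "$consumer" "$dir"
            fi
        done < "$2"
    done < "$1" | awk '!seen[$0]++'    # drop repeats, keep first-seen order
}
```

For the chain `top -> ch -> gc` where only `gc` exports an include dir, the output contains a line for `ch` and none for `top`, which is the behavior the failing test needed.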

## 3. General architecture recommendations

### 3.1 Split e2e homes into three modes

We suggest making the e2e helper API explicit in a follow-up:

```bash
source tests/e2e/_home.sh payload-only
source tests/e2e/_home.sh payload-and-config
source tests/e2e/_home.sh empty
```

Semantics:

| Mode | Inherits xpkgs | Inherits config | Use |
|---|---:|---:|---|
| `payload-only` | yes | no | first-run, empty-config, and install/default semantics tests |
| `payload-and-config` | yes | yes | ordinary compile, BMI, dependency, and import std tests |
| `empty` | no | no | tests that specifically verify cold start, error messages, and the install download path |

This way, test scripts no longer need to hand-write `MCPP_INHERIT_CONFIG=0 MCPP_INHERIT_SUBOS=0`, and future tests are less likely to reintroduce cold downloads.
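
As a rough illustration of the three modes, a helper could dispatch on the mode name as sketched below. Everything here is an assumption: `_home.sh` does not exist yet, `e2e_home_setup` and its argument order are invented for the example, and `config.toml` is assumed to sit at the top of the home directory.

```bash
# Hypothetical sketch of a mode-aware e2e home helper.
# Usage: e2e_home_setup <mode> <new-home> <persistent-home>
e2e_home_setup() {
    local mode="$1" home="$2" persistent="$3" pkg
    case "$mode" in
        empty|payload-only|payload-and-config) ;;
        *) echo "unknown e2e home mode: $mode" >&2; return 2 ;;
    esac
    mkdir -p "$home"
    if [ "$mode" = empty ]; then
        return 0                                   # cold start: inherit nothing
    fi

    # Payload: link each package directory instead of the whole xpkgs root.
    mkdir -p "$home/registry/data/xpkgs"
    for pkg in "$persistent"/registry/data/xpkgs/*/; do
        [ -d "$pkg" ] || continue
        pkg="${pkg%/}"
        [ -e "$home/registry/data/xpkgs/${pkg##*/}" ] \
            || ln -s "$pkg" "$home/registry/data/xpkgs/${pkg##*/}"
    done

    # Config: only payload-and-config copies the user's defaults.
    if [ "$mode" = payload-and-config ] && [ -f "$persistent/config.toml" ]; then
        cp "$persistent/config.toml" "$home/config.toml"
    fi
    return 0
}
```

A sourced `_home.sh` could then reduce to a single call such as `e2e_home_setup "$1" "$MCPP_HOME" "$HOME/.mcpp"`, keeping the mode decision in one place.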

### 3.2 Concentrate "download path tests" in a few dedicated jobs

Large toolchain downloads should appear only in these places:

1. `26_toolchain_management.sh`: CLI install/list/default/remove.
2. The toolchain matrix of the Linux workflow: GCC, musl-gcc, LLVM.
3. The fresh-install workflow: verifies the install experience of the release package in an empty environment.

All other e2e tests should reuse payloads by default. This also makes failures easier to locate:

- Download failure: look at the toolchain/fresh-install jobs.
- build/module/dependency failure: look at e2e.

### 3.3 Cache keys and install markers should distinguish payload from config

In the long term, we suggest splitting toolchain install state:

```text
registry/data/xpkgs/<pkg>/<version>/        # payload, content-addressable-ish
registry/toolchains/<name>@<version>.json   # mcpp view: compiler path, target, stdlib, source payload
config.toml                                 # user default and mirror
```

A temporary home can safely symlink/copy the payload without inheriting the default toolchain. When the payload already exists, `toolchain install` should only fill in mcpp's toolchain metadata and not download again.

### 3.4 CI observability

We suggest adding lightweight statistics to the e2e runner in a follow-up:

```text
downloads_before=<count>
downloads_after=<count>
toolchain_install_seconds=<duration>
```

A first version can simply grep the logs for:

- `Downloading xim:`
- `Downloading compat.`
- `Installing ...`

The goal is not precise accounting but to make it directly visible on a PR whether this e2e run triggered a cold toolchain download.
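
A grep-based counter along these lines might look like the sketch below. `count_downloads` is a hypothetical name, and it matches only the two complete `Downloading` patterns; the `Installing ...` pattern is left out because its full text is not given here.

```bash
# Count log lines that report a cold download (hypothetical helper).
# Usage: count_downloads <log-file>
count_downloads() {
    # grep -c prints 0 but exits non-zero when nothing matches; `|| true`
    # keeps the helper safe under `set -e`.
    grep -c -F -e 'Downloading xim:' -e 'Downloading compat.' "$1" || true
}
```

The runner could capture its output with `tee` and then print something like `echo "downloads_after=$(count_downloads "$log")"`, where `$log` is whatever file the run was captured to.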

## 4. Verification plan

This PR should verify at least:

1. `bash -n tests/e2e/_inherit_toolchain.sh tests/e2e/29_toolchain_partial_versions.sh tests/e2e/31_transitive_deps.sh`
2. The `29_toolchain_partial_versions.sh` log no longer shows a cold download of musl-gcc into the temporary home.
3. `31_transitive_deps.sh` passes when a reusable musl payload exists and skips, rather than downloads, when it does not.
4. Linux CI e2e and the later musl target step both pass.

.github/workflows/ci-linux.yml

Lines changed: 4 additions & 0 deletions
@@ -134,6 +134,10 @@ jobs:
           # default-toolchain path) gets a deterministic GNU answer
           # instead of whatever auto-install picks on a fresh sandbox.
           "$MCPP" toolchain default gcc@16.1.0
+          # Warm musl once in the persistent sandbox. Fresh-home e2e tests
+          # inherit this payload, and the later --target musl job reuses it
+          # instead of downloading a second copy into another home.
+          "$MCPP" toolchain install gcc 15.1.0-musl
           bash tests/e2e/run_all.sh

       - name: Save freshly-built mcpp for toolchain tests

src/cli.cppm

Lines changed: 69 additions & 10 deletions
@@ -1874,30 +1874,74 @@ prepare_build(bool print_fingerprint,
         return std::pair{effRoot, std::move(*manifest)};
     };

-    // Append a dep's [build].include_dirs onto the main manifest's, glob-
-    // expanded against the dep's root. Returns the absolute paths actually
-    // appended so the caller can later evict them on a SemVer-merge re-fetch.
-    auto propagateIncludeDirs = [&](const std::filesystem::path& depRoot,
-                                    const mcpp::manifest::Manifest& depManifest)
+    auto appendIncludeDirsTo = [&](mcpp::manifest::Manifest& target,
+                                   const std::filesystem::path& depRoot,
+                                   const mcpp::manifest::Manifest& depManifest)
         -> std::vector<std::filesystem::path>
     {
         std::vector<std::filesystem::path> added;
+        auto append_unique = [&](const std::filesystem::path& dir) {
+            auto& dirs = target.buildConfig.includeDirs;
+            if (std::find(dirs.begin(), dirs.end(), dir) != dirs.end()) return;
+            dirs.push_back(dir);
+            added.push_back(dir);
+        };
         for (auto& inc : depManifest.buildConfig.includeDirs) {
             if (inc.is_absolute()) {
-                m->buildConfig.includeDirs.push_back(inc);
-                added.push_back(inc);
+                append_unique(inc);
                 continue;
             }
             auto matches = mcpp::modgraph::expand_dir_glob(depRoot, inc.generic_string());
             if (matches.empty()) continue;
             for (auto& d : matches) {
-                m->buildConfig.includeDirs.push_back(d);
-                added.push_back(d);
+                append_unique(d);
             }
         }
         return added;
     };

+    auto syncMainPackageIncludes = [&] {
+        if (!packages.empty()) {
+            packages[0].manifest.buildConfig.includeDirs = m->buildConfig.includeDirs;
+        }
+    };
+
+    // Append a dep's [build].include_dirs onto the main manifest's, glob-
+    // expanded against the dep's root. Returns the absolute paths actually
+    // appended so the caller can later evict them on a SemVer-merge re-fetch.
+    auto propagateIncludeDirs = [&](const std::filesystem::path& depRoot,
+                                    const mcpp::manifest::Manifest& depManifest)
+        -> std::vector<std::filesystem::path>
+    {
+        auto added = appendIncludeDirsTo(*m, depRoot, depManifest);
+        syncMainPackageIncludes();
+        return added;
+    };
+
+    auto propagateIncludeDirsToConsumer =
+        [&](std::size_t consumerDepIndex,
+            const std::filesystem::path& depRoot,
+            const mcpp::manifest::Manifest& depManifest)
+    {
+        if (consumerDepIndex == kMainConsumer) {
+            (void)propagateIncludeDirs(depRoot, depManifest);
+            return;
+        }
+        if (consumerDepIndex >= dep_manifests.size()
+            || consumerDepIndex + 1 >= packages.size()) {
+            return;
+        }
+        auto added = appendIncludeDirsTo(*dep_manifests[consumerDepIndex],
+                                         depRoot, depManifest);
+        auto& packageManifest = packages[consumerDepIndex + 1].manifest;
+        for (auto const& dir : added) {
+            auto& dirs = packageManifest.buildConfig.includeDirs;
+            if (std::find(dirs.begin(), dirs.end(), dir) == dirs.end()) {
+                dirs.push_back(dir);
+            }
+        }
+    };
+
     // Drop earlier include_dirs that came from a now-superseded dep version.
     // Erases by value match — safe because the outer code only ever appends,
     // and on re-fetch we re-record the new entries afterwards.
@@ -1907,6 +1951,7 @@ prepare_build(bool print_fingerprint,
             auto pos = std::find(dirs.begin(), dirs.end(), p);
             if (pos != dirs.end()) dirs.erase(pos);
         }
+        syncMainPackageIncludes();
     };

     auto normalizeDepLdflag = [](const std::filesystem::path& depRoot,
@@ -2297,7 +2342,16 @@ prepare_build(bool print_fingerprint,
                 continue;
             }
             // Same key, same version (or compatible path/git) — already
-            // processed; skip.
+            // processed; still attach its public include dirs to this
+            // consumer before skipping. Include propagation is per edge, not
+            // per unique package: two consumers can need the same dep's
+            // headers even though the dep itself is fetched/scanned once.
+            if (it->second.depIndex + 1 < packages.size()) {
+                auto const& existing = packages[it->second.depIndex + 1];
+                propagateIncludeDirsToConsumer(item.consumerDepIndex,
+                                               existing.root,
+                                               existing.manifest);
+            }
             continue;
         }

@@ -2422,6 +2476,11 @@ prepare_build(bool print_fingerprint,
         // by depIndex (the SemVer merger needs to overwrite the slot).
         dep_manifests.push_back(
             std::make_unique<mcpp::manifest::Manifest>(std::move(*dep_manifest)));
+        if (item.consumerDepIndex != kMainConsumer) {
+            propagateIncludeDirsToConsumer(item.consumerDepIndex,
+                                           dep_root,
+                                           *dep_manifests.back());
+        }
         dep_cache_identities.push_back({
             .indexName = cache_index_name(key.ns),
             .packageName = name,

tests/e2e/29_toolchain_partial_versions.sh

Lines changed: 6 additions & 3 deletions
@@ -50,10 +50,13 @@ grep -q 'gcc@16.1.0' "$TMP/def2.log" || {
     cat "$TMP/def2.log"; echo "default 'gcc@16' didn't resolve to 16.1.0"; exit 1; }

 # ─── Section 2: first-run auto-install ──────────────────────────────────
-# Brand-new MCPP_HOME, brand-new package with no [toolchain] declared —
-# `mcpp build` should auto-install the canonical default (musl-gcc 15.1
-# for portable static binaries) + use it. Output should be a static ELF.
+# Brand-new MCPP_HOME with no config/default state, brand-new package with no
+# [toolchain] declared — `mcpp build` should auto-install the canonical
+# default (musl-gcc 15.1 for portable static binaries) + use it. We still
+# inherit payloads so CI does not download the same large archives into a
+# throw-away home.
 export MCPP_HOME="$TMP/h2"
+inherit_payloads_only
 configure_e2e_mirror
 mkdir -p "$TMP/proj"
 cd "$TMP/proj"

tests/e2e/31_transitive_deps.sh

Lines changed: 8 additions & 0 deletions
@@ -10,6 +10,14 @@ set -e
 TMP=$(mktemp -d)
 trap "rm -rf $TMP" EXIT
 export MCPP_HOME="$TMP/mcpp-home"
+MCPP_INHERIT_CONFIG=0 MCPP_INHERIT_SUBOS=0 source "$(dirname "$0")/_inherit_toolchain.sh"
+
+MUSL_PAYLOAD="$MCPP_HOME/registry/data/xpkgs/xim-x-musl-gcc/15.1.0/bin/x86_64-linux-musl-g++"
+if [[ ! -x "$MUSL_PAYLOAD" ]]; then
+    echo "SKIP: no reusable musl-gcc xpkg payload"
+    echo "OK"
+    exit 0
+fi

 # ── 1. Grandchild: a header-providing C lib whose `[build].include_dirs`
 #       is what consumers care about. Plays the role of mbedtls in the
