test: add workDir tmpfs hiding integration tests #1219
Add 7 integration tests verifying that the tmpfs overlay on workDir prevents agents from reading docker-compose.yml, which contains plaintext tokens (GITHUB_TOKEN, ANTHROPIC_API_KEY, etc.).

Tests cover:
- docker-compose.yml not readable in workDir (normal mode)
- workDir appears empty to the agent
- Sensitive env vars not leaked via workDir files
- docker-compose.yml not readable at /host path (chroot mode)
- /host workDir also appears empty
- grep for secrets in workDir finds nothing
- Debug logs confirm tmpfs overlay configuration

Fixes #759

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✅ Coverage Check Passed
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Pull request overview
Adds a new integration test suite to validate the security mitigation introduced in #718: a tmpfs overlay on workDir prevents the agent container from reading docker-compose.yml (and thus plaintext secrets).
Changes:
- Introduces 7 new integration tests targeting workDir tmpfs hiding behavior in normal and chroot (/host) paths.
- Adds “security verification” style checks (canary value + grep for secret patterns + log-based assertion).
Comments suppressed due to low confidence (1)
tests/integration/workdir-tmpfs-hiding.test.ts:153
- Test 6 claims “grep for secrets … finds nothing”, but the assertions only check for `ghp_test_token_12345` and `GITHUB_TOKEN`, and the command truncates output with `head -5`. This can miss leaks of the other patterns being searched (e.g. `ANTHROPIC_API_KEY`, `COPILOT_GITHUB_TOKEN`, `_authToken`) and any matches beyond the first 5 lines. Consider asserting that the filtered output is empty (or failing the command if any match is found) rather than checking only a subset.
test('Test 6: grep for secrets in workDir finds nothing', async () => {
  // Simulate an attack: search for common secret patterns in any awf workDir
  const result = await runner.runWithSudo(
    'sh -c \'grep -r "GITHUB_TOKEN\\|ANTHROPIC_API_KEY\\|COPILOT_GITHUB_TOKEN\\|_authToken" /tmp/awf-*/ 2>&1 || true\' | grep -v "^\\[" | head -5',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
      cliEnv: { GITHUB_TOKEN: 'ghp_test_token_12345' },
      envAll: true,
    }
  );
  // Should not find any secrets
  const output = result.stdout.trim();
  expect(output).not.toContain('ghp_test_token_12345');
  expect(output).not.toContain('GITHUB_TOKEN');
}, 120000);
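One way to cover every searched pattern rather than a subset is sketched below. `SECRET_PATTERNS` and `findLeakedPatterns` are hypothetical names, not part of this PR, and the sketch assumes `head -5` is dropped from the command so that no match is truncated away.

```typescript
// Hypothetical helper, not part of the PR: the pattern list mirrors the
// alternatives in the grep expression of Test 6, plus the canary token value.
const SECRET_PATTERNS: string[] = [
  'GITHUB_TOKEN',
  'ANTHROPIC_API_KEY',
  'COPILOT_GITHUB_TOKEN',
  '_authToken',
  'ghp_test_token_12345',
];

// Returns every pattern that occurs anywhere in the (already filtered) output.
function findLeakedPatterns(
  output: string,
  patterns: string[] = SECRET_PATTERNS,
): string[] {
  return patterns.filter((pattern) => output.includes(pattern));
}

// In the test, this could stand in for the two `not.toContain` checks:
//   expect(findLeakedPatterns(result.stdout)).toEqual([]);
```

Reporting the list of leaked patterns, instead of a boolean, should make a failure message say which secret escaped.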
test('Test 7: debug logs confirm tmpfs overlay is configured', async () => {
  const result = await runner.runWithSudo(
    'echo "test"',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
    }
  );

  expect(result).toSucceed();
  // Debug logs should show tmpfs configuration
  expect(result.stderr).toMatch(/tmpfs/i);
}, 120000);
Test 7 expects debug logs to contain the string tmpfs, but the current implementation doesn’t appear to log tmpfs configuration explicitly (the tmpfs setup is in the generated compose YAML, not a log line). This makes the test brittle and likely to fail depending on log verbosity. A more reliable integration assertion would be to inspect mounts inside the container (e.g., via /proc/mounts) or to preserve workDir and verify the generated docker-compose.yml contains the expected tmpfs: entries.
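A minimal sketch of the /proc/mounts approach, assuming the test command is changed to `cat /proc/mounts` and that the exact workDir path is known to the test. `isTmpfsMount` is a hypothetical helper, not something the PR or awf provides.

```typescript
// Hypothetical helper, not part of the PR.
// Each /proc/mounts line has the form:
//   device mountpoint fstype options dump pass
// Note: spaces in mount points are escaped as \040 there; this sketch
// assumes the workDir path contains none.
function isTmpfsMount(procMounts: string, mountPoint: string): boolean {
  return procMounts
    .split('\n')
    .map((line) => line.trim().split(/\s+/))
    .some(
      (fields) =>
        fields.length >= 3 &&
        fields[1] === mountPoint &&
        fields[2] === 'tmpfs',
    );
}

// Possible use in Test 7 (workDir is assumed to be available to the test):
//   expect(isTmpfsMount(result.stdout, workDir)).toBe(true);
```

This checks the mount table the agent actually sees, so it does not depend on which messages the debug log happens to contain.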
test('Test 1: docker-compose.yml is not readable in workDir', async () => {
  // Run AWF with a command that tries to find and read docker-compose.yml
  // The workDir is /tmp/awf-<timestamp>, so we glob for it
  const result = await runner.runWithSudo(
    'sh -c \'for d in /tmp/awf-*/; do if [ -f "$d/docker-compose.yml" ]; then cat "$d/docker-compose.yml"; echo "FOUND_COMPOSE"; fi; done\'',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
    }
  );

  // The tmpfs overlay makes the workDir appear empty,
  // so docker-compose.yml should not be found
  const output = result.stdout.trim();
  expect(output).not.toContain('FOUND_COMPOSE');
  expect(output).not.toContain('services:');
  expect(output).not.toContain('GITHUB_TOKEN');
  expect(output).not.toContain('ANTHROPIC_API_KEY');
}, 120000);
Tests 1–6 never assert that the awf run itself succeeded. If awf fails early (non‑zero exit) these assertions can still pass because stdout is empty, leading to false positives. Add an explicit expect(result).toSucceed() (or equivalent exitCode check) before validating output in each test where failure isn’t an accepted outcome.
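To illustrate the false-positive path, here is a small sketch of a guard that refuses to hand back stdout from a failed run. The `RunResult` shape (`exitCode`, `stdout`, `stderr`) and the helper name are assumptions for illustration; the existing `expect(result).toSucceed()` matcher used in Test 7 would serve the same purpose.

```typescript
// Assumed shape of the runner result; the real type in the test suite may differ.
interface RunResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Hypothetical helper: returns trimmed stdout only when the run succeeded,
// so an early awf failure cannot pass as "no secrets found".
function stdoutOfSuccessfulRun(result: RunResult): string {
  if (result.exitCode !== 0) {
    const excerpt = result.stderr.slice(0, 200);
    throw new Error(`awf exited with code ${result.exitCode}: ${excerpt}`);
  }
  return result.stdout.trim();
}

// Possible use in Tests 1 to 6:
//   const output = stdoutOfSuccessfulRun(result);
//   expect(output).not.toContain('FOUND_COMPOSE');
```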
  const result = await runner.runWithSudo(
    'sh -c \'for d in /tmp/awf-*/; do if [ -f "$d/docker-compose.yml" ]; then cat "$d/docker-compose.yml"; echo "FOUND_COMPOSE"; fi; done\'',
    {
The /tmp/awf-*/ and /host/tmp/awf-*/ globs are used under sh. When a glob doesn’t match in POSIX sh, it typically remains literal, so the for d in /tmp/awf-*/ loop can run once with the unmatched pattern and silently report “not found”, making the test pass without actually validating the tmpfs overlay. Consider deriving the exact workDir (e.g., from result.workDir) or explicitly failing the command when no /tmp/awf-* directory is present before performing the checks.
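A sketch of the "fail explicitly when nothing matches" option follows. `buildGuardedWorkDirCommand` and the `NO_WORKDIR_MATCHED` marker are hypothetical names. Whether the inner `exit 2` propagates to awf's own exit code is an assumption, so the test could also assert that the marker is absent from stdout.

```typescript
// Hypothetical builder, not part of the PR.
// `body` runs once per matched directory with $d set to that directory.
// Neither argument may contain single quotes, because the script is
// wrapped in sh -c '...'.
function buildGuardedWorkDirCommand(glob: string, body: string): string {
  if (glob.includes("'") || body.includes("'")) {
    throw new Error('glob and body must not contain single quotes');
  }
  const script = [
    // Expand the glob into the positional parameters.
    `set -- ${glob}`,
    // An unmatched glob stays literal in POSIX sh, so "$1" is then not a directory.
    '[ -d "$1" ] || { echo "NO_WORKDIR_MATCHED"; exit 2; }',
    `for d in "$@"; do ${body}; done`,
  ].join('; ');
  return `sh -c '${script}'`;
}

// Possible use in Test 1:
//   buildGuardedWorkDirCommand(
//     '/tmp/awf-*/',
//     'if [ -f "$d/docker-compose.yml" ]; then echo "FOUND_COMPOSE"; fi',
//   );
```

Because the tmpfs overlay hides the directory's contents but not the directory itself, a missing match here would indicate a broken test setup rather than a working mitigation.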
test('Test 3: sensitive env vars are not leaked via workDir files', async () => {
  // Pass a known secret via env and verify it cannot be found in workDir files
  const result = await runner.runWithSudo(
    'sh -c \'find /tmp/awf-* -type f 2>/dev/null | while read f; do cat "$f" 2>/dev/null; done | grep -c "SECRET_CANARY_VALUE" || echo "0"\'',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
      cliEnv: { TEST_SECRET: 'SECRET_CANARY_VALUE' },
      envAll: true,
    }
  );

  // The canary value should not appear in any readable file
  const output = result.stdout.trim();
  // grep -c returns "0" when no matches found
  expect(output).toMatch(/^0$/m);
Test 3’s pipeline uses grep -c ... || echo "0". When there are no matches, grep -c prints 0 but exits with status 1, so the || echo "0" branch runs and can produce duplicate output (0\n0). It also masks unexpected grep failures as “0”. Make the command emit a single deterministic value (and treat grep errors as test failures) so the assertion can’t pass on malformed output.
This issue also appears on line 136 of the same file.
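On the assertion side, a stricter parse could reject malformed output such as `0\n0`. The sketch below assumes stdout carries only the command's own output (no interleaved log lines) and that the `|| echo "0"` fallback is removed from the shell command; `parseSingleCount` is a hypothetical helper.

```typescript
// Hypothetical helper, not part of the PR: accepts exactly one non-empty
// line consisting of digits and returns it as a number.
function parseSingleCount(stdout: string): number {
  const lines = stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const [only] = lines;
  if (lines.length !== 1 || only === undefined || !/^\d+$/.test(only)) {
    throw new Error(
      `expected exactly one numeric line, got: ${JSON.stringify(stdout)}`,
    );
  }
  return Number(only);
}

// Possible use in Test 3:
//   expect(parseSingleCount(result.stdout)).toBe(0);
```

On the shell side, `grep -c` exits with 1 for "no matches" and with 2 for an error, so something like `grep -c PATTERN; [ $? -le 1 ]` would keep the zero-match case successful while still surfacing real grep failures.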
The `*/` in `awf-*/docker-compose.yml` inside the JSDoc block comment was being interpreted as the end of the comment, causing TypeScript type-check failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Smoke Test Results
Overall: PASS
PRs: test: add workDir tmpfs hiding integration tests; chore(deps): aggregated dependency updates
Smoke test results for ✅ GitHub MCP — Last 2 merged PRs: #1160 "fix(squid): block direct IP connections that bypass domain filtering", #1157 "feat: combine all build-test workflows into single build-test.md" Overall: PASS
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
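Summary
- Adds integration tests verifying that the tmpfs overlay on workDir prevents agents from reading docker-compose.yml (which contains plaintext tokens)
- Covers both normal mode and chroot mode (/host prefix paths)
Test Coverage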
Fixes #759
🤖 Generated with Claude Code