test: add workDir tmpfs hiding integration tests#1219

Merged: Mossaka merged 2 commits into main from feat/053-workdir-tmpfs-test on Mar 11, 2026

Conversation

@Mossaka (Collaborator) commented Mar 11, 2026

Summary

  • Adds 7 integration tests verifying the tmpfs overlay on workDir prevents agents from reading docker-compose.yml (which contains plaintext tokens)
  • Tests cover both normal mode and chroot mode (/host prefix paths)
  • Includes security verification tests (grep for secrets, env var canary)

Test Coverage

| Test | Description |
|---|---|
| Test 1 | docker-compose.yml not readable in workDir |
| Test 2 | workDir appears empty to the agent |
| Test 3 | Env var canary value not leaked via workDir files |
| Test 4 | docker-compose.yml not readable at /host path (chroot) |
| Test 5 | /host workDir also appears empty |
| Test 6 | grep for secrets in workDir finds nothing |
| Test 7 | Debug logs confirm tmpfs overlay configuration |

Fixes #759

🤖 Generated with Claude Code

Add 7 integration tests verifying that the tmpfs overlay on workDir
prevents agents from reading docker-compose.yml which contains
plaintext tokens (GITHUB_TOKEN, ANTHROPIC_API_KEY, etc.).

Tests cover:
- docker-compose.yml not readable in workDir (normal mode)
- workDir appears empty to the agent
- Sensitive env vars not leaked via workDir files
- docker-compose.yml not readable at /host path (chroot mode)
- /host workDir also appears empty
- grep for secrets in workDir finds nothing
- Debug logs confirm tmpfs overlay configuration

Fixes #759

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 11, 2026 00:57
@github-actions bot (Contributor) commented Mar 11, 2026

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 82.37% | 82.51% | 📈 +0.14% |
| Statements | 82.27% | 82.41% | 📈 +0.14% |
| Functions | 82.60% | 82.60% | ➡️ +0.00% |
| Branches | 74.21% | 74.30% | 📈 +0.09% |

📁 Per-file Coverage Changes (1 file)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/docker-manager.ts | 83.4% → 84.0% (+0.54%) | 82.8% → 83.3% (+0.52%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-advanced-security bot (Contributor) left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Copilot AI (Contributor) left a comment

Pull request overview

Adds a new integration test suite to validate the security mitigation introduced in #718: a tmpfs overlay on workDir prevents the agent container from reading docker-compose.yml (and thus plaintext secrets).

Changes:

  • Introduces 7 new integration tests targeting workDir tmpfs hiding behavior in normal and chroot (/host) paths.
  • Adds “security verification” style checks (canary value + grep for secret patterns + log-based assertion).
Comments suppressed due to low confidence (1)

tests/integration/workdir-tmpfs-hiding.test.ts:153

  • Test 6 claims “grep for secrets … finds nothing”, but the assertions only check for ghp_test_token_12345 and GITHUB_TOKEN, and the command truncates output with head -5. This can miss leaks of other patterns being searched (e.g. ANTHROPIC_API_KEY, COPILOT_GITHUB_TOKEN, _authToken) or matches beyond the first 5 lines. Consider asserting the filtered output is empty (or otherwise fail the command if any match is found) rather than checking only a subset.
    test('Test 6: grep for secrets in workDir finds nothing', async () => {
      // Simulate an attack: search for common secret patterns in any awf workDir
      const result = await runner.runWithSudo(
        'sh -c \'grep -r "GITHUB_TOKEN\\|ANTHROPIC_API_KEY\\|COPILOT_GITHUB_TOKEN\\|_authToken" /tmp/awf-*/ 2>&1 || true\' | grep -v "^\\[" | head -5',
        {
          allowDomains: ['github.com'],
          logLevel: 'debug',
          timeout: 60000,
          cliEnv: { GITHUB_TOKEN: 'ghp_test_token_12345' },
          envAll: true,
        }
      );

      // Should not find any secrets
      const output = result.stdout.trim();
      expect(output).not.toContain('ghp_test_token_12345');
      expect(output).not.toContain('GITHUB_TOKEN');
    }, 120000);
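
The suggestion above could be sketched roughly as follows. This is a minimal sketch, not the suite's code: `findLeakedSecrets` and `SECRET_PATTERNS` are hypothetical names, and the pattern list is assumed from the grep expression and the test token in the snippet. It scans the whole captured output instead of the first five lines.

```typescript
// Hypothetical pattern list, assumed from the grep expression in Test 6.
const SECRET_PATTERNS: readonly string[] = [
  'GITHUB_TOKEN',
  'ANTHROPIC_API_KEY',
  'COPILOT_GITHUB_TOKEN',
  '_authToken',
  'ghp_test_token_12345',
];

// Hypothetical helper: returns every output line that mentions a searched
// pattern, so a test can assert that the returned array is empty.
function findLeakedSecrets(
  stdout: string,
  patterns: readonly string[] = SECRET_PATTERNS,
): string[] {
  return stdout
    .split('\n')
    .map((line) => line.trim())
    // Drop blank lines and bracket-prefixed log lines, roughly mirroring
    // the original `grep -v "^\["` filter.
    .filter((line) => line.length > 0 && !line.startsWith('['))
    .filter((line) => patterns.some((p) => line.includes(p)));
}
```

With a helper like this, the test body could assert `expect(findLeakedSecrets(result.stdout)).toEqual([])`, which fails on any of the searched patterns rather than on two of them.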

Comment on lines +155 to +168
test('Test 7: debug logs confirm tmpfs overlay is configured', async () => {
  const result = await runner.runWithSudo(
    'echo "test"',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
    }
  );

  expect(result).toSucceed();
  // Debug logs should show tmpfs configuration
  expect(result.stderr).toMatch(/tmpfs/i);
}, 120000);

Copilot AI commented Mar 11, 2026

Test 7 expects debug logs to contain the string tmpfs, but the current implementation doesn’t appear to log tmpfs configuration explicitly (the tmpfs setup is in the generated compose YAML, not a log line). This makes the test brittle and likely to fail depending on log verbosity. A more reliable integration assertion would be to inspect mounts inside the container (e.g., via /proc/mounts) or to preserve workDir and verify the generated docker-compose.yml contains the expected tmpfs: entries.
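
A mount-based check along the lines suggested here might look like the sketch below. It assumes the test command runs `cat /proc/mounts` inside the container and that the workDir path is known to the test; `isTmpfsMounted` is a hypothetical helper, not an existing function in the repository.

```typescript
// Hypothetical helper: given the text of /proc/mounts, report whether a
// tmpfs filesystem is mounted exactly at mountPoint.
// Limitation: mount points containing spaces are escaped in /proc/mounts
// (as \040) and are not decoded here.
function isTmpfsMounted(procMounts: string, mountPoint: string): boolean {
  // Normalise a trailing slash so "/tmp/awf-1/" matches "/tmp/awf-1".
  const target = mountPoint.replace(/\/+$/, '') || '/';
  return procMounts.split('\n').some((line) => {
    // Fields: device, mount point, filesystem type, options, dump, pass.
    const [, mountedAt, fsType] = line.trim().split(/\s+/);
    return fsType === 'tmpfs' && mountedAt === target;
  });
}
```

Asserting on the mount table ties the test to the container's actual state instead of to the wording of a log line.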

Comment on lines +41 to +60
test('Test 1: docker-compose.yml is not readable in workDir', async () => {
  // Run AWF with a command that tries to find and read docker-compose.yml
  // The workDir is /tmp/awf-<timestamp>, so we glob for it
  const result = await runner.runWithSudo(
    'sh -c \'for d in /tmp/awf-*/; do if [ -f "$d/docker-compose.yml" ]; then cat "$d/docker-compose.yml"; echo "FOUND_COMPOSE"; fi; done\'',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
    }
  );

  // The tmpfs overlay makes the workDir appear empty,
  // so docker-compose.yml should not be found
  const output = result.stdout.trim();
  expect(output).not.toContain('FOUND_COMPOSE');
  expect(output).not.toContain('services:');
  expect(output).not.toContain('GITHUB_TOKEN');
  expect(output).not.toContain('ANTHROPIC_API_KEY');
}, 120000);

Copilot AI commented Mar 11, 2026

Tests 1–6 never assert that the awf run itself succeeded. If awf fails early (non‑zero exit) these assertions can still pass because stdout is empty, leading to false positives. Add an explicit expect(result).toSucceed() (or equivalent exitCode check) before validating output in each test where failure isn’t an accepted outcome.
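
As an illustration of the "equivalent exitCode check" mentioned above, a small guard could refuse to hand back stdout unless the run succeeded. This is a sketch under assumptions: the `RunResult` shape (in particular the `exitCode` field) and the helper name are hypothetical and may not match the runner's real result type.

```typescript
// Minimal result shape assumed for illustration; the real type may differ.
interface RunResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Hypothetical guard: throw on a non-zero exit so that an empty stdout from
// a failed run can never satisfy a "does not contain" assertion.
function stdoutOfSuccessfulRun(result: RunResult): string {
  if (result.exitCode !== 0) {
    throw new Error(
      `awf exited with code ${result.exitCode}: ${result.stderr.trim()}`,
    );
  }
  return result.stdout.trim();
}
```

Each of Tests 1–6 could then read its output through the guard, so a failed run surfaces as an error instead of a passing negative assertion.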

Comment on lines +44 to +46
const result = await runner.runWithSudo(
  'sh -c \'for d in /tmp/awf-*/; do if [ -f "$d/docker-compose.yml" ]; then cat "$d/docker-compose.yml"; echo "FOUND_COMPOSE"; fi; done\'',
  {

Copilot AI commented Mar 11, 2026

The /tmp/awf-*/ and /host/tmp/awf-*/ globs are used under sh. When a glob doesn’t match in POSIX sh, it typically remains literal, so the for d in /tmp/awf-*/ loop can run once with the unmatched pattern and silently report “not found”, making the test pass without actually validating the tmpfs overlay. Consider deriving the exact workDir (e.g., from result.workDir) or explicitly failing the command when no /tmp/awf-* directory is present before performing the checks.
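
The second option (failing explicitly when nothing matches) could be sketched as a small command builder. The function name, the sentinel string, and the exit code 2 are hypothetical choices for illustration; the sketch also assumes the runner passes the command string to a shell unchanged and that the workDir directory itself is visible inside the container.

```typescript
// Hypothetical sentinel printed when no awf workDir matched the glob.
const NO_WORKDIR_SENTINEL = 'NO_WORKDIR_MATCHED';

// Hypothetical builder for the probe command. globPrefix is interpolated
// unquoted on purpose so the shell can expand it; it must be a trusted value.
function buildComposeProbe(globPrefix: string): string {
  const script = [
    'found=0;',
    // In POSIX sh an unmatched glob stays literal, so each candidate is
    // checked with [ -d ] before it counts as a real directory.
    `for d in ${globPrefix}*/; do`,
    '[ -d "$d" ] || continue;',
    'found=1;',
    'if [ -f "${d}docker-compose.yml" ]; then echo FOUND_COMPOSE; fi;',
    'done;',
    // Fail loudly instead of silently reporting "not found".
    `[ "$found" -eq 1 ] || { echo ${NO_WORKDIR_SENTINEL}; exit 2; }`,
  ].join(' ');
  // Single-quote the script for sh -c, escaping any embedded single quotes.
  return `sh -c '${script.replace(/'/g, `'\\''`)}'`;
}
```

For example, `buildComposeProbe('/tmp/awf-')` yields a command that exits non-zero and prints the sentinel when no `/tmp/awf-*` directory exists, so a missing workDir can no longer look like a successful overlay.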

Comment on lines +80 to +96
test('Test 3: sensitive env vars are not leaked via workDir files', async () => {
  // Pass a known secret via env and verify it cannot be found in workDir files
  const result = await runner.runWithSudo(
    'sh -c \'find /tmp/awf-* -type f 2>/dev/null | while read f; do cat "$f" 2>/dev/null; done | grep -c "SECRET_CANARY_VALUE" || echo "0"\'',
    {
      allowDomains: ['github.com'],
      logLevel: 'debug',
      timeout: 60000,
      cliEnv: { TEST_SECRET: 'SECRET_CANARY_VALUE' },
      envAll: true,
    }
  );

  // The canary value should not appear in any readable file
  const output = result.stdout.trim();
  // grep -c returns "0" when no matches found
  expect(output).toMatch(/^0$/m);

Copilot AI commented Mar 11, 2026

Test 3’s pipeline uses grep -c ... || echo "0". When there are no matches, grep -c prints 0 but exits with status 1, so the || echo "0" branch runs and can produce duplicate output (0\n0). It also masks unexpected grep failures as “0”. Make the command emit a single deterministic value (and treat grep errors as test failures) so the assertion can’t pass on malformed output.

This issue also appears on line 136 of the same file.
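
On the assertion side, one way to reject malformed output such as `0\n0` is to parse the count strictly. This is a sketch only: `parseMatchCount` is a hypothetical helper, and it assumes stdout carries nothing but the command's own output. On the shell side, the usual grep convention applies (exit status 0 for a match, 1 for no match, greater than 1 for an error), so only status 1 should be treated as "zero matches".

```typescript
// Hypothetical parser: accept stdout only if it consists of exactly one
// non-empty line holding a non-negative integer. Duplicate lines ("0\n0"),
// empty output, and non-numeric text are all treated as errors.
function parseMatchCount(stdout: string): number {
  const [only, ...rest] = stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (only === undefined || rest.length > 0 || !/^\d+$/.test(only)) {
    throw new Error(
      `expected a single integer line, got: ${JSON.stringify(stdout)}`,
    );
  }
  return Number(only);
}
```

A test could then assert `expect(parseMatchCount(result.stdout)).toBe(0)`, which cannot pass on duplicated or missing output.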

The `*/` in `awf-*/docker-compose.yml` inside the JSDoc block comment
was being interpreted as the end of the comment, causing TypeScript
type-check failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
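
The failure described in this commit message can be illustrated with a short sketch. The constant name is hypothetical, and the placeholder spelling is only one possible workaround; the commit itself does not say which rewrite was used.

```typescript
// A line comment can hold the glob verbatim: /tmp/awf-*/docker-compose.yml

/**
 * Inside a block comment the star-slash pair would close the comment early,
 * so the path is spelled with a placeholder instead:
 * /tmp/awf-<timestamp>/docker-compose.yml
 */
const COMPOSE_GLOB = '/tmp/awf-*/docker-compose.yml';
```

String literals and line comments are unaffected, so only JSDoc and other block comments need the alternative spelling.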
@github-actions (Contributor)

Smoke Test Results

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1219

@github-actions (Contributor)

PRs: test: add workDir tmpfs hiding integration tests; chore(deps): aggregated dependency updates
GitHub MCP merged PRs review: ✅
Safeinputs GH CLI PR query: ✅
Playwright title check: ✅
Tavily web search: ❌ (tool unavailable)
File write test: ✅
Bash cat verify: ✅
Discussion comment: ✅
Build (npm ci && npm run build): ✅
Overall status: FAIL

🔮 The oracle has spoken through Smoke Codex for issue #1219

@github-actions (Contributor)

Smoke test results for @Mossaka:

✅ GitHub MCP — Last 2 merged PRs: #1160 "fix(squid): block direct IP connections that bypass domain filtering", #1157 "feat: combine all build-test workflows into single build-test.md"
✅ Playwright — github.com title contains "GitHub"
✅ File Write — /tmp/gh-aw/agent/smoke-test-copilot-22931580189.txt created
✅ Bash — file verified via cat

Overall: PASS

📰 BREAKING: Report filed by Smoke Copilot for issue #1219

@github-actions (Contributor)

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #1219 ·

@Mossaka Mossaka merged commit dda2d31 into main Mar 11, 2026
52 checks passed
@Mossaka Mossaka deleted the feat/053-workdir-tmpfs-test branch March 11, 2026 01:34
Development

Successfully merging this pull request may close these issues.

[Testing] Missing integration test for workDir tmpfs hiding (docker-compose.yml secret exposure)
