
feat: upload CI logs to S3, CI gets called every midnight#66

Closed
redpanda-f wants to merge 8 commits into main from feat/redpanda/s3-uploads

Conversation

@redpanda-f
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings February 27, 2026 07:22
@FilOzzy FilOzzy added this to FOC Feb 27, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FOC Feb 27, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR adds functionality to upload CI run state and logs to AWS S3 for post-run inspection and debugging. The feature is designed to help diagnose test failures by preserving the complete state directory from CI runs.

Changes:

  • Adds AWS CLI installation step that checks if AWS CLI is already present before installing
  • Adds S3 upload step that syncs ~/.foc-devnet/state/latest directory to S3 with a structured path including branch name, run ID, and run attempt number
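The two steps described in the overview could be sketched roughly like this. This is illustrative only: the step names, the install method, and the `if: always()` condition are assumptions; the bucket name `filoz-foc-devnet` and the `runs/<branch>/<run-id>/<attempt>` key layout are inferred from the example URL later in the thread.

```yaml
# Sketch only; the actual workflow may differ in names and details.
- name: Install AWS CLI (if missing)
  run: |
    if ! command -v aws >/dev/null 2>&1; then
      curl -sSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
      unzip -q awscliv2.zip
      sudo ./aws/install
    fi

- name: Upload CI state to S3
  if: always()   # upload even when earlier steps fail, for debugging
  run: |
    BRANCH="${GITHUB_REF_NAME//\//-}"   # slashes in branch names are not valid in a flat key layout
    aws s3 sync ~/.foc-devnet/state/latest \
      "s3://filoz-foc-devnet/runs/${BRANCH}/${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}/"
```

`GITHUB_REF_NAME`, `GITHUB_RUN_ID`, and `GITHUB_RUN_ATTEMPT` are standard variables that GitHub Actions sets in every job.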

@redpanda-f redpanda-f self-assigned this Feb 27, 2026
@redpanda-f redpanda-f changed the title feat: upload CI logs to S3 feat: upload CI logs to S3, CI gets called every midnight Feb 27, 2026
Contributor

Copilot AI commented Feb 27, 2026

@redpanda-f I've opened a new pull request, #67, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits February 27, 2026 13:12
…step (#67)

* Initial plan

* Consolidate AWS CLI install and S3 upload into a single conditional step

Co-authored-by: redpanda-f <181817029+redpanda-f@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: redpanda-f <181817029+redpanda-f@users.noreply.github.com>
@redpanda-f redpanda-f marked this pull request as draft February 27, 2026 08:02
@rjan90 rjan90 moved this from 📌 Triage to ⌨️ In Progress in FOC Feb 27, 2026
@rjan90 rjan90 added this to the M4.1: mainnet ready milestone Feb 27, 2026
@rjan90 rjan90 linked an issue Feb 27, 2026 that may be closed by this pull request
@redpanda-f redpanda-f marked this pull request as ready for review February 27, 2026 12:55
@redpanda-f redpanda-f requested a review from rvagg February 27, 2026 12:55
@redpanda-f redpanda-f mentioned this pull request Feb 27, 2026
Contributor

@BigLep BigLep left a comment


If pushing logs to s3 is indeed required vs. whatever log retention we get from GitHub (#65 (comment)), then I'd like to see documentation on:

  1. Why are we doing this? (Again, this may be totally valid, but I haven't seen the motivation written anywhere)
  2. How does someone access the logs? I would want this bucket to be publicly accessible (just as GitHub Actions logs are accessible) so there is no barrier to accessing them quickly

How about we also define/create the bucket in code, so that things like the bucket's retention policy are also set and easily discoverable from looking here, vs. needing to query AWS?

@BigLep
Contributor

BigLep commented Feb 28, 2026

Also, I see there are changes to ci.yml itself. Is that structurally right? I thought ci.yml was for validating foc-devnet itself, and that it wasn't where we'd be executing tests for validating FOC.

In #8 (comment) there was discussion of creating a new repo for running FOC integration tests. It's ok if the plan has changed, but that should be documented/communicated. I think part of what's missing here is:

I also think it will be helpful to have a comment in foc-localnet ci.yml about its purpose and what's in scope and not in scope for that CI job so others don't get confused.

(source)

@redpanda-f
Collaborator Author

If pushing logs to s3 is indeed required vs. whatever log retention we get from GitHub (#65 (comment)), then I'd like to see documentation on:

  1. Why are we doing this? (Again, this may be totally valid, but I haven't seen the motivation written anywhere)
  2. How does someone access the logs? I would want this bucket to be publicly accessible (just as GitHub Actions logs are accessible) so there is no barrier to accessing them quickly
  1. GH Actions has a retention policy of anywhere up to 90 days (ref: https://docs.github.com/en/organizations/managing-organization-settings/configuring-the-retention-period-for-github-actions-artifacts-and-logs-in-your-organization). We may want longer retention, both in case issues raised by failed nightlies are aggressively deprioritized and only picked up later, and for future reference. Hence S3 sounds like a good choice.
  2. The bucket is currently publicly accessible to everyone. For example, here is a file from one of the runs: https://filoz-foc-devnet.s3.ap-southeast-1.amazonaws.com/runs/feat-redpanda-s3-uploads/22481882552/1/foc_metadata.json
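Judging by that example URL, the object key is built by replacing slashes in the branch name with hyphens. A minimal sketch of that transformation (variable names are illustrative, not the actual workflow code):

```shell
# Sketch: how the key prefix in the example URL above appears to be built.
# Slashes in the branch name are replaced with hyphens so the key has a fixed depth.
branch="feat/redpanda/s3-uploads"
run_id="22481882552"
run_attempt="1"

sanitized="${branch//\//-}"                        # feat-redpanda-s3-uploads
prefix="runs/${sanitized}/${run_id}/${run_attempt}"
echo "$prefix"                                     # runs/feat-redpanda-s3-uploads/22481882552/1
```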

How about we also define/create the bucket in code, so that things like the bucket's retention policy are also set and easily discoverable from looking here, vs. needing to query AWS?

That is doable; for now I have done it manually. However, feel free to let me know if you want these in scope as part of #8:

  • s3 bucket creation IaC
  • s3 lifecycle / retention policy of logs
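If the lifecycle/retention policy does get codified, the corresponding S3 lifecycle configuration could look roughly like this. The rule name, prefix, and the 180-day window are placeholders, not decided values:

```json
{
  "Rules": [
    {
      "ID": "expire-ci-run-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "runs/" },
      "Expiration": { "Days": 180 }
    }
  ]
}
```

This is the JSON shape that `aws s3api put-bucket-lifecycle-configuration` accepts, so it could live in the repo alongside whatever IaC creates the bucket.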

@redpanda-f
Collaborator Author

Also, I see there are changes to ci.yml itself. Is that structurally right? I thought ci.yml was for validating foc-devnet itself, and that it wasn't where we'd be executing tests for validating FOC.

In #8 (comment) there was discussion of creating a new repo for running FOC integration tests. It's ok if the plan has changed, but that should be documented/communicated. I think part of what's missing here is:

I also think it will be helpful to have a comment in foc-localnet ci.yml about its purpose and what's in scope and not in scope for that CI job so others don't get confused.

(source)

The current changes in ci.yml are a proof that S3 uploading works, and not much more. This will later be useful as a standalone action that works in tandem with a separate common devnet setup action (.github/actions/devnet-setup/action.yml) available as part of #62. That separate action abstracts away the "foc-devnet start" process and provides common steps that any ci.yml or e2e_some_specific_test.yml can use.

In fact, with that it becomes slightly clearer what ci.yml entails, which is not much more than "does foc-devnet start correctly" (see .github/workflows/ci.yml in #62).

I am currently attempting e2e tests as separate workflows, as described above in #62 (WIP). That way, we get the "code separation" from foc-devnet anyway, while not really having to create a separate repo.
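The separation described above could be sketched like this. The action path and test filename pattern come from this thread; the job layout, checkout step, and test entrypoint are assumptions, since the actual action's inputs aren't shown here:

```yaml
# Illustrative sketch of an e2e workflow reusing the shared setup action.
jobs:
  e2e-some-specific-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Start devnet                       # shared "foc-devnet start" logic
        uses: ./.github/actions/devnet-setup
      - name: Run e2e test
        run: ./scripts/e2e_some_specific_test.sh # hypothetical test entrypoint
```

With this shape, ci.yml stays a thin "does foc-devnet start correctly" check, while each e2e scenario gets its own workflow file built on the same setup action.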

@galargh

galargh commented Mar 1, 2026

FYI, there is also this little-advertised feature of our custom GitHub runners, where they all have access to an S3 bucket for storing artifacts. It shouldn't be used as a cache because it doesn't have things like poisoning protections, etc., but it can be (and in some cases is) used as artifact storage (e.g. kubo uploads HTML reports to it so that they can be rendered in the browser). By default it has a 90-day retention set, same as GH artifacts, but we obviously have full control over it. There are two helper actions for interacting with it - download-artifact and upload-artifact - in https://github.com/ipdxco/custom-github-runners/tree/main/.github/actions

@redpanda-f
Collaborator Author

redpanda-f commented Mar 1, 2026

FYI, there is also this little-advertised feature of our custom GitHub runners, where they all have access to an S3 bucket for storing artifacts. It shouldn't be used as a cache because it doesn't have things like poisoning protections, etc., but it can be (and in some cases is) used as artifact storage (e.g. kubo uploads HTML reports to it so that they can be rendered in the browser). By default it has a 90-day retention set, same as GH artifacts, but we obviously have full control over it. There are two helper actions for interacting with it - download-artifact and upload-artifact - in https://github.com/ipdxco/custom-github-runners/tree/main/.github/actions

That actually sounds very nice, and almost exactly what we need. Can you describe the HTML rendering in the browser in more detail? Do we have a web server fronting this as well? It would be useful to use that for our nightly reports.

Also, do we have an upper limit on the retention period?

Will get in touch with you. This sounds like a better direction.

@galargh

galargh commented Mar 1, 2026

That actually sounds very nice, and almost exactly what we need. Can you describe the HTML rendering in the browser in more detail? Do we have a web server fronting this as well? It would be useful to use that for our nightly reports.

No server, just static websites. Let me show you an example. Here, https://github.com/ipfs/kubo/actions/runs/22498049626/attempts/1#summary-65177630169, we have a workflow run summary (that's just an md file - $GITHUB_STEP_SUMMARY - that you can write to from your job). There, you'll see two links:

They're both static websites, s3 handles the rendering bit. And you can upload them like this:
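The actual upload invocation isn't captured in this transcript, but the $GITHUB_STEP_SUMMARY mechanism mentioned above is standard GitHub Actions behavior: a job appends Markdown to that file and GitHub renders it on the run's summary page. A minimal sketch, with the variable pointed at a temp file so it can run outside Actions too (the report title and link are placeholders):

```shell
# In a real job the runner sets GITHUB_STEP_SUMMARY; here we fall back to a
# temp file to simulate that behavior locally.
export GITHUB_STEP_SUMMARY="${GITHUB_STEP_SUMMARY:-$(mktemp)}"

# Append Markdown; on GitHub this shows up on the workflow run summary page.
{
  echo "## Nightly devnet report"
  echo "- logs: (link to the uploaded S3 artifacts would go here)"
} >> "$GITHUB_STEP_SUMMARY"

cat "$GITHUB_STEP_SUMMARY"
```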

Also, do we have an upper limit on the retention period?

Nope, we can set it to whatever we want or not set it at all.

@redpanda-f
Collaborator Author

S3 is not mandatory for what we are doing right now, although it will be necessary in the longer term. Will close #66 and rely on GH Actions log retention for now.
Follow-up tasks would transition to our own S3 or IPDX's S3 buckets.

@redpanda-f redpanda-f closed this Mar 2, 2026
@github-project-automation github-project-automation bot moved this from ⌨️ In Progress to 🎉 Done in FOC Mar 2, 2026

Labels

None yet

Projects

Status: 🎉 Done

Development

Successfully merging this pull request may close these issues.

CI/Nightly: Introduce logs dumping to S3 Buckets

6 participants