in_tail: fix ignore_active_older_files to re-check mtime on aged-out files#11890
in_tail: fix ignore_active_older_files to re-check mtime on aged-out files#11890zshuang0316 wants to merge 2 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRe-check aged-out files' mtimes on scan (POSIX and Windows), preserve file read offsets when purging ignored files, and add a runtime test that verifies a refreshed file is re-registered and re-read. ChangesAged-out file re-evaluation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 299b0f1665
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
299b0f1 to
5ca8e29
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/in_tail/tail_file.c (1)
447-450:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winConfirm
flb_tail_scan_register_ignored_file_sizeAPI and offset handling.
flb_tail_scan_register_ignored_file_sizeexists and is declared/defined asvoid flb_tail_scan_register_ignored_file_size(struct flb_tail_config *ctx, const char *path, size_t path_length, size_t size)(plugins/in_tail/tail_scan.h,plugins/in_tail/tail_scan.c); the call inplugins/in_tail/tail_file.cpasses the first three arguments (ctx,file->name,strlen(file->name)) correctly.- The last argument is
file->offset, which isint64_tinstruct flb_tail_file; since the API expectssize_t, ensurefile->offsetis always non-negative (or add an explicit cast/guard) to avoid sign-to-unsigned conversion issues.- The call order in the purge path is correct: register ignored offset first, then register the aged-out inode, then remove the file.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/in_tail/tail_file.c` around lines 447 - 450, The call to flb_tail_scan_register_ignored_file_size(ctx, file->name, strlen(file->name), file->offset) uses file->offset (int64_t) while the API expects a size_t; update the purge path in tail_file.c to ensure file->offset is non-negative before passing it (e.g. check file->offset >= 0 and use (size_t)file->offset) or clamp to 0 to avoid sign-to-unsigned conversion, keeping the existing call order (flb_tail_scan_register_ignored_file_size first, then flb_tail_scan_register_aged_out_inode, then flb_tail_file_remove).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/runtime/in_tail.c`:
- Around line 2094-2106: After calling write_msg, explicitly wait until the
first record is observed before starting the aging sleep: use the existing wait
helper to assert that num_output == 1 (and if the assert fails, call
test_tail_ctx_destroy(ctx) and return) prior to the flb_time_msleep(4000) call
so the purge window only begins after the first record is consumed; place this
check between write_msg(...) and flb_time_msleep(...) to eliminate the
timing-dependent flake.
---
Outside diff comments:
In `@plugins/in_tail/tail_file.c`:
- Around line 447-450: The call to flb_tail_scan_register_ignored_file_size(ctx,
file->name, strlen(file->name), file->offset) uses file->offset (int64_t) while
the API expects a size_t; update the purge path in tail_file.c to ensure
file->offset is non-negative before passing it (e.g. check file->offset >= 0 and
use (size_t)file->offset) or clamp to 0 to avoid sign-to-unsigned conversion,
keeping the existing call order (flb_tail_scan_register_ignored_file_size first,
then flb_tail_scan_register_aged_out_inode, then flb_tail_file_remove).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 70821c4d-6bc9-49f7-bea0-51eb3616eb3f
📒 Files selected for processing (4)
plugins/in_tail/tail_file.cplugins/in_tail/tail_scan_glob.cplugins/in_tail/tail_scan_win32.ctests/runtime/in_tail.c
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/in_tail/tail_scan_win32.c
- plugins/in_tail/tail_scan_glob.c
5ca8e29 to
fe6a758
Compare
fe6a758 to
1ccd31d
Compare
5c121c4 to
82cd27c
Compare
82cd27c to
27846b1
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 27846b1d85
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| flb_tail_scan_register_ignored_file_size(ctx, | ||
| file->name, | ||
| strlen(file->name), | ||
| file->offset); |
There was a problem hiding this comment.
Clamp preserved offsets after in-place truncation
When a monitored file ages out and is later copy-truncated or rewritten in place before being appended again, this stores the old offset even though the same inode can now have a smaller st_size. On re-pickup without a DB, set_file_position() seeks to that stale offset past EOF and pending_bytes stays non-positive, so the fresh content is skipped until the file grows past the old size; discard or clamp the saved offset when the refreshed same-inode file is smaller than the preserved offset.
Useful? React with 👍 / 👎.
27846b1 to
edf79be
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: edf79beb06
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| flb_tail_scan_register_ignored_file_size(ctx, | ||
| file->name, | ||
| strlen(file->name), | ||
| file->offset); |
There was a problem hiding this comment.
Seek to saved offset when re-picking aged files
When read_newly_discovered_files_from_head is left at its default true, in_tail_init() flips ctx->read_from_head to true after startup, and set_file_position() then returns before seeking whenever ctx->read_from_head == FLB_TRUE. In that common configuration, this newly saved offset is not applied when the refreshed aged-out file is appended again, so the fd remains at 0 while file->offset is nonzero; the next static read replays old bytes instead of only the appended content. The new test misses this because it explicitly sets read_newly_discovered_files_from_head=false.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 588b0d6cb8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (file->offset > 0) { | ||
| flb_tail_scan_register_ignored_file_size(ctx, | ||
| file->name, | ||
| strlen(file->name), | ||
| file->offset); |
There was a problem hiding this comment.
Preserve the last parsed offset when aging out
When an active file has bytes buffered that have been read but not emitted yet (for example a partial line without a trailing newline), file->offset already points past those bytes while file->stream_offset/the resumable marker still points to the last parsed record. Saving file->offset here means the later re-pickup seeks past the buffered fragment once the file's mtime is refreshed, so appending the newline or rest of that record drops the beginning of the log line. Store the resumable/processed offset instead of the raw read offset for this re-registration path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d6d80da657
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (file->offset > 0) { | ||
| flb_tail_scan_register_ignored_file_size(ctx, | ||
| file->name, | ||
| strlen(file->name), | ||
| flb_tail_file_db_offset(file)); |
There was a problem hiding this comment.
Preserve zero offsets when re-picking aged-out files
When ignore_active_older_files purges a monitored file whose current resumable offset is 0, this branch does not store any offset. In a tail-from-end configuration (read_newly_discovered_files_from_head=false), an empty file can age out while monitored, then receive its first line; the refresh scan re-adds it with no saved offset, so set_file_position() seeks to EOF and drops that first post-refresh line. The same happens when flb_tail_file_db_offset(file) is 0 because the whole buffered fragment is unparsed, so the line that completes it after re-pickup is skipped instead of reconstructed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d171a2efe0
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /* A pre-set offset (e.g. from an aged-out re-pickup) must be honoured | ||
| * even when read_from_head is true; the flag only governs truly new | ||
| * files that have no prior read position. */ | ||
| if (file->offset > 0) { |
There was a problem hiding this comment.
Honor saved zero offsets on re-pickup
When read_newly_discovered_files_from_head=false, an empty (or otherwise unconsumed) file can age out with a saved offset of 0; after a later write refreshes mtime, flb_tail_file_append() receives that saved 0, but this check treats it the same as “no saved offset” and falls through to the SEEK_END path, so the new first record is skipped. The new ignored_file_sizes +1 encoding makes zero offsets fetchable, but set_file_position() still needs a way to distinguish an explicit saved zero from a truly new file.
Useful? React with 👍 / 👎.
…files When a file is aged out by ignore_active_older_files, register its inode and read offset so the scan can resume from the saved position if the file's mtime is later refreshed by new writes. On re-scan, re-evaluate the mtime of same-inode aged-out files: only keep skipping them if they are still too old. Clear the aged-out inode entry (and any stale saved offset) when the inode at the path has changed, so a replacement file is not mistakenly skipped. Save flb_tail_file_db_offset() rather than file->offset when aging out, so any buffered but unparsed bytes are not skipped on re-pickup. Store size+1 in ignored_file_sizes and subtract 1 on fetch so that an offset of zero is distinguishable from a missing entry, allowing files aged out at offset 0 to be re-read from the start on re-pickup. Discard the saved offset on re-pickup if it exceeds the current file size, which can happen after an in-place copy-truncate rotation with the same inode. In set_file_position, add an explicit_offset parameter and honour it before consulting the read_from_head flag. The flag governs truly new files only and must not suppress seeking to a known position on aged-out re-pickup, including an explicit offset of zero. Signed-off-by: zshuang0316 <zshuang0316@users.noreply.github.com> Signed-off-by: zshuang0316 <zshuang0316@163.com>
Add two runtime tests that verify a file aged out by ignore_active_older_files is re-added to monitoring when new content updates its mtime, and that reading resumes from the saved offset: - reread_on_update: uses read_newly_discovered_files_from_head=false - reread_on_update_default_read_from_head: uses the default (true), verifying set_file_position honours the saved offset even when ctx->read_from_head is true Signed-off-by: zshuang0316 <zshuang0316@users.noreply.github.com> Signed-off-by: zshuang0316 <zshuang0316@163.com>
Fix a bug where
ignore_active_older_filespermanently excluded fileswhose mtime was refreshed by a new write.
When a file's mtime exceeded
ignore_olderwhile being activelymonitored, its inode was registered as aged-out and the file was removed
from monitoring. If new content was later appended (updating mtime), the
scan still excluded the file because the inode comparison matched without
re-checking the mtime.
Re-evaluate mtime in the aged-out inode branch; only skip if the file is
still too old, otherwise unregister the aged-out entry and allow
re-pickup. Applied to both
tail_scan_glob.c(Linux/macOS) andtail_scan_win32.c(Windows).Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
Bug Fixes
Tests