Skip to content

Add deprecations merge command to combine parallel CI shards#177

Open
JuanVqz wants to merge 5 commits intomainfrom
feature/merge-shards-command
Open

Add deprecations merge command to combine parallel CI shards#177
JuanVqz wants to merge 5 commits intomainfrom
feature/merge-shards-command

Conversation

@JuanVqz
Copy link
Copy Markdown
Member

@JuanVqz JuanVqz commented Apr 6, 2026

Summary

Adds a deprecations merge CLI command and DeprecationTracker::ShardMerger class to combine shard files produced by parallel CI nodes into a single canonical shitlist file.

This is the companion to #176 (parallel CI support). When each CI node writes its own shard (e.g. deprecation_warning.shitlist.node-0.json), a fan-in step is needed to merge them before the compare phase.

Changes

  • DeprecationTracker::ShardMerger — new class in lib/deprecation_tracker/shard_merger.rb with deep-merge logic (concatenates arrays for overlapping buckets), fixing the data loss bug from Adds support to print info for multiple files. #83 where inject(:merge) silently dropped warnings when multiple shards shared the same keys.
  • deprecations merge CLI subcommand — with --delete-shards to clean up after merging and --next for the next Rails shitlist.
  • DeprecationTracker.merge_shards — convenience class method that delegates to ShardMerger.
  • Error handling — clear messages for missing shard files, invalid JSON, and write permission errors.
  • 8 specs in dedicated spec/shard_merger_spec.rb covering: multiple shards, overlapping buckets, no shards, single shard, delete/preserve shards, sort order, and invalid JSON.
  • README updated with parallel CI docs, merge command usage, and example CI workflow.
  • CHANGELOG updated for both Add parallel CI support for DeprecationTracker #176 and this PR.

Example CI workflow

# 1. Save phase (each parallel node)
DEPRECATION_TRACKER=save CI_NODE_INDEX=$NODE bundle exec rspec <subset>

# 2. Merge phase (fan-in, runs once)
deprecations merge --delete-shards

# 3. Compare phase (each parallel node)
DEPRECATION_TRACKER=compare CI_NODE_INDEX=$NODE bundle exec rspec <subset>

Test plan

  • bundle exec rspec spec/deprecation_tracker_spec.rb spec/shard_merger_spec.rb — 42 examples, 0 failures
  • Manual test with 3 shard files containing overlapping buckets — all deprecation messages preserved
  • --delete-shards removes shard files, canonical file remains
  • CI passes

JuanVqz added 3 commits April 6, 2026 09:36
Introduces ShardMerger class that deep-merges shard files written by
parallel CI nodes into a single canonical shitlist file. Supports
--delete-shards, --next, and --path flags. Fixes the inject(:merge)
bug from PR #83 where overlapping buckets lost data.
@JuanVqz JuanVqz self-assigned this Apr 6, 2026
@JuanVqz JuanVqz marked this pull request as ready for review April 6, 2026 15:59
@JuanVqz
Copy link
Copy Markdown
Member Author

JuanVqz commented Apr 6, 2026

If you want to test locally, here you have some fake deprecation warnings shitlist.zip

@JuanVqz JuanVqz requested review from etagwerker and rishijain April 6, 2026 16:01
JuanVqz added 2 commits April 6, 2026 16:25
- Remove --path flag (unrelated to merge, belongs in its own PR)
- Move delete_shards to ShardMerger initializer
- Change "test files" to "buckets" in merge output
- Add error handling for file read and JSON parse errors
- Fix delegation test to verify ShardMerger receives correct params
- Use .json extension in tempfiles for realistic shard_glob testing
- Add spec for invalid JSON error handling
The behavior is fully covered in shard_merger_spec.rb.
@JuanVqz JuanVqz requested a review from arielj April 6, 2026 22:45

def shard_glob
ext = File.extname(base_path)
"#{base_path.chomp(ext)}.node-*#{ext}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can these files have any extension other than json? I understand this gem generates json files and if they are anything other than json this would fail to parse them, just thinking this code gives the impression that other extensions might be supported, not sure if that's a real usecase

total_messages = result.values.map(&:size).reduce(0, :+)
puts "Merged #{shards} shard files into #{path} (#{result.size} buckets, #{total_messages} deprecation messages)"
exit 0
end
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

end

dirname = File.dirname(base_path)
FileUtils.mkdir_p(dirname) unless File.directory?(dirname)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this ever happen? I mean, if the directory does not exist then there will not be shards to merge either since the code looks for them inside the base_path directory, so nothing to write either (it would write an empty {} json file)

I wonder if it's better instead to print some output saying the are no shards to merge (or even that the directory does not exist) instead, that could help users identify typos for example (instead of having to figure out why they have an empty file we could say "The .... path does not exist" and "No shards found at ...", since they are using the merge mode they probably meant to have shards here)

path = options[:next] ? "spec/support/deprecation_warning.next.shitlist.json" : "spec/support/deprecation_warning.shitlist.json"

if options[:mode] == "merge"
require_relative "../lib/deprecation_tracker/shard_merger"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason to require this file inside the if block? looks kinda weird, we don't do that anywhere else in the project

I would get it if it was something really expensive to require in terms of loading a bunch of extra stuff that we don't need, but it's a super simple file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants