Add deprecations merge command to combine parallel CI shards#177
Add deprecations merge command to combine parallel CI shards#177
Conversation
Introduces ShardMerger class that deep-merges shard files written by parallel CI nodes into a single canonical shitlist file. Supports --delete-shards, --next, and --path flags. Fixes the inject(:merge) bug from PR #83 where overlapping buckets lost data.
|
If you want to test locally, here you have some fake deprecation warnings shitlist.zip |
- Remove --path flag (unrelated to merge, belongs in its own PR) - Move delete_shards to ShardMerger initializer - Change "test files" to "buckets" in merge output - Add error handling for file read and JSON parse errors - Fix delegation test to verify ShardMerger receives correct params - Use .json extension in tempfiles for realistic shard_glob testing - Add spec for invalid JSON error handling
The behavior is fully covered in shard_merger_spec.rb.
|
|
||
| def shard_glob | ||
| ext = File.extname(base_path) | ||
| "#{base_path.chomp(ext)}.node-*#{ext}" |
There was a problem hiding this comment.
can these files have any extension other than json? I understand this gem generates json files and if they are anything other than json this would fail to parse them, just thinking this code gives the impression that other extensions might be supported, not sure if that's a real usecase
| total_messages = result.values.map(&:size).reduce(0, :+) | ||
| puts "Merged #{shards} shard files into #{path} (#{result.size} buckets, #{total_messages} deprecation messages)" | ||
| exit 0 | ||
| end |
There was a problem hiding this comment.
Is there a reason this block is here? there's a case statement to handle different mode values in https://github.com/fastruby/next_rails/pull/177/changes#diff-783442139b98c9b57ce8c777eba311b67f52d1e560f4d75db2190ab290ce4d10R121
also line 125 should be updated to mention the merge mode too https://github.com/fastruby/next_rails/pull/177/changes#diff-783442139b98c9b57ce8c777eba311b67f52d1e560f4d75db2190ab290ce4d10R125
| end | ||
|
|
||
| dirname = File.dirname(base_path) | ||
| FileUtils.mkdir_p(dirname) unless File.directory?(dirname) |
There was a problem hiding this comment.
can this ever happen? I mean, if the directory does not exist then there will not be shards to merge either since the code looks for them inside the base_path directory, so nothing to write either (it would write an empty {} json file)
I wonder if it's better instead to print some output saying the are no shards to merge (or even that the directory does not exist) instead, that could help users identify typos for example (instead of having to figure out why they have an empty file we could say "The .... path does not exist" and "No shards found at ...", since they are using the merge mode they probably meant to have shards here)
| path = options[:next] ? "spec/support/deprecation_warning.next.shitlist.json" : "spec/support/deprecation_warning.shitlist.json" | ||
|
|
||
| if options[:mode] == "merge" | ||
| require_relative "../lib/deprecation_tracker/shard_merger" |
There was a problem hiding this comment.
is there a reason to require this file inside the if block? looks kinda weird, we don't do that anywhere else in the project
I would get it if it was something really expensive to require in terms of loading a bunch of extra stuff that we don't need, but it's a super simple file
Summary
Adds a
deprecations mergeCLI command andDeprecationTracker::ShardMergerclass to combine shard files produced by parallel CI nodes into a single canonical shitlist file.This is the companion to #176 (parallel CI support). When each CI node writes its own shard (e.g.
deprecation_warning.shitlist.node-0.json), a fan-in step is needed to merge them before the compare phase.Changes
DeprecationTracker::ShardMerger— new class inlib/deprecation_tracker/shard_merger.rbwith deep-merge logic (concatenates arrays for overlapping buckets), fixing the data loss bug from Adds support to print info for multiple files. #83 whereinject(:merge)silently dropped warnings when multiple shards shared the same keys.deprecations mergeCLI subcommand — with--delete-shardsto clean up after merging and--nextfor the next Rails shitlist.DeprecationTracker.merge_shards— convenience class method that delegates toShardMerger.spec/shard_merger_spec.rbcovering: multiple shards, overlapping buckets, no shards, single shard, delete/preserve shards, sort order, and invalid JSON.Example CI workflow
Test plan
bundle exec rspec spec/deprecation_tracker_spec.rb spec/shard_merger_spec.rb— 42 examples, 0 failures--delete-shardsremoves shard files, canonical file remains