Support --dry-run in add command by sergiimk · Pull Request #1201 · kamu-data/kamu-cli

sergiimk · 2025-04-16T02:32:15Z

Related to: #900

This branch contains prep work for the kamu apply command, which would diff the current and desired state of datasets and apply the necessary changes.

It's an important step toward making large pipelines easier to maintain, and part of our IaC efforts.

I approached this by treating the --dry-run flag, a flag that lets you preview changes, as the crux of the problem.

I decided to implement this flag via complete separation of planning and execution stages in use cases.

To prototype this, I started by adding a --dry-run flag to the kamu add command:

- I moved the ability to add multiple snapshots at once from AddCommand into CreateDatasetFromSnapshotUseCase, which also moves the complex dependency-based sorting into the use case
- I added separate prepare() and apply() methods to the use case
- prepare() returns a complete plan of what will be done: you can see the generated IDs, keys, content and hashes of every metadata block that will be added to new datasets, etc.
- kamu add --dry-run simply dumps this plan as YAML to the output
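
To illustrate the prepare()/apply() split described above, here is a minimal Python sketch (the PR itself is Rust, and this is not its code). Everything except the names prepare() and apply() is a hypothetical stand-in: Snapshot, PlannedDataset, the add() wrapper, and the dict used as storage are assumptions. The plan is dumped as JSON rather than YAML only to stay within the standard library.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
import hashlib
import json


@dataclass
class Snapshot:
    # Hypothetical stand-in for a dataset snapshot manifest
    name: str
    inputs: list[str] = field(default_factory=list)
    content: str = ""


@dataclass
class PlannedDataset:
    # One entry of the plan: what would be created and with which hash
    name: str
    block_hash: str


def prepare(snapshots: list[Snapshot]) -> list[PlannedDataset]:
    """Planning stage: computes everything up front, mutates nothing."""
    by_name = {s.name: s for s in snapshots}
    # Only dependencies inside this batch constrain the ordering
    graph = {s.name: [i for i in s.inputs if i in by_name] for s in snapshots}
    order = TopologicalSorter(graph).static_order()  # dependencies come first
    return [
        PlannedDataset(n, hashlib.sha256(by_name[n].content.encode()).hexdigest())
        for n in order
    ]


def apply(plan: list[PlannedDataset], store: dict[str, str]) -> None:
    """Execution stage: the only place that changes state."""
    for item in plan:
        store[item.name] = item.block_hash


def add(snapshots: list[Snapshot], store: dict[str, str], dry_run: bool = False) -> str:
    plan = prepare(snapshots)
    if not dry_run:
        apply(plan, store)
    # With dry_run=True the caller only ever sees the serialized plan
    return json.dumps([vars(p) for p in plan], indent=2)
```

Because prepare() is the only place where ordering and hashing happen, a dry run and a real run see exactly the same plan; the difference is just whether apply() is called.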

I like this approach, but still have some doubts:

While I really like having a detailed plan printed for --dry-run, it shows what "will be done", not "what is different", so we may need a separate kamu diff or kamu apply --diff command to show the differences between the current and desired states (e.g. diffs between readmes, SQL queries, or schemas).
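
As a sketch of what such a "what is different" view could look like for a single text field, the following uses Python's standard difflib. The diff_field() helper and the sample SQL queries are purely illustrative assumptions; neither kamu diff nor kamu apply --diff exists in this PR.

```python
import difflib


def diff_field(name: str, current: str, desired: str) -> str:
    """Return a unified diff for one text field, or "" if nothing changed."""
    lines = difflib.unified_diff(
        current.splitlines(),
        desired.splitlines(),
        fromfile=f"current/{name}",
        tofile=f"desired/{name}",
        lineterm="",
    )
    return "\n".join(lines)


# Hypothetical current and desired versions of a dataset's query
current_sql = "select id, value\nfrom input"
desired_sql = "select id, value * 2 as value\nfrom input"
print(diff_field("query.sql", current_sql, desired_sql))
```

The same helper would apply to readmes or a textual schema rendering; structured fields would probably need a field-by-field comparison instead.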

The Kubernetes API essentially operates on state and diffs: you apply the manifest as the target state of the resource. So their --dry-run will only show whether a resource is created or updated.

The way I implemented --dry-run here essentially shows a state transition plan that will be executed within one transaction, which is much more powerful. Kubernetes can't do this because of its async operator model: k8s never knows upfront how operators will act on diffs and can't plan them ahead, so it may take multiple operators many steps and a long time to reconcile the current and desired states.

So I wonder if we will run into issues with --dry-run for more complex state transitions.

An alternative approach could be:

- We focus on the state-diffing part for a nice UX
- We implement --dry-run simply as a transaction rollback, i.e. execution prints what it usually prints, but the progress is undone at the very end.
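
The rollback alternative can be sketched with Python's standard sqlite3 module. This is only an assumption-laden illustration of the idea: the datasets table, the add_datasets() function, and the "Added:" messages are hypothetical and not taken from kamu.

```python
import sqlite3


def add_datasets(conn: sqlite3.Connection, names: list[str], dry_run: bool = False) -> list[str]:
    """Run the normal execution path; on dry run, undo it at the very end."""
    log = []
    try:
        for name in names:
            conn.execute("INSERT INTO datasets(name) VALUES (?)", (name,))
            log.append(f"Added: {name}")  # same output as a real run
        if dry_run:
            conn.rollback()  # progress is discarded
        else:
            conn.commit()
    except Exception:
        conn.rollback()
        raise
    return log


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets(name TEXT PRIMARY KEY)")
conn.commit()
```

Calling add_datasets(conn, ["a", "b"], dry_run=True) returns the usual log lines while leaving the table empty. The trade-off is that this only covers effects that live inside the transaction; anything written outside it would not be undone.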

Support --dry-run in add command

594b759

sergiimk force-pushed the feature/apply-command branch from a6b8510 to 594b759 on April 17, 2025 23:06

sergiimk requested review from s373r and zaychenko-sergei on April 18, 2025 17:39

sergiimk wants to merge 1 commit into master from feature/apply-command
