bundle/direct: record resource operations with DMS#5387
Conversation
|
Commit: 465bc4a |
2fb2f59 to
2f36b2d
Compare
2f36b2d to
d5f35f3
Compare
When experimental.record_deployment_history is set, each successfully applied resource operation is reported to the deployment metadata service (DMS) via the CreateOperation API, in addition to the local state file. - opRecorder interface + sdkRecorder: record(action, id, state) maps a deployplan action to its SDK operation type and posts it under the deployment version. The serialized config is carried in Operation.State so DMS can serve it as resource state; it is omitted for deletes. - deployActionToSDK rejects non-mutating actions (Skip/Undefined) rather than coercing them, so an unmapped action surfaces as an error. - DeploymentBundle.OpRec is nil unless recording is enabled; the apply loop records after each create/update/recreate/resize/update_id/delete. - A failed recording fails the deploy, matching how local state writes (SaveState) are treated: a state-backend write must not be silently lost. The deployment ID and version that identify the DMS version are sourced from the CreateVersion flow, which lands separately; until then they are placeholders (record_deployment_history is experimental and off by default). Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
d5f35f3 to
465bc4a
Compare
|
|
||
| // Deployment metadata service (DMS): record resource operations. The parent | ||
| // is "deployments/{id}/versions/{n}", so the path tail is captured wholesale. | ||
| server.Handle("POST", "/api/2.0/bundle/{path...}", func(req Request) any { |
There was a problem hiding this comment.
we'll implement this proper in a followup PR when we test the full read - write combined flows for DMS.
| // Record the operation with DMS. Migration only mirrors existing | ||
| // state into the local store; there is no operation to report. | ||
| if !migrateMode { | ||
| if err := b.recordOperation(ctx, resourceKey, action, b.StateDB.GetResourceID(resourceKey), sv.Value); err != nil { |
There was a problem hiding this comment.
@denik I know we discussed using an async queue here - but lets leave that as an optional followup since it's purely perf and not correctness.
…create
Local acceptance tests (direct engine, RecordRequests) that deploy with
experimental.record_deployment_history enabled and assert the recorded
CreateOperation requests:
- jobs/: create -> update -> delete (resource removed from config). Verifies
action_type CREATE/UPDATE/DELETE, that the serialized state is carried for
create/update and omitted for delete, and resource_key/status.
- recreate/: changing a pipeline's immutable 'storage' field records RECREATE.
Adds a testserver handler for POST /api/2.0/bundle/{path...}/operations so the
recorder's CreateOperation calls succeed under the mock server.
Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
4a6d5fa to
e8b9646
Compare
Renaming a catalog changes its ID, exercising the UpdateWithID path. The recorded operation asserts action_type OPERATION_ACTION_TYPE_UPDATE_WITH_ID and that resource_id reflects the new ID after the rename. Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
| if b.Config.Experimental != nil && b.Config.Experimental.RecordDeploymentHistory { | ||
| // TODO(DMS): source the deployment ID and version from the DMS | ||
| // CreateVersion response once the deployment-version flow is wired in. | ||
| b.DeploymentBundle.OpRec = direct.NewOperationRecorder(b.WorkspaceClient(ctx).Bundle, "", 0) |
There was a problem hiding this comment.
will be a followup once the read path for DMS state is implemented.
…recording Use "abcd" instead of "" so the recorded parent is the well-formed deployments/abcd/versions/0 rather than a collapsed double-slash path. Still a placeholder pending the DMS CreateVersion flow. Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
Changing num_workers on a running cluster (lifecycle.started) triggers the Resize path. Asserts action_type OPERATION_ACTION_TYPE_RESIZE. Completes coverage of all mutating actions: create, update, update_id, recreate, resize, delete. Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
Approval status: pending
|
Summary
When
experimental.record_deployment_historyis set, each successfully applied resource operation in the direct engine is reported to the deployment metadata service (DMS) viaCreateOperation, in addition to the local state file.What changed
bundle/direct/oprecorder.go—opRecorderinterface +sdkRecorder:deployplanaction to its SDKOperationActionType. Non-mutating actions (Skip/Undefined) are rejected with an error rather than silently coerced.Operation.Stateso DMS can serve it as resource state; it is omitted for deletes (per the SDK contract).deployments/{deployment_id}/versions/{version}.bundle/direct/pkg.go—DeploymentBundle.OpRec opRecorder, nil unless recording is enabled.bundle/direct/bundle_apply.go— records after each successfulcreate/update/recreate/resize/update_id/delete. Delete reads the resource ID beforeDestroyremoves it from state. Migration mode records nothing.bundle/phases/deploy.go— constructs the recorder indeployCorewhenb.Config.Experimental.RecordDeploymentHistory.libs/testserver— handler forPOST /api/2.0/bundle/{path...}/operationsso recorded calls succeed under the mock server.Acceptance tests
acceptance/bundle/deploy/record-deployment-history/(direct engine,RecordRequests):action_typeCREATE/UPDATE/DELETE, that serializedstateis carried for create/update and omitted for delete, plusresource_key/status.storagerecords RECREATE.Design notes
CreateOperationfails the deploy, matching how localSaveStatefailures are treated — a state-backend write must not be silently lost.CreateVersionflow, which lands separately. Until then they are placeholders;record_deployment_historyis experimental and off by default.Test plan
go test ./bundle/direct/... ./bundle/phases/... ./libs/testserver/...greengo test ./acceptance -run TestAccept/bundle/deploy/record-deployment-historygreengo vet+./task lint-qclean