CHASM: Sync structure after every pure task#8903
Conversation
| // Node get deleted when previous pure tasks of the same component are executed. | ||
| // e.g. via a (parent) pointer. |
There was a problem hiding this comment.
I don't quite follow here how this involves the parent pointer, wouldn't the node only be deleted if some pure task deleted the node's component?
There was a problem hiding this comment.
Without a pointer, I don't think a task executor is able to delete the node itself.
The task executor is given the Component and we don't have any api to delete a component just by it's value (that will be something like chasmContext.Remove(component) probably).
The only way I am aware of for removing a component is parentComponent.Component = NewEmptyField() which needs a parent pointer, or a pointer to an ancestor node
fa35df9 to
0916999
Compare
| // Processed task should become invalid and will be removed upon CloseTransaction(). | ||
|
|
||
| // TODO: Add a validation for that and return an internal error if tasks is still valid after processing. | ||
| // Alternatively, remove task from PureTasks slice after processing, but that requires persisting the |
There was a problem hiding this comment.
I realized that I can't simply remove thing from PureTasks here since the change may not be persisted. I've reverted that for now.
| return err | ||
| } | ||
|
|
||
| cleanedUp, err := node.closeTransactionCleanupInvalidTasks(taskValidationContext) |
There was a problem hiding this comment.
Since we are not removing invalid tasks in EachPureTask(), I think this change is necessary (though technically this can be done in a separate PR but included here as I am fixing pure task processing anyway).
Basically we need to perform task validation for all tasks in the tree to make sure persisted tasks are always valid.
Here's what can happen if we have a invalid persisted task:
Say at T = 0 we have a component C, which has a pure task T with the smallest scheduled timestamp across the tree. A physical pure task is generated for it, and it's taskStatus is set to created.
Later at T = 1, some other node get updated (e.g. its parent P), and somehow invalidated the pure task (e.g. due to access rule). Without the change here, since C is not updated, the pure task T won't be deleted when P is updated.
At T = 2, the physical pure task for T is executed, it sees T is invalid, so doesn't do anything. Now as long as C is not updated, the T with taskStatus = create will always be there. Also it's T has the smallest timestamp, it will prevent any new physical pure task from being generated. Refresh task also doesn't help much here as the invalid T is still there.
With this change the pure task will be deleted at T = 1.
|
I will go ahead and land this first to catch the release. Let me know if there's any comments and I will address them in follow up PRs. |
What changed?
Why?
CHASM: Sync tree structure when processing pure tasks #8400 fixed the bug where a pure task deletes another node with pure tasks. However, a node can also have multiple pure tasks and a pure task can delete the node itself via a pointer, in this case remaining pure tasks should not be executed. To fix that, we will need to perform tree structure sync after every pure task processing and be able to tell after syncing if a node/component still exists or not.
With the value -> Node map we can easily tell if a component still exists or not. It can also simplify some existing implementations where we scan the entire tree to find the node for a Component: e.g. Ref(), component/DataValuePath().
How did you test it?