_deform_mesh: bump _mesh_version + clear runner nav caches#191

Open
lmoresi wants to merge 1 commit into development from bugfix/deform-cache-invalidation

Member

@lmoresi lmoresi commented May 15, 2026

Summary

Mesh._deform_mesh has two cache-invalidation gaps. Both are triggered by direct _deform_mesh(coords) calls (e.g. every free-surface RK stage in the convection benchmark), which bypass the mesh.X.coords NDArray callback that normally performs this hygiene:

  • _mesh_version not bumped. PR #182 (Consolidate and unify cached spatial indexing (KDTree)) introduced version-keyed kdtree navigation caches (_BaseMeshVariable._get_kdtree, Mesh._get_domain_kdtree) gated on _mesh_version. The mesh.X.coords callback bumps it; direct _deform_mesh() did not, so navigation kdtrees stayed frozen on the undeformed mesh. PR #188 (Invalidate evaluate/DMInterp/topology caches on Mesh._deform_mesh) added _topology_version invalidation here but missed _mesh_version.
  • Runner coord-identity nav caches not cleared. A runner's restore_points_to_domain caches a kdtree keyed on id(mesh.X.coords). _deform_mesh replaces self._coords, but CPython reuses freed ids, so a fresh array can collide with the old id() and the staleness check reports a false negative. Explicitly clearing _restore_kdt / _restore_coords_id defeats the id()-reuse hazard.
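The id()-reuse hazard in the second bullet can be illustrated with a small stand-in. This is only a sketch: the class IdKeyedNavCache, its get/clear methods, and the list used in place of a kdtree are hypothetical; only the attribute names _restore_kdt and _restore_coords_id come from the PR text. Because id() reuse is not deterministic, the collision is simulated by overwriting the stored id rather than relying on the allocator.

```python
class IdKeyedNavCache:
    """Hypothetical stand-in for a runner-side cache keyed on id(coords)."""

    def __init__(self):
        self._restore_kdt = None        # cached lookup structure
        self._restore_coords_id = None  # id() of the coords it was built from
        self.rebuilds = 0

    def get(self, coords):
        # Staleness is judged by object identity only. That is unreliable:
        # CPython may give a new object the id() of one that was freed.
        if self._restore_kdt is None or self._restore_coords_id != id(coords):
            self._restore_kdt = list(coords)  # placeholder for a real kdtree
            self._restore_coords_id = id(coords)
            self.rebuilds += 1
        return self._restore_kdt

    def clear(self):
        # Explicit invalidation does not depend on id() staying unique.
        self._restore_kdt = None
        self._restore_coords_id = None


cache = IdKeyedNavCache()
old_coords = [(0.0, 0.0), (1.0, 0.0)]
cache.get(old_coords)

# Simulate the worst case: the new array happens to carry the cached id.
new_coords = [(0.0, 0.5), (1.0, 0.5)]
cache._restore_coords_id = id(new_coords)
stale = cache.get(new_coords)   # identity check passes, old data comes back

cache.clear()
fresh = cache.get(new_coords)   # rebuild is forced
```

In this toy, the identity check alone cannot tell the two arrays apart once the ids match, whereas clear() forces a rebuild regardless of what id() returns.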

Brings _deform_mesh into line with the cache hygiene mesh.adapt() and _legacy_access already perform.

Test plan

  • _mesh_version increments on a direct _deform_mesh() call (was frozen at 0 before the fix)
  • Existing mesh/smoother test suites pass
  • Parallel smoke (np=2) of a deforming-mesh case
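The first test-plan item could be expressed as a small reusable check. The sketch below is an assumption about how such a test might look, not the test in this PR: check_version_bumps, the stub object, and the two deform functions are hypothetical, and the stub stands in for a real mesh whose construction is not shown here.

```python
from types import SimpleNamespace


def check_version_bumps(mesh, deform, new_coords):
    """Return True if calling deform(new_coords) raises mesh._mesh_version."""
    before = mesh._mesh_version
    deform(new_coords)
    return mesh._mesh_version > before


# Hypothetical stand-in for a mesh; only `_mesh_version` is a name from the PR.
stub = SimpleNamespace(_mesh_version=0, _coords=None)


def fixed_deform(coords):
    # Behaviour after the fix: replace coords and bump the version.
    stub._coords = coords
    stub._mesh_version += 1


def unfixed_deform(coords):
    # Behaviour before the fix: replace coords, leave the version alone.
    stub._coords = coords


bumped_with_fix = check_version_bumps(stub, fixed_deform, [1.0])
bumped_without_fix = check_version_bumps(stub, unfixed_deform, [2.0])
```

Against the real library, the same helper would presumably be called with the mesh and its bound _deform_mesh method in place of the stub and fixed_deform.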

Note: this is an independent correctness fix. It does not by itself resolve the separate free-surface convection feedback regression under investigation (verified: a clean build with both fixes still reproduces the damped regime).

Underworld development team with AI support from Claude Code

Two cache-invalidation gaps in Mesh._deform_mesh, both exposed by
direct _deform_mesh(coords) calls (e.g. every free-surface RK
stage in the convection benchmark), which bypass the
mesh.X.coords NDArray callback that normally performs this
hygiene:

1. _mesh_version was not incremented. PR #182 introduced
   version-keyed kdtree navigation caches
   (_BaseMeshVariable._get_kdtree, Mesh._get_domain_kdtree) that
   gate their rebuild on _mesh_version. The mesh.X.coords callback
   bumps it; direct _deform_mesh() did not. Result: navigation
   kdtrees stay frozen on the undeformed mesh, so spatial lookups
   return pre-deform DOFs after the geometry has moved. PR #188
   added _topology_version invalidation here but missed
   _mesh_version.

2. User-installed coord-identity nav caches were not cleared. A
   runner's restore_points_to_domain typically caches a kdtree
   keyed on id(mesh.X.coords). _deform_mesh replaces self._coords
   with a new object, but CPython reuses freed ids, so a fresh
   coords array can collide with the old id() and the staleness
   check false-negatives. Explicitly clearing _restore_kdt /
   _restore_coords_id defeats the id()-reuse hazard.
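Item 1 can be reproduced with a toy version-gated cache. This is a minimal sketch under stated assumptions: ToyMesh, _nav_cache, _get_nav_snapshot, nearest and the bump flag are all hypothetical, a plain list replaces the kdtree, and only the counter name _mesh_version is taken from the commit message.

```python
class ToyMesh:
    """Hypothetical 1-D stand-in; only `_mesh_version` is a name from the PR."""

    def __init__(self, coords):
        self._coords = list(coords)
        self._mesh_version = 0
        self._nav_cache = None  # (version, snapshot of coords)

    def _get_nav_snapshot(self):
        # Rebuild only when the stored version differs from the current one.
        if self._nav_cache is None or self._nav_cache[0] != self._mesh_version:
            self._nav_cache = (self._mesh_version, list(self._coords))
        return self._nav_cache[1]

    def nearest(self, x):
        # Brute-force nearest point, standing in for a kdtree query.
        pts = self._get_nav_snapshot()
        return min(range(len(pts)), key=lambda i: abs(pts[i] - x))

    def deform(self, coords, bump=True):
        self._coords = list(coords)
        if bump:
            self._mesh_version += 1  # the step this commit adds


mesh = ToyMesh([0.0, 1.0, 2.0])
mesh.nearest(0.9)                          # warms the cache

mesh.deform([0.0, 5.0, 6.0], bump=False)
stale_hit = mesh.nearest(0.9)              # answers from the old geometry

mesh.deform([0.0, 5.0, 6.0], bump=True)
fresh_hit = mesh.nearest(0.9)              # cache rebuilt on the new geometry
```

Without the bump the lookup still returns index 1, the nearest point on the undeformed mesh; with the bump it returns index 0, which is correct for the moved coordinates.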

Verified: _mesh_version now increments on a direct
_deform_mesh() call (was frozen at 0 before). Matches the cache
hygiene mesh.adapt() and _legacy_access already perform; brings
_deform_mesh into line. Independent correctness fix; does not by
itself resolve the separate free-surface convection feedback
regression under investigation.

Copilot AI review requested due to automatic review settings May 15, 2026 22:36
Contributor

Copilot AI left a comment

Pull request overview

This PR closes two cache invalidation gaps in Mesh._deform_mesh() that occur when _deform_mesh(coords) is called directly (bypassing the mesh.X.coords NDArray callback), ensuring version-gated navigation caches and runner-installed navigation helpers don’t remain stale after mesh deformation.

Changes:

  • Increment self._mesh_version inside _deform_mesh() so version-keyed KDTree navigation caches rebuild on geometry updates.
  • Clear runner/user-installed navigation cache attributes (_restore_kdt, _restore_coords_id) to avoid stale reuse due to CPython id() reuse.
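Taken together, the two changes amount to one invalidation step. The sketch below shows one possible shape for it; it is not the code in this PR. The helper invalidate_nav_caches and the DemoMesh class are hypothetical, and the sketch assumes the runner-installed attributes live on the mesh-like object itself, which the excerpt does not confirm.

```python
def invalidate_nav_caches(obj, attrs=("_restore_kdt", "_restore_coords_id")):
    """Bump the geometry version and reset any nav-cache attributes present.

    Assumption: the cache attributes are set directly on `obj`.
    Returns the names of the attributes that were reset.
    """
    obj._mesh_version = getattr(obj, "_mesh_version", 0) + 1
    cleared = []
    for name in attrs:
        if hasattr(obj, name):
            setattr(obj, name, None)
            cleared.append(name)
    return cleared


class DemoMesh:
    """Hypothetical object carrying a version counter and one runner cache."""

    def __init__(self):
        self._mesh_version = 3
        self._restore_kdt = object()  # pretend a runner installed this


demo = DemoMesh()
cleared = invalidate_nav_caches(demo)
```

Checking hasattr before resetting keeps the helper harmless when no runner has installed a cache, so it can run unconditionally on every deformation.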

Comment on lines +1847 to +1860
# Bump the geometry-version counter so version-keyed
# kdtree navigation caches rebuild against the new DOF
# positions: _BaseMeshVariable._get_kdtree and
# Mesh._get_domain_kdtree both gate their rebuild on
# `_mesh_version`. PR #182 introduced those version-keyed
# caches; the mesh.X.coords callback path bumps
# _mesh_version, but direct _deform_mesh() calls (every
# free-surface RK stage) bypass that callback. Without
# this bump the navigation kdtrees stay frozen on the
# undeformed mesh — back-advected SL samples land at the
# wrong DOFs, corrupting the temperature field. PR #188
# added _topology_version invalidation but missed this.
self._mesh_version += 1
# Also nuke any *user-installed* navigation caches that