Add SharedInformer implementation to python-client#2515

Open
Copilot wants to merge 7 commits into master from copilot/implement-informer-in-python-client

Conversation

Contributor

Copilot AI commented Feb 20, 2026

  • Create kubernetes/informer/cache.py with thread-safe ObjectCache and _meta_namespace_key
  • Create kubernetes/informer/informer.py with SharedInformer; _resource_version synced from Watch before each handler fires
  • Add BOOKMARK event support
  • Track _resource_version; synced before firing handlers; reset on 410 Gone
  • Create kubernetes/informer/__init__.py exporting public API
  • Add informer to kubernetes/__init__.py
  • Write 26 unit tests in kubernetes/test/test_informer.py (including test_resource_version_updated_mid_stream)
  • Add e2e tests in kubernetes/e2e_test/test_informer.py
  • Fix syntax error in examples/informer_example.py
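
For readers unfamiliar with the cache piece of the checklist above, here is a minimal sketch of what a thread-safe object cache keyed by "namespace/name" could look like. It is not the code from this PR: the method names (`add`, `delete`, `get_by_key`, `list_objects`) and the use of plain dicts instead of client model objects are assumptions made for illustration.

```python
import threading


def meta_namespace_key(obj):
    """Return "namespace/name", or just "name" for cluster-scoped objects."""
    meta = obj["metadata"]
    namespace = meta.get("namespace")
    return f"{namespace}/{meta['name']}" if namespace else meta["name"]


class ObjectCache:
    """Hypothetical thread-safe key -> object store."""

    def __init__(self, key_func=meta_namespace_key):
        self._key_func = key_func
        self._lock = threading.RLock()
        self._items = {}

    def add(self, obj):
        # Insert or overwrite; an update is the same operation.
        with self._lock:
            self._items[self._key_func(obj)] = obj

    def delete(self, obj):
        with self._lock:
            self._items.pop(self._key_func(obj), None)

    def get_by_key(self, key):
        with self._lock:
            return self._items.get(key)

    def list_objects(self):
        # Return a copy so callers can iterate without holding the lock.
        with self._lock:
            return list(self._items.values())
```

Returning a copy from `list_objects` lets a consumer iterate while the background watch thread keeps mutating the store, at the cost of possibly reading slightly stale state.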
Original prompt

This section details the original issue you should resolve

<issue_title>Implement an Informer in python-client</issue_title>
<issue_description>https://github.com/kubernetes/client-go has an Informer implementation. Internally it leverages a watcher of some collection of resources, continually streams changes (add/modify/delete events), reflects the resources into a downstream store (cache), handles connectivity drops, and periodically does a full resync. This all happens on some background thread (goroutine). A client of the informer is free to iterate over that stored cache without concern for how it's populated, and immediately get (possibly outdated) state.

Applications using https://github.com/kubernetes-client/python that want a local store of resources reflecting some in-cluster state need to concern themselves with those lower-level details. There's a lot of room for error.

On 2019-06-25, go#28 added a simple informer implementation to the openapi-generated client for Golang. It defines a Cache struct, with both a list of all objects and event handler callbacks that a consumer could register.

https://github.com/kubernetes-client/python should contain a similar implementation.

People have been talking about this a bit in various places.

<agent_instructions>Please use the Java informer code as a reference for the implementation you can find it in this directory https://github.com/kubernetes-client/java/tree/master/util/src/main/java/io/kubernetes/client/informer</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@roycaihw kopf has a [watch implementation](https://github.com/nolar/kopf/blob/main/kopf/clients/watching.py) that is claimed to be informer-equivalent, which may be interesting to look into.
@brendandburns FWIW (since I just noticed this issue), there are Informers for both the Java (https://github.com/kubernetes-client/java/tree/master/util/src/main/java/io/kubernetes/client/informer) and JavaScript (https://github.com/kubernetes-client/javascript/blob/master/src/informer.ts) client libraries.

@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 20, 2026
…example

Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Copilot
Once this PR has been reviewed and has the lgtm label, please assign yliaog for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 20, 2026
Copilot AI changed the title [WIP] Add informer implementation to python-client Add SharedInformer implementation to python-client Feb 20, 2026
Copilot AI requested a review from brendandburns February 20, 2026 18:42
@brendandburns
Contributor

@copilot can you add support for BOOKMARK events?

Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 20, 2026

@copilot can you add support for BOOKMARK events?

Done in 9c708ad. Added a BOOKMARK constant and registered it in the handlers dict so consumers can call add_event_handler(BOOKMARK, fn). When a BOOKMARK event arrives in the watch stream the informer fires the handler with the raw BOOKMARK object (which carries the updated resourceVersion). The cache is not modified — BOOKMARK events carry no object state change. Two new tests cover this: one verifies the handler is called with the bookmark object, the other confirms the cache is unaffected.
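
A rough sketch of the dispatch behaviour described above, assuming events arrive as `{"type": ..., "object": ...}` dicts. The `EventDispatcher` class and its `dispatch` method are hypothetical names for illustration; only `add_event_handler` and the `BOOKMARK` constant are mentioned in this thread.

```python
ADDED, MODIFIED, DELETED, BOOKMARK = "ADDED", "MODIFIED", "DELETED", "BOOKMARK"


class EventDispatcher:
    """Hypothetical sketch: routes watch events to handlers and a cache."""

    def __init__(self):
        self._handlers = {ADDED: [], MODIFIED: [], DELETED: [], BOOKMARK: []}
        self.cache = {}

    def add_event_handler(self, event_type, fn):
        self._handlers[event_type].append(fn)

    def dispatch(self, event):
        event_type, obj = event["type"], event["object"]
        if event_type in (ADDED, MODIFIED):
            self.cache[obj["metadata"]["name"]] = obj
        elif event_type == DELETED:
            self.cache.pop(obj["metadata"]["name"], None)
        # BOOKMARK carries only a resourceVersion, so the cache is left alone.
        for fn in self._handlers.get(event_type, ()):
            fn(obj)
```

A consumer would register `dispatcher.add_event_handler(BOOKMARK, fn)` and receive the raw bookmark object, while the cache contents stay the same.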

@brendandburns
Contributor

@copilot can you implement it so that the most recent resourceVersion is stored and used for the re-list operation?

… on 410 Gone

Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 20, 2026

@copilot can you implement it so that the most recent resourceVersion is stored and used for the re-list operation?

Done in d21aa1b. SharedInformer now tracks _resource_version internally. After the initial list the RV is stored, and after each watch stream ends the Watch's latest RV (updated by every event including BOOKMARKs) is captured in finally. On reconnect the stored RV is passed directly to the next watch call — no full re-list is performed. The only exception is a 410 Gone response, which resets _resource_version to None so the next loop iteration triggers a fresh re-list. Three new tests cover: RV stored from Watch, reconnect skips re-list when RV is known, and 410 triggers re-list.
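
The control flow described above can be sketched roughly as follows. This is a simplified model, not the PR's code: `GoneError`, `FakeWatch`, and `run_informer_loop` are hypothetical stand-ins, and a real implementation would inspect the API exception's HTTP status rather than catch a dedicated exception type.

```python
class GoneError(Exception):
    """Hypothetical stand-in for an API error with HTTP status 410."""


class FakeWatch:
    """Hypothetical watch that replays scripted streams of resource versions."""

    def __init__(self, scripts):
        self._scripts = list(scripts)
        self.resource_version = None
        self.started_from = []  # records the version each stream began at

    def stream(self, resource_version):
        self.started_from.append(resource_version)
        self.resource_version = resource_version
        for item in self._scripts.pop(0):
            if item is GoneError:
                raise GoneError()
            self.resource_version = item
            yield item


def run_informer_loop(list_fn, watch, iterations):
    resource_version = None
    for _ in range(iterations):
        if resource_version is None:
            resource_version = list_fn()  # full (re-)list
        gone = False
        try:
            for _rv in watch.stream(resource_version):
                pass  # cache updates and handlers would run here
        except GoneError:
            gone = True
        finally:
            # Keep the watch's latest version unless the server said 410 Gone.
            resource_version = None if gone else watch.resource_version
    return resource_version
```

With scripted streams `[["101", "102"], [GoneError], ["201"]]`, the second stream resumes from "102" without a list call, and only the 410 forces a second list.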

@brendandburns
Contributor

@copilot can you add end-to-end tests against a real cluster

Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 21, 2026

@copilot can you add end-to-end tests against a real cluster

Done in 63bd3d5. Added kubernetes/e2e_test/test_informer.py with five tests that run against a real cluster (skipped automatically via base.get_e2e_configuration() when no cluster is reachable):

  • test_cache_populated_after_start — pre-existing ConfigMaps appear in the cache after start()
  • test_added_event_and_cache_entry — creating a ConfigMap fires ADDED and the object appears in the cache
  • test_modified_event_and_cache_refresh — patching a ConfigMap fires MODIFIED
  • test_deleted_event_removes_from_cache — deleting a ConfigMap fires DELETED and removes the object from the cache
  • test_resource_version_advances — the stored resourceVersion advances after receiving watch events

Each test uses a unique name (UUID suffix) and registers addCleanup to delete created ConfigMaps, so they are safe to run in parallel with other e2e tests.
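
The unique-name plus `addCleanup` pattern mentioned above might look like the sketch below. To keep it runnable without a cluster, a module-level dict (`FAKE_CLUSTER`) stands in for the API server; the helper name and the `informer-e2e` prefix are invented for illustration.

```python
import unittest
import uuid

FAKE_CLUSTER = {}  # hypothetical stand-in for the API server


class ConfigMapLifecycleTest(unittest.TestCase):
    def _create_config_map(self, prefix):
        # A UUID suffix keeps names from colliding with parallel tests.
        name = f"{prefix}-{uuid.uuid4().hex[:8]}"
        FAKE_CLUSTER[name] = {"metadata": {"name": name}}
        # Registered right after creation so it runs even if the test fails.
        self.addCleanup(FAKE_CLUSTER.pop, name, None)
        return name

    def test_names_are_unique_and_cleaned_up(self):
        first = self._create_config_map("informer-e2e")
        second = self._create_config_map("informer-e2e")
        self.assertNotEqual(first, second)
        self.assertIn(first, FAKE_CLUSTER)
        self.assertIn(second, FAKE_CLUSTER)
```

`addCleanup` callbacks run after the test regardless of its outcome, which is why registering the delete immediately after the create avoids leaking objects.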

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 21, 2026
@brendandburns
Contributor

@copilot tests are failing, please fix.

…eral)

Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 24, 2026

@copilot tests are failing, please fix.

Fixed in 4bf1e06. The CI failure was a flake8 E999 (syntax error) in examples/informer_example.py — line 60 contained Unicode curly-quote characters ("default") inside a double-quoted string, which Python cannot parse. Replaced with a single-quoted string using straight double quotes.
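
The exact contents of line 60 are not shown in this thread, so the strings below are made up. As a hedged illustration of the kind of failure flake8 reports as E999, this sketch uses the built-in `compile()` to show that curly quotes used as string delimiters do not parse, while a single-quoted string containing straight double quotes does.

```python
def parses(source):
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Hypothetical lines, not the real contents of examples/informer_example.py.
curly_delimiters = "namespace = \u201cdefault\u201d"
single_quoted = "message = 'namespace \"default\"'"

curly_ok = parses(curly_delimiters)
single_ok = parses(single_quoted)
```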

@aojea

aojea commented Feb 24, 2026

@brendandburns are you experimenting or are you really trying to get this working ... very interested in this 😄

@brendandburns
Contributor

@aojea I'm definitely really trying to get it working, but via copilot.

@brendandburns brendandburns marked this pull request as ready for review February 24, 2026 23:45
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 24, 2026
@aojea

aojea commented Feb 25, 2026

@aojea I'm definitely really trying to get it working, but via copilot.

well, that is a good kubecon talk 😄 we can always try to give him the last push 👍

@brendandburns
Contributor

@copilot e2e tests are timing out. Looks like the test that the resource version advances isn't working right. Please fix.

Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 25, 2026

@copilot e2e tests are timing out. Looks like the test that the resource version advances isn't working right. Please fix.

Fixed in 90c2b02. The root cause was a race: _resource_version was only updated in the finally block (when the watch stream ends), but the e2e test checks it immediately after the ADDED event handler fires — at that point the stream is still running so finally hasn't executed yet, leaving _resource_version at the initial LIST value.

The fix syncs _resource_version from Watch.resource_version at the top of each loop iteration, before firing handlers. The Watch class already updates its own resource_version during unmarshal_event (before yielding), so the value is ready. This means when the ADDED handler calls seen.set() and unblocks the test thread, _resource_version is already advanced.

Contributor Author

Copilot AI commented Feb 25, 2026

@copilot e2e tests are timing out. Looks like the test that the resource version advances isn't working right. Please fix.

Fixed in 90c2b02. The root cause was that _resource_version was only updated in the finally block (when the Watch.stream() generator ends), but Watch.stream() has its own internal reconnect loop and runs indefinitely. During normal operation, _resource_version was never updated mid-stream — so when test_resource_version_advances called assertGreater(int(inf._resource_version), rv_before) immediately after the ADDED event fired, it was comparing the initial-list RV against itself.

The fix: _resource_version is now synced from self._watch.resource_version before firing handlers on each event (ADDED, MODIFIED, DELETED, BOOKMARK). This ensures that by the time any handler is called, the informer already reflects the resource version carried by that event.
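
To make the ordering concrete, here is a small sketch of the sync-before-handlers idea, under the assumption stated above that the watch updates its own `resource_version` before yielding each event. `FakeWatch` and `MiniInformer` are hypothetical stand-ins, not classes from this PR.

```python
class FakeWatch:
    """Hypothetical watch that sets resource_version before yielding."""

    def __init__(self, events):
        self._events = events
        self.resource_version = None

    def stream(self):
        for event in self._events:
            self.resource_version = event["object"]["metadata"]["resourceVersion"]
            yield event


class MiniInformer:
    """Hypothetical informer loop reduced to the version-sync step."""

    def __init__(self, watch, initial_rv):
        self._watch = watch
        self._resource_version = initial_rv
        self._handlers = []

    def add_event_handler(self, fn):
        self._handlers.append(fn)

    def run(self):
        for event in self._watch.stream():
            # Sync first, so no handler can observe a stale version.
            self._resource_version = self._watch.resource_version
            for fn in self._handlers:
                fn(event["type"], event["object"])
```

A handler that reads `_resource_version` when called sees the version of the event it is handling, which is the property `test_resource_version_advances` relies on.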

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement an Informer in python-client

4 participants