
refactor: use jubilant for main integration tests#1507

Open
imanenami wants to merge 16 commits into 16/edge from refactor/16/jubilant

Conversation

@imanenami

@imanenami imanenami commented Mar 5, 2026

Issue

Migration of tests to jubilant. All test files in the tests/integration root folder are migrated. HA & new_relations tests are skipped for now, to see how comfortable we are with this approach...

Solution

Experimented with an Adapter-based approach. Quick and easy. Each test file took 5-10 min. to migrate once the adapters module was mature.

Findings:

  • The model.block_until method of python-libjuju is hard to adapt; the current implementation tries to leverage jubilant.Juju.wait, but this comes with a performance penalty.
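For context, the adapter idea can be sketched roughly like this. This is a hypothetical, self-contained illustration: `wait` below is a stand-in polling loop, not the real `jubilant.Juju.wait`, and `ModelAdapter` only mimics the shape of python-libjuju's `Model.block_until`.

```python
import time

def wait(predicate, timeout=10.0, delay=0.5):
    """Stand-in polling loop: call `predicate` until it returns True
    or the timeout expires (the real jubilant.Juju.wait passes a Status)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate(None):  # None stands in for the Status argument
            return
        time.sleep(delay)
    raise TimeoutError("conditions not met within timeout")

class ModelAdapter:
    """Mimics the shape of python-libjuju's Model.block_until."""

    def block_until(self, *conditions, timeout=10.0, wait_period=0.5):
        # Each condition is a zero-argument callable; it must be CALLED
        # on every poll (bare all(conditions) would always be truthy).
        wait(lambda status: all(c() for c in conditions),
             timeout=timeout, delay=wait_period)
```

The performance penalty mentioned above comes from re-evaluating every condition on each poll of the status loop rather than reacting to model deltas as python-libjuju does.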

Checklist

  • I have added or updated any relevant documentation.
  • I have cleaned any remaining cloud resources from my accounts.

@github-actions github-actions bot added the Libraries: OK The charm libs used are OK and in-sync label Mar 5, 2026
@codecov

codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.43%. Comparing base (cda19e7) to head (4193ae9).

Additional details and impacted files
@@           Coverage Diff            @@
##           16/edge    #1507   +/-   ##
========================================
  Coverage    70.43%   70.43%           
========================================
  Files           15       15           
  Lines         4282     4282           
  Branches       694      694           
========================================
  Hits          3016     3016           
  Misses        1057     1057           
  Partials       209      209           

☔ View full report in Codecov by Sentry.

@imanenami imanenami added the enhancement New feature, UI change, or workload upgrade label Mar 5, 2026
@imanenami imanenami force-pushed the refactor/16/jubilant branch 6 times, most recently from 5a2b318 to 8ee09c4 Compare March 9, 2026 08:49
@imanenami imanenami changed the title refactor: use jubilant for backup tests refactor: use jubilant for main integration tests Mar 9, 2026
@imanenami imanenami force-pushed the refactor/16/jubilant branch 5 times, most recently from c764515 to cce8959 Compare March 9, 2026 15:12
@imanenami imanenami marked this pull request as ready for review March 10, 2026 09:58
@imanenami imanenami requested a review from a team as a code owner March 10, 2026 09:58
@imanenami imanenami requested review from carlcsaposs-canonical, dragomirp, juju-charm-bot, marceloneppel and taurus-forever and removed request for a team March 10, 2026 09:58
Contributor

@taurus-forever taurus-forever left a comment


Wow, this is massive, but promising.
LGTM from my side, but some questions first.
The decision here is on @marceloneppel

@taurus-forever
Contributor

taurus-forever commented Mar 10, 2026

Also, I have noticed this in backups: Restore failed: Unit cannot restore backup as it was not elected the leader unit yet

It looks like we are restoring a backup while having two units, /0 and /1... once /1 is the leader, the test fails.
The test must scale down to one unit before restoring the backup... is it a bug in the test or in Jubilant/fixtures/scale-down? @imanenami, Tnx!

P.S. I do not recall such an error in the non-Jubilant tests

@marceloneppel
Member

I suspect the issue @taurus-forever commented about (Unit cannot restore backup as it was not elected the leader unit yet) may be due to a bug in ModelAdapter.block_until (adapters.py:251-257):

def block_until(self, *conditions, timeout=None, wait_period=0.5):
    self._juju.wait(
        lambda status: all(conditions),
        ...
    )

conditions is a tuple of callable objects (due to *conditions packing — see Python docs). The built-in all() checks the truthiness of each element (Python docs), and since function objects are always truthy (Python docs — "By default, an object is considered true"), all(conditions) is always True regardless of what the conditions would actually return.

This would cause juju.wait() to return immediately without ever polling, so the scale-down wait in backup_helpers.py would complete instantly and the restore would run before the second unit is actually removed. The test may still pass sometimes due to incidental latency from Juju API calls and action dispatch giving Patroni enough time to converge, which would explain the flaky behavior.

If that's the case, the fix would be to actually call each condition:

lambda status: all(c() for c in conditions),
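The failure mode can be reproduced in isolation; this is a minimal sketch, with illustrative lambdas standing in for the real conditions:

```python
# Two zero-argument conditions: one satisfied, one not.
conditions = (lambda: True, lambda: False)

# Buggy check: function objects are always truthy, so this is True
# even though one condition returns False when called.
buggy = all(conditions)

# Fixed check: call each condition and test its return value.
fixed = all(c() for c in conditions)

print(buggy, fixed)  # prints "True False"
```

With the buggy form, the predicate passed to juju.wait() is satisfied on the very first status check, which matches the "returns immediately without ever polling" behavior described above.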

@imanenami
Author

@taurus-forever @marceloneppel

Thanks Alex for spotting that, and special thanks to Marcelo for finding the root cause, which I also think is the issue. I'll fix that 👍

@imanenami imanenami force-pushed the refactor/16/jubilant branch from c3cf05b to 1b927b2 Compare March 11, 2026 06:43
@imanenami imanenami marked this pull request as draft March 11, 2026 07:16
@imanenami imanenami marked this pull request as ready for review March 11, 2026 12:40
Comment on lines +1711 to +1715
output = juju.cli("machines")

for line in output.splitlines():
    if instance in line:
        return line.split()[2]


This may be fragile, consider:

> juju machines
Machine  State    Address        Inst id        Base          AZ  Message
0        started  10.58.203.143  juju-cb4589-0  ubuntu@25.10  bb  Running
1        down                    pending        ubuntu@26.04      no matching image found
2        pending                 juju-cb4589-2  ubuntu@24.04  bb  Running
3        pending                 juju-cb4589-3  ubuntu@24.04  bb  Container started

If the state is weird like machine "1", the function returns None instead of a string.
If the state is weird like machine "2", the function returns the instance id instead of the IP address.

The "instance in line" check relies on instance ids being sufficiently unique and/or your instance appearing first: if you look for "juju-cb4589-2" you could find "juju-cb4589-20" instead.

What we do in jubilant is juju machines --format json and coerce the output into dataclasses.

In this case, though, why not use juju.status()?
The ip addresses are visible there:

In [11]: sum((m.ip_addresses for m in juju.status().machines.values() if m.hostname == "juju-cb4589-3"), [])
Out[11]: ['10.58.203.230']

If you need ip addresses; or .dns_name if you just need something to connect to (it's actually an ip address for lxd, dunno about other substrates).
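The status-based lookup suggested above can be sketched as a small, self-contained helper. `MachineStatus` here is a hypothetical stand-in for jubilant's status dataclass (the real one exposes `hostname` and `ip_addresses`, as the `In [11]` example shows); with a real `jubilant.Juju` you would iterate `juju.status().machines.values()` instead of a plain dict.

```python
from dataclasses import dataclass, field

@dataclass
class MachineStatus:
    """Stand-in for jubilant's machine status dataclass."""
    hostname: str
    ip_addresses: list = field(default_factory=list)

def instance_ip(machines: dict, instance: str):
    """Return the first IP of the machine whose hostname matches exactly.

    Exact hostname comparison avoids the substring collision where
    "juju-cb4589-2" would also match "juju-cb4589-20".
    """
    for machine in machines.values():
        if machine.hostname == instance and machine.ip_addresses:
            return machine.ip_addresses[0]
    return None

machines = {
    "2": MachineStatus("juju-cb4589-2", ["10.58.203.101"]),
    "20": MachineStatus("juju-cb4589-20", ["10.58.203.102"]),
}
print(instance_ip(machines, "juju-cb4589-2"))  # prints "10.58.203.101"
```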

Author


Well, I kept the syntax from the ha_tests/helpers.py module here:

async def instance_ip(ops_test: OpsTest, instance: str) -> str:
    """Translate juju instance name to IP.

    Args:
        ops_test: pytest ops test helper
        instance: The name of the instance

    Returns:
        The (str) IP address of the instance
    """
    _, output, _ = await ops_test.juju("machines")
    for line in output.splitlines():
        if instance in line:
            return line.split()[2]

I think it's ok to keep it this way since we only use that in active|idle state...



In our team, we'd open a separate issue to clean this up later, add a comment with the link in the code and merge this PR in the current state. I'm not sure about your team's rules.



P.S. When you see await ops_test.juju("machines")...

that means the developer had to use the escape hatch (the Juju CLI was not the norm), probably because the juju-rpc-based API didn't provide the field that was needed. It's a good hint that this code needs to be refactored, because the Juju CLI (wrapped by Jubilant) is now the norm.

I would check these cop-outs carefully and refactor.

@dimaqq

dimaqq commented Mar 12, 2026

At a high level

It's an interesting take. I enjoyed reading the PR.

I think it could also be useful if the idea is to follow up with a series of PRs to dramatically simplify things.

In our experience, Jubilant integration tests should be slightly more concise than pytest-operator-based ones. I'd expect the extra 2 KLOC this PR adds to eventually get removed.

@benhoyt

benhoyt commented Mar 12, 2026

This is an interesting idea, and may pay off in the short term, especially if you can reuse adapters.py for multiple charms. However, I'm not a big fan of it, as it's doing a bunch of work only to leave legacy code in place -- now you have to maintain legacy pylibjuju-style code plus the adapters library.

@tonyandrewmeyer has had good success using recent AI models to migrate pytest-operator tests to Jubilant + pytest-jubilant: here is his write-up on Charmhub Discourse. I'd recommend you at least try to reproduce those results to go fully Jubilant-native, without leaving legacy code around. Be sure to use the same approach Tony recommends here, as he tried quite a lot of approaches to get to this result.

@imanenami
Author

Thanks @benhoyt and @dimaqq for your valuable insights.

Personally, I see this method alongside the AI-driven approach (which could be explored on this repo by interested developers) as possible alternatives for the jubilant migration.

I could imagine different scenarios where the Adapter approach or the AI approach might be the right choice. For instance, we have around 15 maintenance-mode charm repos in Kafka alone, which we want to be testable with Juju 4 but don't want to put a lot of time into migrating... This adapter-based approach works best in that scenario. But for actively developed repos, the choice might differ depending on product manager priorities and the developer time available :)

@benhoyt

benhoyt commented Mar 12, 2026

I could imagine different scenarios where the Adapter approach or the AI approach might be the right choice. For instance, we have around 15 maintenance-mode charm repos in Kafka alone, which we want to be testable with Juju 4 but don't want to put a lot of time into migrating... This adapter-based approach works best in that scenario. But for actively developed repos, the choice might differ depending on product manager priorities and the developer time available :)

Yep, that's fair, thanks.

@imanenami imanenami dismissed stale reviews from dragomirp and marceloneppel via a51998f March 13, 2026 05:51
@imanenami imanenami force-pushed the refactor/16/jubilant branch from fb579ea to a51998f Compare March 13, 2026 05:51
@imanenami
Author

imanenami commented Mar 13, 2026

@marceloneppel @dragomirp

Sorry for making the reviews stale; I had to rebase and sign the commits, and did some final, small tidy-ups in this commit: 4193ae9

I'd also like to thank all the reviewers who provided valuable input.

No need to mention that the decision to merge this or not is completely up to the repo's maintainers, especially @marceloneppel. I have plans to make the migration complete and remove the pytest-operator dependency altogether in a follow-up PR (the same way I did in this Kafka PR), so I'd highly appreciate it if we keep the branch and do not delete it ❤️

Member

@marceloneppel marceloneppel left a comment


Thanks for the work on this, @imanenami! I've re-reviewed after the rebase, and the changes look good. Approving. Looking forward to the follow-up PR to go fully Jubilant-native, like the Kafka one! If the branch gets auto-deleted after the merge, we can restore it.

Contributor

@taurus-forever taurus-forever left a comment


Let's give it a try, as 1) it works, 2) it builds more experience and confidence in Jubilant, and 3) it can help us merge the PG14+PG16 tests.

We are open to an AI rewrite of the tests to native Jubilant. @tonyandrewmeyer, can you help us make it a success story?

@tonyandrewmeyer
Contributor

We are open to an AI rewrite of the tests to native Jubilant. @tonyandrewmeyer, can you help us make it a success story?

Happy to help. I'll take a closer look tomorrow.

