-
Notifications
You must be signed in to change notification settings - Fork 974
Flaky fixes #8778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
rustyrussell
wants to merge
39
commits into
ElementsProject:master
Choose a base branch
from
rustyrussell:guilt/flaky-fixes
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Flaky fixes #8778
rustyrussell
wants to merge
39
commits into
ElementsProject:master
from
rustyrussell:guilt/flaky-fixes
+329
−462
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
11ca55c to
d21a24c
Compare
36b06b0 to
d68a2fc
Compare
…MACHINE. We used to just run these without valgrind, but we already run them in CI (which sets SLOW_MACHINE) without valgrind, so this just doubles up. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is covering up real bugs; we need to fix CI instead. From https://pypi.org/project/pytest-rerunfailures/#re-run-all-failures: ``` To re-run all test failures, use the --reruns command line option with the maximum number of times you’d like the tests to run: $ pytest --reruns 5 ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This shows up as a flake in test_route_by_old_scid:
```
# Now restart l2, make sure it remembers the original!
l2.restart()
> l2.rpc.connect(l1.info['id'], 'localhost', l1.port)
tests/test_splicing.py:554:
...
> raise RpcError(method, payload, resp['error'])
E pyln.client.lightning.RpcError: RPC call failed: method: connect, payload: {'id': '0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518', 'host': 'localhost', 'port': 33837}, error: {'code': 400, 'message': 'Unable to connect, no address known for peer'}
```
This is because it's already (auto)connecting, and fails. This
failure is reported, before we've added the new address (once we add
the new address, we connect fine, but it's too late!):
```
lightningd-2 2025-12-08T02:39:18.241Z DEBUG gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds
lightningd-2 2025-12-08T02:39:18.320Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Initializing important peer with 0 addresses
lightningd-2 2025-12-08T02:39:18.320Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Failed connected out: Unable to connect, no address known for peer
lightningd-2 2025-12-08T02:39:18.344Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Will try reconnect in 1 seconds
lightningd-2 2025-12-08T02:39:18.344Z DEBUG 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-connectd: Initializing important peer with 1 addresses
lightningd-2 2025-12-08T02:39:18.344Z DEBUG 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-connectd: Connected out, starting crypto
lightningd-2 2025-12-08T02:39:18.344Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Adding 1 addresses to important peer
lightningd-2 2025-12-08T02:39:18.345Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Connected out, starting crypto
{'run_id': 256236335046680576, 'github_repository': 'ElementsProject/lightning', 'github_sha': '555f1ac461d151064aad6fc83b94a0290e2e9d5d', 'github_ref': 'refs/pull/8767/merge', 'github_ref_name': 'HEAD', 'github_run_id': 20013689862, 'github_head_ref': 'fixup-backfill-bug', 'github_run_number': 14774, 'github_base_ref': 'master', 'github_run_attempt': '1', 'testname': 'test_route_by_old_scid', 'start_time': 1765161493, 'end_time': 1765161558, 'outcome': 'fail'}
lightningd-2 2025-12-08T02:39:18.421Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_ECDH_REQ
lightningd-2 2025-12-08T02:39:18.421Z DEBUG hsmd: Client: Received message 1 from client
lightningd-2 2025-12-08T02:39:18.453Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_ECDH_REQ
lightningd-2 2025-12-08T02:39:18.453Z DEBUG hsmd: Client: Received message 1 from client
--------------------------- Captured stdout teardown ---------------------------
```
We can still get a warning: lightningd-1 2025-12-10T01:11:07.232Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: Received WIRE_WARNING: WARNING: channel_announcement: no unspent txout 109x1x1 This has nothing to do with l1 talking about the original channel (which would be 103x1x): it's because l2's gossipd (being the node which does the splice) immediately forgets the pre-splice id. If l1 sends some gossip, it will get a warning message. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since this was written, we now test if remote side would get into this situation and stop it from happening, so the test doesn't work any more. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
``` 2025-12-10T02:51:06.2435409Z [gw1] [ 77%] ERROR tests/test_plugin.py::test_commando ...lightningd-1: had BROKEN messages ... 2025-12-10T03:00:26.0440311Z lightningd-1 2025-12-10T02:51:01.548Z UNUSUAL jsonrpc#69: That's weird: Request checkrune took 5961 milliseconds ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
signpsbt could be the one which takes a long time, so allow any psbt event. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…essages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We restart the nodeL if the coin_movements.py plugin hasn't processed the notification yet, it will be incorrect: ``` > assert account_balance(l2, chanid_1) == 100001001 E AssertionError: assert 150_001_001msat == 100_001_001 E + where 150001001msat = account_balance(<fixtures.LightningNode object at 0x7f0634e1eb00>, '39ac52c818c5304cf0664940ff236c4e3f8f4ceb8993cb1491347142d61b62bc') ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Not clear why it was disabled, but it never got re-enabled. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We didn't log when anchor transactions had short signatures,
which causes this test to not assert (did_short_sig):
```
total_feerate_perkw = total_fees / total_weight * 1000
> check_feerate([l3, l2], total_feerate_perkw, feerate)
tests/test_closing.py:4063:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
nodes = [<fixtures.LightningNode object at 0x7f0fb322bb20>, <fixtures.LightningNode object at 0x7f0fb1b5ead0>]
actual_feerate = 14006.105538595726, expected_feerate = 14000
def check_feerate(nodes, actual_feerate, expected_feerate):
# Feerate can't be lower.
assert actual_feerate > expected_feerate - 2
if actual_feerate >= expected_feerate + 2:
if any([did_short_sig(n) for n in nodes]):
return
# Use assert as it shows the actual values on failure
> assert actual_feerate < expected_feerate + 2
E AssertionError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't want it to think that it can use both pre-splice and post-splice channels! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means that we won't complain to peers which gossip about our channels, but it does mean that our channel graph (like other nodes on the network) will show two channels, not one, for the duration. For this reason, we need askrene to omit local dying channels. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We removed the functionality, but only disabled the tests. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
> ret = l1.rpc.xpay(invstring=inv, maxfee=maxfee)
tests/test_xpay.py:585:
...
> raise RpcError(method, payload, resp['error'])
E pyln.client.lightning.RpcError: RPC call failed: method: xpay, payload: {'invstring': 'lnbcrt1m1p5n5wdzsp5qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqpp52mu6842a26hs40xxgzscflm4smk5yjctqgf7hhrwhx7dh2492vzsdp22pshj6twvusxummyv5sr2wfqwa5hg6pqd4shsen9v5cqpj9qyyqqqj9kvjvrzy0ygdsfsskjtss0xrkrt7lq8jyrgzvtvdw5y9xqy0v25ddxd787c9ym36udm876lyhevznj8j9qzk0r7x03xm0akvq6ltwcq7vm7tk', 'maxfee': 57966}, error: {'code': 209, 'message': "Failed after 4 attempts. We got temporary_channel_failure for 59x81x28707/1, assuming it can't carry 49498813msat. Then routing for remaining 49498813msat failed: linear_routes: timed out after deadline"}
...
lightningd-1 2025-12-11T03:25:41.972Z DEBUG plugin-cln-askrene: notify msg debug: get_routes failed after 15572 ms
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Under Postgres, this actually takes more than 2 seconds, so w2
really has timed out already:
```
time.sleep(2) # total 2
assert not w1.done()
> assert not w2.done()
E assert not True
E + where True = done()
E + where done = <Future at 0x7fe14e54fee0 state=finished raised RpcError>.done
tests/test_invoices.py:420: AssertionError
```
So space the timeouts out more, and sleep one second too short; the
.result() (which sleeps) will catch up if we were extremely slow.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
3974806 added this: CI: Try not running group 2/10 UBSAN in parallel. It's being killed with signal 143, which means docker isn't happy; too much memory consumption? But since we're now at 12 groups, that probably doesn't apply (it might not have even before, in the two years since that commit since so may things have been added). And it caused this shard to take over 2 hours and timed out. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's not reliable:
```
# We should have deferred hook update at least once!
> l2.daemon.wait_for_log("UNUSUAL plugin-dep_b.py: Deferring registration of hook htlc_accepted until it's not in use.")
tests/test_plugin.py:2646:
...
if self.is_in_log(r):
print("({} was previously in logs!)".format(r))
> raise TimeoutError('Unable to find "{}" in logs.'.format(exs))
E TimeoutError: Unable to find "[re.compile("UNUSUAL plugin-dep_b.py: Deferring registration of hook htlc_accepted until it's not in use.")]" in logs.
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Do this by setting notleak when we do the detach! ``` **BROKEN** lightningd: MEMLEAK: 0x60f0000bbb38 **BROKEN** lightningd: label=ccan/ccan/io/io.c:92:struct io_conn **BROKEN** lightningd: alloc: **BROKEN** lightningd: /home/runner/work/lightning/lightning/ccan/ccan/tal/tal.c:488 (tal_alloc_) **BROKEN** lightningd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:92 (io_new_conn_) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/subd.c:781 (new_subd) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/subd.c:835 (new_channel_subd_) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/channel_control.c:1715 (peer_start_channeld) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/peer_control.c:1390 (connect_activate_subd) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/peer_control.c:1516 (peer_connected_hook_final) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/plugin_hook.c:243 (hook_done) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/plugin_hook.c:343 (plugin_hook_call_next) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/plugin_hook.c:299 (plugin_hook_callback) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/plugin.c:701 (plugin_response_handle) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/plugin.c:790 (plugin_read_json) **BROKEN** lightningd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:60 (next_plan) **BROKEN** lightningd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:422 (do_plan) **BROKEN** lightningd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:439 (io_ready) **BROKEN** lightningd: /home/runner/work/lightning/lightning/ccan/ccan/io/poll.c:470 (io_loop) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/io_loop_with_timers.c:22 (io_loop_with_timers) **BROKEN** lightningd: /home/runner/work/lightning/lightning/lightningd/lightningd.c:1492 (main) **BROKEN** lightningd: ../sysdeps/nptl/libc_start_call_main.h:58 (__libc_start_call_main) **BROKEN** lightningd: ../csu/libc-start.c:392 (__libc_start_main_impl) **BROKEN** lightningd: parents: **BROKEN** lightningd: lightningd/lightningd.c:108:struct lightningd ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If one second has passed during testing, checkrune might pass:
```
# default (sec)
rune_per_default = l1.rpc.createrune(restrictions=[["per=1"]])['rune']
assert rune_per_default == 'NrM7go6C4qzfRQDkUSv1DtRroJWSKqdjIOuvGS4TLFE9NCZwZXI9MQ=='
> do_test_rune_per_restriction(l1, rune_per_default, 1)
tests/test_runes.py:269:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
l1 = <fixtures.LightningNode object at 0x7f7344ad2ef0>
rune_to_test = 'NrM7go6C4qzfRQDkUSv1DtRroJWSKqdjIOuvGS4TLFE9NCZwZXI9MQ=='
per_sec = 1
def do_test_rune_per_restriction(l1, rune_to_test, per_sec):
...
# cannot use same rune till 'per_sec' seconds
> with pytest.raises(RpcError, match='Not permitted:') as exc_info:
E Failed: DID NOT RAISE <class 'pyln.client.lightning.RpcError'>
tests/test_runes.py:217: Failed
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
``` 2026-01-05T00:11:22.0447771Z # Only up to one should succeed. 2026-01-05T00:11:22.0448201Z success = False 2026-01-05T00:11:22.0448571Z for c in completes: 2026-01-05T00:11:22.0448957Z try: 2026-01-05T00:11:22.0449322Z c.result(TIMEOUT) 2026-01-05T00:11:22.0449934Z num_complete += 1 2026-01-05T00:11:22.0450378Z > assert not success 2026-01-05T00:11:22.0451005Z E assert not True 2026-01-05T00:11:22.0451201Z 2026-01-05T00:11:22.0451331Z tests/test_connection.py:1596: AssertionError ``` We don't know *which* ones succeeded. Fix that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
cf27ddc to
56f126c
Compare
…p_lost_node
The channel vanishes from listchannels when it's dying, *but* only when it gets deleted
do we consider moving the actual node_announcement. We have to wait until gossipd
has seen the 12 blocks, and move it if necessary.
```
E Full diff:
E {
E 'nodes': [
E - {
E - 'addresses': [],
E - 'alias': 'SILENTARTIST-e986cd2-modded',
E - 'color': '022d22',
E - 'features': '808898880a8a59a1',
E - 'last_timestamp': 1767572731,
E - 'nodeid': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59',
E - },
E {
E 'addresses': [],
E 'alias': 'HOPPINGFIRE-e986cd2-modded',
E 'color': '035d2b',
E 'features': '808898880a8a59a1',
E 'last_timestamp': 1767572731,
E 'nodeid': '035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d',
E },
E {
E 'addresses': [],
E 'alias': 'JUNIORFELONY-e986cd2-modded',
E 'color': '0382ce',
E 'features': '808898880a8a59a1',
E 'last_timestamp': 1767572731,
E 'nodeid': '0382ce59ebf18be7d84677c2e35f23294b9992ceca95491fcf8a56c6cb2d9de199',
E },
E {
E 'addresses': [],
E + 'alias': 'SILENTARTIST-e986cd2-modded',
E + 'color': '022d22',
E + 'features': '808898880a8a59a1',
E + 'last_timestamp': 1767572731,
E + 'nodeid': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59',
E + },
E + {
E + 'addresses': [],
E 'alias': 'JUNIORBEAM-e986cd2-modded',
E 'color': '0266e4',
E 'features': '808898880a8a59a1',
E 'last_timestamp': 1767572732,
E 'nodeid': '0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518',
E },
E ],
E }
tests/test_gossip.py:2390: AssertionError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
``` [gw0] [ 24%] PASSED tests/test_misc.py::test_hsm_capabilities tests/test_connection.py::test_funding_cancel_race Error: Process completed with exit code 143. ``` Seems like 100 nodes is too many! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than break the API, use total capacity here. ``` Valgrind error file: valgrind-errors.5880 ==5880== Use of uninitialised value of size 8 ==5880== at 0x4A390BB: _itoa_word (_itoa.c:183) ==5880== by 0x4A43C9B: __printf_buffer (vfprintf-process-arg.c:155) ==5880== by 0x4A69D90: vsnprintf (vsnprintf.c:96) ==5880== by 0x1875E6: json_out_addv (json_out.c:239) ==5880== by 0x14471E: json_add_primitive_fmt (json_stream.c:170) ==5880== by 0x144BA2: json_add_u64 (json_stream.c:282) ==5880== by 0x145E33: json_add_amount_msat (json_stream.c:619) ==5880== by 0x11DDE2: channel_hint_to_json (channel_hint.c:33) ==5880== by 0x11FE9F: channel_hint_notify_core (libplugin-pay.c:394) ==5880== by 0x11FF7A: channel_hint_notify (libplugin-pay.c:412) ==5880== by 0x1201EA: channel_hints_update (libplugin-pay.c:455) ==5880== by 0x122DAF: handle_intermediate_failure (libplugin-pay.c:1437) ==5880== ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
``` 2026-01-06T07:46:35.5710043Z lightningd-1 2026-01-06T07:45:11.040Z UNUSUAL jsonrpc#68: That's weird: Request signpsbt took 5099 milliseconds ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This showed up as a flake, where we "got lucky" and the sendpay resolved before waitsendpay was called. Instead, make this race explicit, so we can test it.
```
# FIXME: #define PAY_UNPARSEABLE_ONION 202
PAY_UNPARSEABLE_ONION = 202
> assert err.value.error['code'] == PAY_UNPARSEABLE_ONION
E assert 204 == 202
tests/test_misc.py:2152: AssertionError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't explicitly save the return code in db, so we need to reconstruct it. We didn't cover the "peer told us our onion was bad" corner case. But it's hardly worth a changelog message, since users will never see this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It failed, because it got the message before connectd has processed
the updated allow list:
```
lightningd-2 2026-01-06T07:53:02.817Z INFO plugin-allow_even_msgs.py: Killing plugin: stopped by lightningd via RPC
...
lightningd-1 2026-01-06T07:53:02.820Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-lightningd: sendcustommsg id="-c:sendcustommsg#16" sending a custom even message (43690)
...
lightningd-1 2026-01-06T07:53:02.820Z 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: [OUT] aaaaffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffbb
lightningd-2 2026-01-06T07:53:02.823Z 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: [IN] aaaaffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffbb
...
lightningd-2 2026-01-06T07:53:02.823Z DEBUG connectd: Now allowing 0 custom message types
```
Resulting in:
``
l2.daemon.wait_for_log(r'\[IN\] {}'.format(msg))
> l1.daemon.wait_for_log('Invalid unknown even msg')
tests/test_misc.py:4673:
...
> raise TimeoutError('Unable to find "{}" in logs.'.format(exs))
E TimeoutError: Unable to find "[re.compile('Invalid unknown even msg')]" in logs.
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's a real bug, which I've reported in ElementsProject#8822 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sometimes it times out under CI, but running 100 times here reveals
nothing. Assume network issues and mark it flaky.
```
node_factory = <pyln.testing.utils.NodeFactory object at 0x7f6e700b8970>
@pytest.mark.slow_test
def test_reckless_uv_install(node_factory):
node = get_reckless_node(node_factory)
node.start()
> r = reckless([f"--network={NETWORK}", "-v", "install", "testpluguv"],
dir=node.lightning_dir)
tests/test_reckless.py:358:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/test_reckless.py:141: in reckless
r = subprocess.run(cmds, capture_output=True, encoding='utf-8', env=my_env,
/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/subprocess.py:505: in run
stdout, stderr = process.communicate(input, timeout=timeout)
/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/subprocess.py:1154: in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/subprocess.py:2022: in _communicate
self._check_timeout(endtime, orig_timeout, stdout, stderr)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Popen: returncode: -9 args: ['tools/reckless', '-l', '/tmp/ltests-kr4cjtd8/...>
endtime = 4623.246515403, orig_timeout = 60
stdout_seq = [b'[2026-01-07 05:55:15,159] DEBUG: Warning: Reckless requires write access\n[2026-01-07 05:55:15,159] DEBUG: Searching for testpluguv\n', b'[2026-01-07 05:55:15,179] DEBUG: InstInfo(testpluguv, https://github.com/lightningd/plugins, None, None, None, testpluguv), Source.GITHUB_REPO\nfound testpluguv in source: https://github.com/lightningd/plugins\n[2026-01-07 05:55:15,179] DEBUG: entry: None\n[2026-01-07 05:55:15,179] DEBUG: sub-directory: testpluguv\n[2026-01-07 05:55:15,179] DEBUG: Retrieving testpluguv from https://github.com/lightningd/plugins\n[2026-01-07 05:55:15,179] DEBUG: Install requested from InstInfo(testpluguv, https://github.com/lightningd/plugins, None, None, None, testpluguv).\n', b'cloning Source.GITHUB_REPO InstInfo(testpluguv, https://github.com/lightningd/plugins, None, None, None, testpluguv)\n[2026-01-07 05:55:15,405] DEBUG: cloned_src: InstInfo(testpluguv, /tmp/reckless-433081020a3dff932/clone, None, testpluguv.py, uv.lock, testpluguv/testpluguv)\n', b'[2026-01-07 05:55:15,409] DEBUG: using latest commit of default branch\n', b'[2026-01-07 05:55:15,417] DEBUG: checked out HEAD: 095457638f8080cd614a81cb4ad1cba7883549e3\n[2026-01-07 05:55:15,417] DEBUG: using installer pythonuv\n[2026-01-07 05:55:15,417] DEBUG: creating /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/testpluguv\n[2026-01-07 05:55:15,418] DEBUG: creating /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/testpluguv/source\n[2026-01-07 05:55:15,418] DEBUG: copying /tmp/reckless-433081020a3dff932/clone/testpluguv/testpluguv tree to /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/testpluguv/source/testpluguv\n[2026-01-07 05:55:15,419] DEBUG: linking source /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/testpluguv/source/testpluguv/testpluguv.py to /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/testpluguv/testpluguv.py\n[2026-01-07 05:55:15,419] DEBUG: InstInfo(testpluguv, /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/testpluguv, None, testpluguv.py, uv.lock, source/testpluguv)\n']
stderr_seq = [b'config file not found: /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/liquid-regtest/config\npress [Y] to create one now.\nconfig file not found: /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/liquid-regtest-reckless.conf\nconfig file not found: /tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1/reckless/.sources\n']
skip_check_and_raise = False
def _check_timeout(self, endtime, orig_timeout, stdout_seq, stderr_seq,
skip_check_and_raise=False):
"""Convenience for checking if a timeout has expired."""
if endtime is None:
return
if skip_check_and_raise or _time() > endtime:
> raise TimeoutExpired(
self.args, orig_timeout,
output=b''.join(stdout_seq) if stdout_seq else None,
stderr=b''.join(stderr_seq) if stderr_seq else None)
E subprocess.TimeoutExpired: Command '['tools/reckless', '-l', '/tmp/ltests-kr4cjtd8/test_reckless_uv_install_1/lightning-1', '--network=liquid-regtest', '-v', 'install', 'testpluguv']' timed out after 60 seconds
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
pay sometimes ignores exclusions. WONTFIX.
```
with pytest.raises(RpcError, match=r'is not reachable directly and all routehints were unusable.'):
> l1.rpc.pay(inv, exclude=[scid12])
tests/test_pay.py:5279:
...
elif "error" in resp:
> raise RpcError(method, payload, resp['error'])
E pyln.client.lightning.RpcError: RPC call failed: method: pay, payload: {'bolt11': 'lnbcrt1230n1p54mma3sp5x7uerjgyg7ws6fnzdwxc7pgpj6j25uhpqp5uvx3fk8dkcqm37m2spp5k02racjc9knux958u5rgtva24jfvxtr5w3t53pfeavn3thmyny0qdq8v3jhxccxqyjw5qcqp9rzjqgkjyd3q5dv6gllh77kygly9c3kfy0d9xwyjyxsq2nq3c83u5vw4jqqqvuqqqqgqqqqqqqqpqqqqqzsqqc9qxpqysgqcmv875mmzcjl8mwxxndy9an6p870ffpdxdtypmgf5gzsydnt2d68n4kjph0rcprye6tfz0ex0c5clgj3zwm8jgd5vs0fdv7hf7dqr8cqdrg3gf', 'exclude': ['103x2x0/1']}, error: {'code': 210, 'message': 'Ran out of routes to try after 6 attempts: see `paystatus`', 'attempts': [{'status': 'failed', 'failreason': 'No path found', 'partid': 0, 'amount_msat': 123000}, {'status': 'pending', 'failreason': 'No path found', 'partid': 1, 'amount_msat': 123000, 'parent_partid': 0}, {'status': 'failed', 'failreason': 'No path found', 'partid': 2, 'amount_msat': 57006, 'parent_partid': 1}, {'status': 'failed', 'failreason': 'No path found', 'partid': 4, 'amount_msat': 57006, 'parent_partid': 2}, {'status': 'failed', 'failreason': 'No path found', 'partid': 3, 'amount_msat': 65994, 'parent_partid': 1}, {'status': 'failed', 'failreason': 'No path found', 'partid': 5, 'amount_msat': 65994, 'parent_partid': 3}]}
```
The logs show that it doesn't exclude the routehint early: in successful runs we get "After filtering routehints we're left with 0 usable hints". Perhaps this is something to do with the timing of our own notifications?
```
2026-01-07T05:51:10.7902502Z lightningd-1 2026-01-07T05:31:29.706Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Received getchaininfo blockcount=108, headercount=108
2026-01-07T05:51:10.7903334Z lightningd-1 2026-01-07T05:31:29.715Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: waitblockheight reports syncheight=108
2026-01-07T05:51:10.7904256Z lightningd-1 2026-01-07T05:31:29.734Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Updated a channel hint for 103x2x0/1: enabled true, estimated capacity 978718000msat
2026-01-07T05:51:10.7905355Z lightningd-1 2026-01-07T05:31:29.734Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Updated a channel hint for 7269357x11669990x33910/1: enabled false, estimated capacity UNKNOWN
2026-01-07T05:51:10.7906580Z lightningd-1 2026-01-07T05:31:29.735Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Updated a channel hint for 103x2x0/1: enabled false, estimated capacity UNKNOWN
2026-01-07T05:51:10.7907665Z lightningd-1 2026-01-07T05:31:29.735Z INFO plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Payment fee constraint 615msat is below exemption threshold, allowing a maximum fee of 5000msat
2026-01-07T05:51:10.7908845Z lightningd-1 2026-01-07T05:31:29.752Z DEBUG plugin-pay: Received a channel_hint {.scid = 103x2x0/1, .enabled = 1, .estimate = 978718000msat, .capacity = 1000000000msat }
2026-01-07T05:51:10.7909710Z lightningd-1 2026-01-07T05:31:29.754Z INFO plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Filtering out 1 routehints
2026-01-07T05:51:10.7910544Z lightningd-1 2026-01-07T05:31:29.779Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Checking hint {.scid=103x2x0/1, .enabled=1, .estimate=978718000msat}
2026-01-07T05:51:10.7911470Z lightningd-1 2026-01-07T05:31:29.780Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: After filtering routehints we're left with 1 usable hints
2026-01-07T05:51:10.7912385Z lightningd-1 2026-01-07T05:31:29.780Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Checking hint {.scid=103x2x0/1, .enabled=1, .estimate=978718000msat}
2026-01-07T05:51:10.7913471Z lightningd-1 2026-01-07T05:31:29.780Z DEBUG plugin-pay: cmd -c:pay#64/cln:pay#121 partid 0: Using routehint 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 (103x1x0) cltv_delta=6
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…startup. This can happen if gossipd hasn't processed the blocks yet: ``` lightningd-2 2026-01-07T06:05:19.430Z **BROKEN** 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#3: gossipd gave channel_update in CGOSSIP_CHANNEL_ANNOUNCED_DEAD? update=010240d5d1b653118c047218802d8c5d6bda49124fc9e1cb30ceff72e24c44e6a20d0b6b6fbe5465def31a01c8ff49dc171542a64a1a69d5149698f31e1ba4e721c106226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f00006f0000010000695df63a010200060000000000000000000000010000000a000000003b023380 ``` It does catch up later, so ignore this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Anchors will have one input from the commitment tx, and at least on
more (in this case, 3 more); we were only checking the first one for
short signatures.
```
total_feerate_perkw = total_fees / total_weight * 1000
> check_feerate([l3, l2], total_feerate_perkw, feerate)
tests/test_closing.py:4064:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
nodes = [<fixtures.LightningNode object at 0x7f3e9a2c74f0>, <fixtures.LightningNode object at 0x7f3e991d5f30>]
actual_feerate = 14006.105538595726, expected_feerate = 14000
def check_feerate(nodes, actual_feerate, expected_feerate):
# Feerate can't be lower.
assert actual_feerate > expected_feerate - 2
if actual_feerate >= expected_feerate + 2:
if any([did_short_sig(n) for n in nodes]):
return
# Use assert as it shows the actual values on failure
> assert actual_feerate < expected_feerate + 2
E AssertionError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
56f126c to
f06ef77
Compare
We need to make sure anchor reaches bitcoind, otherwise it might mine
the commitment tx without it. This can happen in
test_coinmoves_unilateral_htlc_fulfill as well.
```
> check_chain_moves(l1, expected_chain1)
tests/test_coinmoves.py:844:
...
E Full diff:
E [
E {
E 'account_id': 'wallet',
E 'blockheight': 102,
E 'created_index': 1,
E 'credit_msat': 100000000000,
E 'debit_msat': 0,
E 'extra_tags': [],
E 'output_msat': 100000000000,
E 'primary_tag': 'deposit',
E 'utxo': 'fca99b85e58f8ae23e5c6872e0500784997deb98bfc92e43449206553a108db2:1',
E },
E {
E 'account_id': 'wallet',
E 'blockheight': 103,
E 'created_index': 2,
E 'credit_msat': 0,
E 'debit_msat': 100000000000,
E 'extra_tags': [],
E 'output_msat': 100000000000,
E 'primary_tag': 'withdrawal',
E 'spending_txid': 'c097ad8bde478396c961369b69c50a144fae3423f36af4554f3fb1dacfdff83f',
E 'utxo': 'fca99b85e58f8ae23e5c6872e0500784997deb98bfc92e43449206553a108db2:1',
E },
E {
E 'account_id': 'wallet',
E 'blockheight': 103,
E 'created_index': 3,
E 'credit_msat': 25000000,
E 'debit_msat': 0,
E 'extra_tags': [],
E 'output_msat': 25000000,
E 'primary_tag': 'deposit',
E 'utxo': 'c097ad8bde478396c961369b69c50a144fae3423f36af4554f3fb1dacfdff83f:1',
E },
E {
E 'account_id': '3ff8dfcfdab13f4f55f46af32334ae4f140ac5699b3661c9968347de8bad97c0',
E 'blockheight': 103,
E 'created_index': 4,
E 'credit_msat': 99970073000,
E 'debit_msat': 0,
E 'extra_tags': [
E 'opener',
E ],
E 'output_msat': 99970073000,
E 'peer_id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59',
E 'primary_tag': 'channel_open',
E 'utxo': 'c097ad8bde478396c961369b69c50a144fae3423f36af4554f3fb1dacfdff83f:0',
E },
E {
E - 'account_id': 'wallet',
E + 'account_id': '3ff8dfcfdab13f4f55f46af32334ae4f140ac5699b3661c9968347de8bad97c0',
E 'blockheight': 104,
E 'created_index': 5,
E - 'credit_msat': 0,
E - 'debit_msat': 25000000,
E - 'extra_tags': [],
E - 'output_msat': 25000000,
E - 'primary_tag': 'withdrawal',
E - 'spending_txid': '1b6fbf9887d6f9cce727fc8bf9f582a3353be682998d4bafc9691c9ed26897e7',
E - 'utxo': 'c097ad8bde478396c961369b69c50a144fae3423f36af4554f3fb1dacfdff83f:1',
E - },
E - {
E - 'account_id': 'wallet',
E - 'blockheight': 104,
E - 'created_index': 6,
E - 'credit_msat': Decimal('15579000.00000000'),
E - 'debit_msat': 0,
E - 'extra_tags': [],
E - 'output_msat': Decimal('15579000.00000000'),
E - 'primary_tag': 'deposit',
E - 'utxo': '1b6fbf9887d6f9cce727fc8bf9f582a3353be682998d4bafc9691c9ed26897e7:0',
E - },
E - {
E - 'account_id': '3ff8dfcfdab13f4f55f46af32334ae4f140ac5699b3661c9968347de8bad97c0',
E - 'blockheight': 104,
E - 'created_index': 7,
E 'credit_msat': 0,
E 'debit_msat': 49970073000,
E 'extra_tags': [],
E 'output_count': 5,
E 'output_msat': 99970073000,
E 'primary_tag': 'channel_close',
E 'spending_txid': 'a499419bfdce179727cffca45429151db47839b247d83f71837429f021ae6322',
E 'utxo': 'c097ad8bde478396c961369b69c50a144fae3423f36af4554f3fb1dacfdff83f:0',
E },
E {
E 'account_id': 'external',
E 'blockheight': 104,
E - 'created_index': 8,
E ? ^
E + 'created_index': 6,
E ? ^
E 'credit_msat': 330000,
E 'debit_msat': 0,
E 'extra_tags': [],
E 'originating_account': '3ff8dfcfdab13f4f55f46af32334ae4f140ac5699b3661c9968347de8bad97c0',
E 'output_msat': 330000,
E 'primary_tag': 'anchor',
E 'utxo': 'a499419bfdce179727cffca45429151db47839b247d83f71837429f021ae6322:0',
E },
E {
E 'account_id': 'external',
E 'blockheight': 104,
E - 'created_index': 9,
E ? ^
E + 'created_index': 7,
E ? ^
E 'credit_msat': 330000,
E 'debit_msat': 0,
E 'extra_tags': [],
E 'originating_account': '3ff8dfcfdab13f4f55f46af32334ae4f140ac5699b3661c9968347de8bad97c0',
E 'output_msat': 330000,
E 'primary_tag': 'anchor',
E 'utxo': 'a499419bfdce179727cffca45429151db47839b247d83f71837429f021ae6322:1',
E },
E {
E 'account_id': 'external',
E 'blockheight': 104,
E - 'created_index': 10,
E ? ^^
E + 'created_index': 8,
E ? ^
E 'credit_msat': 50000000000,
E 'debit_msat': 0,
E 'extra_tags': [],
E 'originating_account': '3ff8dfcfdab13f4f55f46af32334ae4f140ac5699b3661c9968347de8bad97c0',
E 'output_msat': 50000000000,
E 'primary_tag': 'to_them',
E 'utxo': 'a499419bfdce179727cffca45429151db47839b247d83f71837429f021ae6322:4',
E },
E ]
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
… either order.
It's done as HTLCs finalize, but we can close the incoming HTLC as soon as we get the
preimage, so that entire thing could finish before the outgoing HTLC.
```
> check_channel_moves(l1, expected_channel1)
tests/test_coinmoves.py:307:
...
E Full diff:
E [
E {
E 'account_id': '58d371ab100e0ea847a11c9550add273ef8531bc12bb51b0e30c8f833506a772',
E 'created_index': 1,
E 'credit_msat': 0,
E 'debit_msat': 1000000,
E 'fees_msat': 0,
E 'group_id': 1318196858430961660,
E 'part_id': 1,
E 'payment_hash': '8da829ab29715106a4e767facc0b58776ae5bfc11c4e9dcda3063013e1ef8ef0',
E 'primary_tag': 'invoice',
E },
E {
E 'account_id': '0b872506f67b363803cd85cf9ff6807ebc1dc8a4521aa191386b4c5366d490d7',
E 'created_index': 2,
E 'credit_msat': 100000,
E 'debit_msat': 0,
E 'fees_msat': 0,
E 'primary_tag': 'pushed',
E },
E {
E + 'account_id': '0b872506f67b363803cd85cf9ff6807ebc1dc8a4521aa191386b4c5366d490d7',
E + 'created_index': 3,
E + 'credit_msat': 10000100001,
E + 'debit_msat': 0,
E + 'fees_msat': 100001,
E + 'payment_hash': '0ebfa5387de5fd12c15089833b0193fb6007e9f494ec24d479e327a96ac8e8c0',
E + 'primary_tag': 'routed',
E + },
E + {
E 'account_id': '58d371ab100e0ea847a11c9550add273ef8531bc12bb51b0e30c8f833506a772',
E - 'created_index': 3,
E ? ^
E + 'created_index': 4,
E ? ^
E 'credit_msat': 0,
E 'debit_msat': 10000000000,
E 'fees_msat': 100001,
E 'payment_hash': '0ebfa5387de5fd12c15089833b0193fb6007e9f494ec24d479e327a96ac8e8c0',
E 'primary_tag': 'routed',
E },
E - {
E - 'account_id': '0b872506f67b363803cd85cf9ff6807ebc1dc8a4521aa191386b4c5366d490d7',
E - 'created_index': 4,
E - 'credit_msat': 10000100001,
E - 'debit_msat': 0,
E - 'fees_msat': 100001,
E - 'payment_hash': '0ebfa5387de5fd12c15089833b0193fb6007e9f494ec24d479e327a96ac8e8c0',
E - 'primary_tag': 'routed',
E - },
E ]
```
If we don't wait for the channel announcement to be processed, we can get bad gossip: ``` lightningd-2 2026-01-08T04:53:53.795Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#2: Funding tx 2f41b1cc99dea016b7feddbeb1f31ae21b30f56d77ecb2ecb2b2f0faff4808fe depth 12 of 1 lightningd-2 2026-01-08T04:53:53.795Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#2: Funding tx 2f41b1cc99dea016b7feddbeb1f31ae21b30f56d77ecb2ecb2b2f0faff4808fe confirmed, but peer in state ONCHAIN lightningd-2 2026-01-08T04:53:53.802Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-onchaind-chan#2: Got new message WIRE_ONCHAIND_DEPTH lightningd-2 2026-01-08T04:53:53.802Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-onchaind-chan#2: Sending 0 missing htlc messages lightningd-2 2026-01-08T04:53:53.802Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-onchaind-chan#2: FUNDING_TRANSACTION/FUNDING_OUTPUT->MUTUAL_CLOSE depth 6 lightningd-2 2026-01-08T04:53:53.802Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-onchaind-chan#2: billboard: All outputs resolved: waiting 94 more blocks before forgetting channel lightningd-2 2026-01-08T04:53:53.812Z DEBUG gossipd: gossmap_manage: new block, adding 104x1x1 to pending... lightningd-2 2026-01-08T04:53:53.812Z DEBUG gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds lightningd-1 2026-01-08T04:53:53.819Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: peer_in WIRE_WARNING lightningd-1 2026-01-08T04:53:53.820Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: Received WIRE_WARNING: WARNING: channel_announcement: no unspent txout 104x1x1 lightningd-2 2026-01-08T04:53:53.820Z TRACE gossipd: channel_announcement: got reply for 104x1x1... lightningd-2 2026-01-08T04:53:53.820Z TRACE 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-gossipd: Bad gossip order: channel_announcement: no unspent txout 104x1x1 lightningd-2 2026-01-08T04:53:53.820Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: peer_out WIRE_WARNING ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is based on #8767 (which was causing one of the "flakes"!).
This is a draft because the final commit disables all flakes so I can catch them (I'm also looping it on my laptop), so I can root cause the others. We may still mark some flaky if they are genuinely unreliable, but we need to make that determination!
Changelog-None