Skip to content

Allow metadce to track arrow functions#27053

Open
valadaptive wants to merge 2 commits into
emscripten-core:mainfrom
valadaptive:fix-metadce
Open

Allow metadce to track arrow functions#27053
valadaptive wants to merge 2 commits into
emscripten-core:mainfrom
valadaptive:fix-metadce

Conversation

@valadaptive
Copy link
Copy Markdown
Contributor

@valadaptive valadaptive commented Jun 4, 2026

Resolves #26038. Probably also resolves #22534.

Some libc functions are implemented in a circular way: the C-side helper functions call JS helpers, which may call those same C helpers. As I stated in #26838:

The setitimer libc function calls _emscripten_timeout, which calls the JS-side _setitimer_js. This adds a dependency on the _setitimer_js function from libc.

Then on the JS side, _setitimer_js calls back into the WASM-side _emscripten_timeout, adding a dependency on it.

Emscripten is supposed to have a "metadce" pass (emitDCEGraph in acorn-optimizer.mjs), whose purpose is to track and clean up precisely those circular dependencies.

However, metadce has been silently broken since #19539 back in 2023. While it was supposedly NFC, #19539 changes JS library functions to use arrow functions to "save code size", and the metadce pass does not track arrow functions. As a result, the pass essentially does nothing now.

This PR adds support to the metadce pass for tracking top-level arrow-function variable declarations. This allows it to properly track (and eliminate) circular JS<->wasm dependencies.

A disclaimer: the code was generated by Claude. I did review the code (and write this PR description) myself, and the added test case should ensure that various edge cases (e.g. reassignments) are handled properly. I checked CONTRIBUTING.md, and there was no explicit guidance on AI-generated submissions. This is a fairly localized change (~a dozen lines of actual new code), so the review burden should not be too high.

Since the metadce pass has (as far as I can tell) been a no-op since 2023, there's a risk that this may shake some other bugs loose (e.g. if any code has been added since then which implicitly relies on metadce not doing anything).

A couple other things I want some guidance on:

  • The defuns array is currently expected to hold Acorn FunctionDeclaration nodes. The new arrow-function branch creates a plain object shaped like a FunctionDeclaration, but it may be cleaner to just change what defuns holds. That would be a larger change, however.
  • I've added an emitDCEGraph-vardefs.js test, but have not specifically added any integration tests / regression tests for ensuring functions like __setitimer_js get removed when LTO is enabled. Where would those go? (EDIT: looks like test_codesize_cxx_lto.json tracks this already.)

This is an automatic change generated by tools/maint/rebaseline_tests.py.

The following (1) test expectation files were updated by
running the tests with `--rebaseline`:

```
codesize/test_codesize_cxx_lto.json: 120848 => 120337 [-511 bytes / -0.42%]

Average change: -0.42% (-0.42% - -0.42%)
```
@kripken
Copy link
Copy Markdown
Member

kripken commented Jun 4, 2026

However, metadce has been silently broken since #19539 back in 2023.

Hmm, the code size updates in that PR show a general improvement - if we had broken metadce entirely, there should be a very large regression, and there isn't. So there might be an issue here, but I think the diagnosis can't be that metadce was entirely broken?

For comparison, I tested on main right now, and disabling metadce manually leads to a 2% code size regression on hello world, and 26% on hello_libcxx. So metadce is definitely working in general - though, again, maybe you found some part of it that broke.

I don't see code size updates in this PR - tests look broken - but perhaps after you fix those we can see what changes?

@kripken
Copy link
Copy Markdown
Member

kripken commented Jun 4, 2026

Oh, there is one code size update here, which looks positive, confirming that there is something to fix here, nice.

@valadaptive
Copy link
Copy Markdown
Contributor Author

The build-linux CI run seems to have timed out while fetching the Lua source. Not sure what's going on with test_pthread_print_override_modularize in test-deno; I'll have to dig into that.

@kripken
Copy link
Copy Markdown
Member

kripken commented Jun 4, 2026

I restarted build-linux, looks like it was a flake.

@sbc100
Copy link
Copy Markdown
Collaborator

sbc100 commented Jun 4, 2026

test_pthread_print_override_modularize is just a flaky today. Sorry about that.

Copy link
Copy Markdown
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats for looking into this. The analysis looks correct to me. I guess we simply never updated metadce to handler arrow functions. Oops.

Comment thread tools/acorn-optimizer.mjs
// References through the name are attributed to this function.
// That is precise only if the name is never reassigned (generated
// code never reassigns these); if it were, the assignment would
// root this node (the assignment target is an identifier use), so
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the reassignment do that?

Comment thread tools/acorn-optimizer.mjs
// earlier same-named function) do not track this one: re-mapping
// the name would misattribute references to the wrong node, which
// could lead to unsound removals. Left untracked, the contents
// are treated as top-level code, which is conservative.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but if we ignore it here, we keep the old tracking, correct? So it is not left untracked. Why is that sound?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

__setitimer_js is always included in the glue code Code size regressions with WASI, environ, libc/filesystem

3 participants