Allow metadce to track arrow functions#27053
Conversation
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (1) test expectation files were updated by running the tests with `--rebaseline`: ``` codesize/test_codesize_cxx_lto.json: 120848 => 120337 [-511 bytes / -0.42%] Average change: -0.42% (-0.42% - -0.42%) ```
Hmm, the code size updates in that PR show a general improvement - if we had broken metadce entirely, there should be a very large regression, and there isn't. So there might be an issue here, but I think the diagnosis can't be that metadce was entirely broken? For comparison, I tested on main right now, and disabling metadce manually leads to a 2% code size regression on hello world, and 26% on hello_libcxx. So metadce is definitely working in general - though, again, maybe you found some part of it that broke. I don't see code size updates in this PR - tests look broken - but perhaps after you fix those we can see what changes? |
|
Oh, there is one code size update here, which looks positive, confirming that there is something to fix here, nice. |
|
The |
|
I restarted |
|
test_pthread_print_override_modularize is just a flaky today. Sorry about that. |
sbc100
left a comment
There was a problem hiding this comment.
Thats for looking into this. The analysis looks correct to me. I guess we simply never updated metadce to handler arrow functions. Oops.
| // References through the name are attributed to this function. | ||
| // That is precise only if the name is never reassigned (generated | ||
| // code never reassigns these); if it were, the assignment would | ||
| // root this node (the assignment target is an identifier use), so |
There was a problem hiding this comment.
Where does the reassignment do that?
| // earlier same-named function) do not track this one: re-mapping | ||
| // the name would misattribute references to the wrong node, which | ||
| // could lead to unsound removals. Left untracked, the contents | ||
| // are treated as top-level code, which is conservative. |
There was a problem hiding this comment.
Hmm, but if we ignore it here, we keep the old tracking, correct? So it is not left untracked. Why is that sound?
Resolves #26038. Probably also resolves #22534.
Some libc functions are implemented in a circular way: the C-side helper functions call JS helpers, which may call those same C helpers. As I stated in #26838:
Emscripten is supposed to have a "metadce" pass (
emitDCEGraphin acorn-optimizer.mjs), whose purpose is to track and clean up precisely those circular dependencies.However, metadce has been silently broken since #19539 back in 2023. While it was supposedly NFC, #19539 changes JS library functions to use arrow functions to "save code size", and the metadce pass does not track arrow functions. As a result, the pass essentially does nothing now.
This PR adds support to the metadce pass for tracking top-level arrow-function variable declarations. This allows it to properly track (and eliminate) circular JS<->wasm dependencies.
A disclaimer: the code was generated by Claude. I did review the code (and write this PR description) myself, and the added test case should ensure that various edge cases (e.g. reassignments) are handled properly. I checked CONTRIBUTING.md, and there was no explicit guidance on AI-generated submissions. This is a fairly localized change (~a dozen lines of actual new code), so the review burden should not be too high.
Since the metadce pass has (as far as I can tell) been a no-op since 2023, there's a risk that this may shake some other bugs loose (e.g. if any code has been added since then which implicitly relies on metadce not doing anything).
A couple other things I want some guidance on:
defunsarray is currently expected to hold AcornFunctionDeclarationnodes. The new arrow-function branch creates a plain object shaped like aFunctionDeclaration, but it may be cleaner to just change whatdefunsholds. That would be a larger change, however.I've added an(EDIT: looks likeemitDCEGraph-vardefs.jstest, but have not specifically added any integration tests / regression tests for ensuring functions like__setitimer_jsget removed when LTO is enabled. Where would those go?test_codesize_cxx_lto.jsontracks this already.)