spv/lift: elide empty targets of an if-else/switch, as allowed by SPIR-V.#28
spv/lift: elide empty targets of an if-else/switch, as allowed by SPIR-V.#28eddyb wants to merge 2 commits intoRust-GPU:mainfrom
if-else/switch, as allowed by SPIR-V.#28Conversation
LegNeato
left a comment
There was a problem hiding this comment.
Do we have tests? This seems super testable.
src/spv/lift.rs
Outdated
| .then(|| targets[0]) | ||
| }; | ||
| if let Some(target_of_target) = target_is_trivial_branch { | ||
| // FIXME(eddyb) what does it mean for this to not be true? |
There was a problem hiding this comment.
Let's put an error or panic in here so if we hit it we are aware.
There was a problem hiding this comment.
This is definitely possible, I suspect I tunnel-vision'd the situation when writing that comment - at the end of the day, it doesn't matter for correctness, as long as the check is conservative enough (as this is an opportunistic optimization).
(The comment is gone in the latest version I pushed)
02cacc0 to
8bedb88
Compare
Sadly not really (beyond some very minimal stuff) - in the past I was looking at several testing approaches, but I wouldn't want to let that block landing any of the open PRs (before I transition away from using forks to stage PRs - AFAIK I can't change existing PRs' target branches). |
SPIR-V allows what it calls "merge edges" (edges from a structured header block, to its merge block), e.g.:
can directly use the merge block as the "else" edge of the conditional branch:
but before this PR, SPIR-T -> SPIR-V lifting was not taking advantage of that, instead emitting:
This pattern became most obvious on Rust-GPU shaders with a lot of panics from e.g. bounds-checking (which get turned into conditional "
OpReturnfrom entry point"), but not from inspecting the SPIR-V output.Instead, I ended up noticing that
spirv-opthas amerge-blockspass, which can remove the slight inefficiency from situations like the above example, but it's very slow for what it does (and it was the slowest pass inspirv-opt --time-report, for the SPIR-V module I was looking at).(if linear, that would mean around 200µs per trivial basic block, which feels absurd given how simple the check in this PR is, so I suspect it's accidentally quadratic or worse, but I haven't looked at the C++ code to confirm either)
EDIT: with this PR used in Rust-GPU, the specific testcase which prompted it (schell/renderling@d9f4d6f) has the time spent in
spirv-optshrink by ~20% (4s->3.21s), which is even better than the0.5sI was expecting.