wip split generationcriterion and transitioncriterion#4854
Open
mgarrard wants to merge 4 commits intofacebook:mainfrom
Open
wip split generationcriterion and transitioncriterion#4854mgarrard wants to merge 4 commits intofacebook:mainfrom
mgarrard wants to merge 4 commits intofacebook:mainfrom
Conversation
37a5978 to
f09803f
Compare
mgarrard
added a commit
to mgarrard/Ax
that referenced
this pull request
Feb 5, 2026
Summary: **TLDR:** This diff splits TransitionCriterion into (1) TransitionCriterion and (2) GenerationBlockingCriterion. I think this makes sense to do because it *greatly* increases the conceptual clarity of the transition criterion. Some ways it does this include: 1. Removal of confusing dual purpose flags — block_transition_if_unmet and block_generation_if_met flags. Now transition criteria are inferred to block transition if unmet and generation criteria are inferred to raise informative errors if the criteria is met. 2. Each criterion contains less flags, and the flags are more directly intuitive. 3. With upcoming removal of special logic for online, we will need to add more generation blocking criteria (ie do we have an opt config), it is better to make this change before adding more criteria that will need to be migrated 4. It will allows the logic for transition and generation to be smoother — this diff keeps things ~= to exisiting logic as possible to minimize diff review overhead, but in subsequent diffs we can save fit time if we know we can’t generate from this node + can’t transition. It will also allow for some further clarification on generation/transition blocking logic that i think is contributing to the confusion of the file 5. i like that creating a new generation blocking criteria with a specific error to raise is easy and painless **Cons of this change:** - it’s a large change, sorry about that. - There is some duplication between TrialBased transition criterion and generation criterion. I explored using a Mixin here, but i find mixins tend to add unnecessary inheritance structures to reason about. **Most important files for review, in order of importance** 1. transition_criterion.py 2. generation_node.py 3. decoder.py 4. encoders.py 5. registery.py 6. generation_strategy_dispatch.py 7. generation_nodes.py 8. generation_strategy.py The remaining files are mainly trivial updates to tests **Note about backwards compatibility:** * This diff will directly decode legacy MaxGenerationParallelism as a generation blocking criterion called MaxGenrationParallelism * Historically, there are some instances of mintrials that have block_gen_if_met=True, this usually comes from enforce_num_trials=True. Now we call this MaxTrialsAwaitingData, and MinTrials is decoded as that. I am open to other, better names for this new criterion. **Other notes/potential improvements:** - we could split transition criterion, generation criterion, and utils into their own files. i kinda like them together, and if we do want to do this split i’d like to do it in a follow up to try to minimize an already v large blast radius Differential Revision: D92201085
f09803f to
2827896
Compare
mgarrard
added a commit
to mgarrard/Ax
that referenced
this pull request
Feb 5, 2026
…erion (facebook#4854) Summary: **TLDR:** This diff splits TransitionCriterion into (1) TransitionCriterion and (2) GenerationBlockingCriterion. I think this makes sense to do because it *greatly* increases the conceptual clarity of the transition criterion. Some ways it does this include: 1. Removal of confusing dual purpose flags — block_transition_if_unmet and block_generation_if_met flags. Now transition criteria are inferred to block transition if unmet and generation criteria are inferred to raise informative errors if the criteria is met. 2. Each criterion contains less flags, and the flags are more directly intuitive. 3. With upcoming removal of special logic for online, we will need to add more generation blocking criteria (ie do we have an opt config), it is better to make this change before adding more criteria that will need to be migrated 4. It will allows the logic for transition and generation to be smoother — this diff keeps things ~= to exisiting logic as possible to minimize diff review overhead, but in subsequent diffs we can save fit time if we know we can’t generate from this node + can’t transition. It will also allow for some further clarification on generation/transition blocking logic that i think is contributing to the confusion of the file 5. i like that creating a new generation blocking criteria with a specific error to raise is easy and painless **Cons of this change:** - it’s a large change, sorry about that. - There is some duplication between TrialBased transition criterion and generation criterion. I explored using a Mixin here, but i find mixins tend to add unnecessary inheritance structures to reason about. **Most important files for review, in order of importance** 1. transition_criterion.py 2. generation_node.py 3. decoder.py 4. encoders.py 5. registery.py 6. generation_strategy_dispatch.py 7. generation_nodes.py 8. generation_strategy.py The remaining files are mainly trivial updates to tests **Note about backwards compatibility:** * This diff will directly decode legacy MaxGenerationParallelism as a generation blocking criterion called MaxGenrationParallelism * Historically, there are some instances of mintrials that have block_gen_if_met=True, this usually comes from enforce_num_trials=True. Now we call this MaxTrialsAwaitingData, and MinTrials is decoded as that. I am open to other, better names for this new criterion. **Other notes/potential improvements:** - we could split transition criterion, generation criterion, and utils into their own files. i kinda like them together, and if we do want to do this split i’d like to do it in a follow up to try to minimize an already v large blast radius Differential Revision: D92201085
Summary:
This criteria updates the completion state logic to assume if a node can transition, and that transition is to itself, then the optimization is complete.
This works because should_transition_to_next_node only considers transtion blocking criteria (ie not max parallelism) when thinking about should transition or not. And if a node points to itself, we can assume that signifies the end of the optimiztion (steps are initialized this way earlier in this stack). this allows allows for the gs to be re-called into, and the tc criterion to change thus putting it back into a non-complete state.
An alternative I considered is to check if all transition edges are completed, and at least one points to self. This would look something like the below snippet. It would be much more expensive to evaluate, and is guarding against a malformed strategy. Edges are already known to be created in order of importance, and self transition edges should be considered ending edges when their importance is considered
```
property
def optimization_complete(self) -> bool:
if len(self._curr.transition_criteria) == 0:
return False
# Check ALL transition edges, not just the first matching one
for next_node, all_tc in self._curr.transition_edges.items():
transition_blocking = [tc for tc in all_tc if tc.block_transition_if_unmet]
if not transition_blocking:
continue
all_met = all(
tc.is_met(experiment=self.experiment, curr_node=self._curr)
for tc in transition_blocking
)
if all_met:
# An edge's criteria are met - check where it points
if next_node != self._curr.name:
return False # Can transition to different node, not complete
# All met edges (if any) point to self
# Check if we actually have any met criteria pointing to self
can_transition, next_node = self._curr.should_transition_to_next_node(
raise_data_required_error=False
)
return can_transition and next_node == self._curr.name
```
The thrid alternative is to instate "compeletion node", which i think could be viable in the future if we have more complex generation strategies than we currently support, and the self generation logic is too cumbersome.
For now though, I think this is a pretty nice simplification that also should have some compute wins. Going from O (number of nodes * number of TC per node), to O(number of tc on current node)
Differential Revision: D91549954
Summary: Since transition_to is now required on transition criterion, we can remove checks/asserts related to none checks. this is a basic no-op simplification. Futher restructuring seperated into a different diff for ease of review Reviewed By: bletham Differential Revision: D91398877
Summary: This method is called many, many times during generation and it's computational cost adds up over time. By cacheing it we can significant improvements in computation time, especially in high trial count regimes. Reviewed By: mpolson64 Differential Revision: D91552553
…erion (facebook#4854) Summary: **TLDR:** This diff splits TransitionCriterion into (1) TransitionCriterion and (2) GenerationBlockingCriterion. I think this makes sense to do because it *greatly* increases the conceptual clarity of the transition criterion. Some ways it does this include: 1. Removal of confusing dual purpose flags — block_transition_if_unmet and block_generation_if_met flags. Now transition criteria are inferred to block transition if unmet and generation criteria are inferred to raise informative errors if the criteria is met. 2. Each criterion contains less flags, and the flags are more directly intuitive. 3. With upcoming removal of special logic for online, we will need to add more generation blocking criteria (ie do we have an opt config), it is better to make this change before adding more criteria that will need to be migrated 4. It will allows the logic for transition and generation to be smoother — this diff keeps things ~= to exisiting logic as possible to minimize diff review overhead, but in subsequent diffs we can save fit time if we know we can’t generate from this node + can’t transition. It will also allow for some further clarification on generation/transition blocking logic that i think is contributing to the confusion of the file 5. i like that creating a new generation blocking criteria with a specific error to raise is easy and painless **Cons of this change:** - it’s a large change, sorry about that. - There is some duplication between TrialBased transition criterion and generation criterion. I explored using a Mixin here, but i find mixins tend to add unnecessary inheritance structures to reason about. **Most important files for review, in order of importance** 1. transition_criterion.py 2. generation_node.py 3. decoder.py 4. encoders.py 5. registery.py 6. generation_strategy_dispatch.py 7. generation_nodes.py 8. generation_strategy.py The remaining files are mainly trivial updates to tests **Note about backwards compatibility:** * This diff will directly decode legacy MaxGenerationParallelism as a generation blocking criterion called MaxGenrationParallelism * Historically, there are some instances of mintrials that have block_gen_if_met=True, this usually comes from enforce_num_trials=True. Now we call this MaxTrialsAwaitingData, and MinTrials is decoded as that. I am open to other, better names for this new criterion. **Other notes/potential improvements:** - we could split transition criterion, generation criterion, and utils into their own files. i kinda like them together, and if we do want to do this split i’d like to do it in a follow up to try to minimize an already v large blast radius Differential Revision: D92201085
2827896 to
211421b
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4854 +/- ##
========================================
Coverage 96.76% 96.76%
========================================
Files 589 589
Lines 61832 61980 +148
========================================
+ Hits 59831 59977 +146
- Misses 2001 2003 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary: wip
Differential Revision: D92201085