Implement Metadata for SMAC enabling Multi-Fidelity #771
jsfreischuetz wants to merge 28 commits into microsoft:main
Conversation
self.trial_info_map: Dict[ConfigSpace.Configuration, TrialInfo] = {}
self.trial_info_df: pd.DataFrame = pd.DataFrame(
    columns=["Configuration", "Metadata", "TrialInfo", "TrialValue"]
)
Why replace this with a dataframe?
Because a configuration is no longer a unique identifier if we support metadata.
An alternative would be to use the type:
Dict[Tuple[ConfigSpace.Configuration, pd.DataFrame], Tuple[TrialInfo, TrialValue]]
but I thought using a dataframe was simpler and more flexible.
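For illustration, a minimal sketch of the DataFrame-based bookkeeping idea (find_trial_rows is a hypothetical helper, not code from this PR): each row pairs a configuration with its metadata DataFrame, and lookups compare metadata by value instead of relying on hashing.

```python
import pandas as pd

# Hypothetical stand-in for self.trial_info_df as introduced in the diff above.
trial_info_df = pd.DataFrame(columns=["Configuration", "Metadata", "TrialInfo", "TrialValue"])

def find_trial_rows(config, metadata: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose Configuration equals `config` and whose Metadata compares equal by value."""
    config_matches = trial_info_df["Configuration"] == config
    metadata_matches = pd.Series(
        [df_meta.equals(metadata) for df_meta in trial_info_df["Metadata"]],
        index=trial_info_df.index,
        dtype=bool,
    )
    return trial_info_df[config_matches & metadata_matches]
```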
Not certain you can actually use that Tuple as a Dict key (may not be hashable).
Tuples are hashable, but DataFrames are not.
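For example (a quick REPL-style check, not project code):

```python
import pandas as pd

hash(("a", 1))  # works: a tuple of hashable elements is hashable
try:
    hash((("a", 1), pd.DataFrame()))  # hashing the tuple hashes its elements...
except TypeError as ex:
    print(ex)  # ...and pandas DataFrames are mutable, hence unhashable
```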
    **filter_kwargs(facade, **kwargs),
)

self.lock = threading.Lock()
Please add some comments regarding the need for the threading lock.
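As a purely hypothetical sketch of the kind of comment being asked for (the actual rationale is for the PR author to state; the reason given below is an assumption):

```python
import threading

class _OptimizerSketch:
    """Hypothetical stand-in for SmacOptimizer, only to illustrate documenting the lock."""

    def __init__(self) -> None:
        # ASSUMPTION: suggest() and register() may be called concurrently and both
        # mutate shared trial bookkeeping, so a lock serializes those updates.
        self.lock = threading.Lock()

    def register(self, *args, **kwargs) -> None:
        with self.lock:
            # ... update shared trial state while holding the lock ...
            pass
```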
    [df_ctx.equals(ctx) for df_ctx in self.trial_info_df["Metadata"]]
)

# make a new entry
Comments could use some improvement.
Make a new entry of what? What's the first part doing that this one now needs explanation?
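One way to address this, sketched with hypothetical helper and variable names (not the PR's code), is to comment both branches: what the existing-row check is for and what the "new entry" is an entry of.

```python
import numpy as np
import pandas as pd

def record_trial(trial_info_df: pd.DataFrame, config, metadata: pd.DataFrame,
                 trial_info, trial_value) -> pd.DataFrame:
    """Hypothetical helper showing the kind of comments requested above."""
    # Check whether a trial with this exact metadata has already been recorded;
    # metadata DataFrames are compared by value because they cannot be hashed.
    already_seen = np.array(
        [df_meta.equals(metadata) for df_meta in trial_info_df["Metadata"]],
        dtype=bool,
    )
    if already_seen.any():
        # Update the existing (Configuration, Metadata) entry with the trial results.
        idx = trial_info_df.index[already_seen][0]
        trial_info_df.loc[idx, ["TrialInfo", "TrialValue"]] = [trial_info, trial_value]
        return trial_info_df
    # Make a new entry in the trial bookkeeping table for this
    # (Configuration, Metadata) pair and its trial results.
    new_row = pd.DataFrame(
        [[config, metadata, trial_info, trial_value]],
        columns=["Configuration", "Metadata", "TrialInfo", "TrialValue"],
    )
    return pd.concat([trial_info_df, new_row], ignore_index=True)
```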
    for _, _, _, metadata in observations]).reset_index(drop=True)
return (configs, scores, contexts if len(contexts.columns) > 0 else None, metadatas if len(metadatas.columns) > 0 else None)

def get_observations(self) -> Tuple[pd.DataFrame, pd.DataFrame, Optional[pd.DataFrame], Optional[pd.DataFrame]]:
These Tuples are getting a little large and hard to read (recall a previous version of this PR where the order of them was mistakenly swapped at one point).
I think we discussed creating a NamedTuple or small DataClass for them instead, so that they can be accessed by name in order to make it more readable.
If you want, I can do this in this PR or in a follow-up PR.
I think a predecessor PR would be better. Much like we did with adding the metadata args and named args first.
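For reference, a sketch of what that predecessor change might look like (the class and field names here are placeholders, not an agreed-upon API):

```python
from typing import NamedTuple, Optional

import pandas as pd

class Observations(NamedTuple):
    """Hypothetical named container for the 4-tuple returned by get_observations()."""
    configs: pd.DataFrame
    scores: pd.DataFrame
    contexts: Optional[pd.DataFrame]
    metadata: Optional[pd.DataFrame]

# Usage: fields are accessed by name, so the order can no longer be silently swapped.
# obs = Observations(configs=configs_df, scores=scores_df, contexts=None, metadata=None)
# best_score = obs.scores.min()
```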
@pytest.mark.parametrize(('optimizer_type', 'verify', 'kwargs'), [
    # Enumerate all supported Optimizers
    *[(member, verify, {"seed": SEED, "facade": MFFacade, "intensifier": SuccessiveHalving, "min_budget": 1, "max_budget": 9})
      for member, verify in [(OptimizerType.SMAC, smac_verify_best)]],
This is a pretty weird pattern. If we wanted to add more types in the future, those kwargs wouldn't work.
What we've done elsewhere is:
- Retain the loop over OptimizerType in order to make sure that all optimizers get a test added (or explicitly excluded) whenever we add new optimizer backends.
- When necessary, within the body of the test, set additional parameters in a switch statement for a given optimizer type, e.g.,

    if optimizer_type == OptimizerType.SMAC:
        verifier = smac_verify_best
    elif optimizer_type == OptimizerType.FLAML:
        pytest.skip("TODO: FLAML Optimizer does not yet support metadata")  # though even here, I think we should test *something*
    else:
        raise NotImplementedError(f"Missing test handler for OptimizerType {optimizer_type}")
I have moved the kwargs into the tuple, which fixes the kwargs problem.
I find switching over unsupported optimizers messy since it means, in this case, we should skip the entire test (as you have above) unless we are testing SMAC. There are other places where we test to make sure that using metadata throws exceptions, so without metadata support there is nothing to test.
But when we add that functionality to another optimizer it will be very easy to forget to enable related testing support, so I'd rather add that now as it will be easier to search for what we need to enable.
Won't it be the same either way? One requires adding a line, the other requires moving a line. I made this more similar to other tests, but I don't think it changes much either way.
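Combining the two suggestions, a sketch of what the test parametrization could look like (the test name, SEED value, smac_verify_best stub, and the exact import paths are assumptions based on the snippets above):

```python
import pytest
from mlos_core.optimizers import OptimizerType                 # assumed import path
from smac import MultiFidelityFacade as MFFacade               # assumed, per the parametrize above
from smac.intensifier.successive_halving import SuccessiveHalving

SEED = 42  # placeholder seed


def smac_verify_best(*args, **kwargs) -> None:
    """Stand-in for the real SMAC verifier defined in the test module."""


@pytest.mark.parametrize('optimizer_type', list(OptimizerType))  # keep the loop over all backends
def test_optimizer_metadata_multi_fidelity(optimizer_type: OptimizerType) -> None:
    if optimizer_type == OptimizerType.SMAC:
        verify = smac_verify_best
        kwargs = {"seed": SEED, "facade": MFFacade, "intensifier": SuccessiveHalving,
                  "min_budget": 1, "max_budget": 9}
    elif optimizer_type == OptimizerType.FLAML:
        pytest.skip("TODO: FLAML Optimizer does not yet support metadata")
    else:
        raise NotImplementedError(f"Missing test handler for OptimizerType {optimizer_type}")
    # ... construct the optimizer with **kwargs and run the multi-fidelity checks via verify ...
```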
Enables changing the defaults in the SMAC optimizer to support state passing (what SMAC internally calls context, but that doesn't match MLOS's notion of context) and multi-fidelity tuning.
This required a few additional changes to metadata handling throughout the project.
Additionally, this PR includes a test case for multi-fidelity tuning.
This PR replaces the previous one (#751).