[Depends on #3789] Implementing multistart version of theta_est using multiple sampling methods#3575
[Depends on #3789] Implementing multistart version of theta_est using multiple sampling methods#3575sscini wants to merge 153 commits intoPyomo:mainfrom
Conversation
|
@djlaky @adowling2 Please provide early feedback. |
|
Dynamic saving using flush, add. |
adowling2
left a comment
There was a problem hiding this comment.
Notes from our in-person discussion/informal code review
pyomo/contrib/parmest/parmest.py
Outdated
| # # If only one restart, return an empty list | ||
| # return [] | ||
|
|
||
| # return {theta_names[i]: initial_theta[i] for i in range(len(theta_names))} |
There was a problem hiding this comment.
We discussed adding a "dataframe" sampling method that uses multistart points defined by the user. This is helpful if we want to try the same set of multistart points for multiple experiments.
pyomo/contrib/parmest/parmest.py
Outdated
| "Multistart is not supported in the deprecated parmest interface") | ||
| ) | ||
|
|
||
| assert isinstance(n_restarts, int) |
There was a problem hiding this comment.
Please look at other Pyomo code fgor exampels of throwing exceptions
There was a problem hiding this comment.
Agree with @adowling2 here, you need to throw an exception so you can test the exception is caught.
pyomo/contrib/parmest/parmest.py
Outdated
| ) | ||
|
|
||
|
|
||
| results = [] |
There was a problem hiding this comment.
It might make more sense to create a dataframe and then add rows as you go. Or you could preallocate the dataframe size because you know how many restarts.
There was a problem hiding this comment.
You could even have your generate_samples function generate this empty dataframe.
|
Extend existing tests for parmest to include multistart, add. |
|
Models provided need to include bounds, add exception |
adowling2
left a comment
There was a problem hiding this comment.
Here are some more comments for you to consider are you continue to refine this.
pyomo/contrib/parmest/parmest.py
Outdated
| upper_bound = np.array([parmest_model.find_component(name).ub for name in theta_names]) | ||
| # Check if the lower and upper bounds are defined | ||
| if np.any(np.isnan(lower_bound)) or np.any(np.isnan(upper_bound)): | ||
| raise ValueError( |
There was a problem hiding this comment.
You probably already know this, but you will need to check all the errors are raised when expected.
pyomo/contrib/parmest/parmest.py
Outdated
| elif self.method == "latin_hypercube": | ||
| # Generate theta values using Latin hypercube sampling | ||
| sampler = scipy.stats.qmc.LatinHypercube(d=len(theta_names), seed=seed) | ||
| samples = sampler.random(n=self.n_restarts+1)[1:] # Skip the first sample |
There was a problem hiding this comment.
Why are you skipping the first sample? Please explain in the comments.
There was a problem hiding this comment.
I will add a comment in code to explain as well. The first sample generated using qmc.sobol is always the origin (zero vector). I thought logic applied to all qmc methods, but no only sobol. So to get nonzero points, you need to skip first sample
pyomo/contrib/parmest/parmest.py
Outdated
| "Multistart is not supported in the deprecated parmest interface" | ||
| ) | ||
|
|
||
| assert isinstance(self.n_restarts, int) |
There was a problem hiding this comment.
Replace all of these with more descriptive error messages. Remember that we need tests for each error message.
pyomo/contrib/parmest/parmest.py
Outdated
| # optimization. It will take the theta names and the initial theta values | ||
| # and return a dictionary of theta names and their corresponding values. | ||
| def _generate_initial_theta(self, parmest_model, seed=None, n_restarts=None, multistart_sampling_method=None, user_provided=None): | ||
| if n_restarts == 1: |
There was a problem hiding this comment.
I like just sending a warning, and not returning. For example, n_restarts might be 1 by default. You should check if n_restarts is an int as well. Then, if n_restarts is 1, you should send a warning that the tool is intended for this number to be greater than one and solve as normal.
There was a problem hiding this comment.
Alex had initially said just return message, but I will ask him again
Did not work for multi-index like in PDEs. This addresses that.
|
Notes for 02/24/26 development meeting:
|
Fixes # .
Summary/Motivation:
Currently, the optimization is only done from a single initial value. This implementation adds the ability to specify multiple initial values using selected sampling techniques: from a random uniform distribution, using Latin Hypercube Sampling, or using Sobol Quasi-Monte Carlo sampling.
Changes proposed in this PR:
TODO before converting from draft:
Legal Acknowledgement
By contributing to this software project, I have read the contribution guide and agree to the following terms and conditions for my contribution: