Skip to content

[Depends on #3789] Implementing multistart version of theta_est using multiple sampling methods#3575

Draft
sscini wants to merge 153 commits intoPyomo:mainfrom
sscini:multistart-in-parmest
Draft

[Depends on #3789] Implementing multistart version of theta_est using multiple sampling methods#3575
sscini wants to merge 153 commits intoPyomo:mainfrom
sscini:multistart-in-parmest

Conversation

@sscini
Copy link
Contributor

@sscini sscini commented Apr 23, 2025

Fixes # .

Summary/Motivation:

Currently, the optimization is only done from a single initial value. This implementation adds the ability to specify multiple initial values using selected sampling techniques: from a random uniform distribution, using Latin Hypercube Sampling, or using Sobol Quasi-Monte Carlo sampling.

Changes proposed in this PR:

  • All changes made adding pseudocode in comments
  • Added inputs needed for multistart simulation
  • Added a function to generate points using the selected method
  • Added theta_est_multistart to work for the multistart process

TODO before converting from draft:

  • Receive feedback from collaborators on logical setup
  • Convert finalized pseudocode
  • Test and debug
  • Confirm function with examples

Legal Acknowledgement

By contributing to this software project, I have read the contribution guide and agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the BSD license.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@sscini
Copy link
Contributor Author

sscini commented Apr 23, 2025

@djlaky @adowling2 Please provide early feedback.

@sscini
Copy link
Contributor Author

sscini commented Apr 30, 2025

Dynamic saving using flush, add.

Copy link
Member

@adowling2 adowling2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes from our in-person discussion/informal code review

# # If only one restart, return an empty list
# return []

# return {theta_names[i]: initial_theta[i] for i in range(len(theta_names))}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed adding a "dataframe" sampling method that uses multistart points defined by the user. This is helpful if we want to try the same set of multistart points for multiple experiments.

"Multistart is not supported in the deprecated parmest interface")
)

assert isinstance(n_restarts, int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check that this is > 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look at other Pyomo code fgor exampels of throwing exceptions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @adowling2 here, you need to throw an exception so you can test the exception is caught.

)


results = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make more sense to create a dataframe and then add rows as you go. Or you could preallocate the dataframe size because you know how many restarts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could even have your generate_samples function generate this empty dataframe.

@sscini
Copy link
Contributor Author

sscini commented Apr 30, 2025

Extend existing tests for parmest to include multistart, add.

@sscini
Copy link
Contributor Author

sscini commented Apr 30, 2025

Models provided need to include bounds, add exception

Copy link
Member

@adowling2 adowling2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some more comments for you to consider are you continue to refine this.

upper_bound = np.array([parmest_model.find_component(name).ub for name in theta_names])
# Check if the lower and upper bounds are defined
if np.any(np.isnan(lower_bound)) or np.any(np.isnan(upper_bound)):
raise ValueError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably already know this, but you will need to check all the errors are raised when expected.

elif self.method == "latin_hypercube":
# Generate theta values using Latin hypercube sampling
sampler = scipy.stats.qmc.LatinHypercube(d=len(theta_names), seed=seed)
samples = sampler.random(n=self.n_restarts+1)[1:] # Skip the first sample
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you skipping the first sample? Please explain in the comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a comment in code to explain as well. The first sample generated using qmc.sobol is always the origin (zero vector). I thought logic applied to all qmc methods, but no only sobol. So to get nonzero points, you need to skip first sample

"Multistart is not supported in the deprecated parmest interface"
)

assert isinstance(self.n_restarts, int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace all of these with more descriptive error messages. Remember that we need tests for each error message.

# optimization. It will take the theta names and the initial theta values
# and return a dictionary of theta names and their corresponding values.
def _generate_initial_theta(self, parmest_model, seed=None, n_restarts=None, multistart_sampling_method=None, user_provided=None):
if n_restarts == 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like just sending a warning, and not returning. For example, n_restarts might be 1 by default. You should check if n_restarts is an int as well. Then, if n_restarts is 1, you should send a warning that the tool is intended for this number to be greater than one and solve as normal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex had initially said just return message, but I will ask him again

@sscini sscini reopened this Feb 19, 2026
@sscini sscini moved this from Paused to Development in ParmEst & Pyomo.DoE Development Feb 19, 2026
@sscini
Copy link
Contributor Author

sscini commented Feb 24, 2026

Notes for 02/24/26 development meeting:
Will need to merge _Q_opt update (PR 3789) first
Variant of objective_at_theta that estimates parameters at each parameter value set.
Still need to:

  • Update implementation of sample generation using random sampling techniques
  • Define defaults for sampling technique, number of restarts, etc
  • Generate examples from literature with known solutions.
  • Testing
  • Documentation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Development

Successfully merging this pull request may close these issues.

4 participants