Add alpha to args #612

Open
PepeRoConde wants to merge 1 commit into bayesian-optimization:master from PepeRoConde:master
@PepeRoConde PepeRoConde commented May 31, 2026

Algorithm 2.1 in Section 2.2 of the book Gaussian Processes for Machine Learning is the algorithm scikit-learn uses for its Gaussian process regressor. It contains a term $\sigma^2$ that models the variance of the noise in the observations. scikit-learn defaults it to 1e-10, while this library hardcodes a magic number of 1e-6. I think that is enough motivation to let the user choose this parameter (scikit-learn calls it alpha).

Chapter 6 of that book discusses regularization, and here scikit-learn has a motivating example of its usage.
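
To illustrate why the value matters, here is a minimal pure-Python sketch of the idea behind alpha: it is added to the diagonal of the kernel matrix before the Cholesky factorization in Algorithm 2.1. The helper names (`cholesky`, `rbf_kernel_matrix`, `factorizes`) and the unit-length-scale RBF kernel are assumptions made for this illustration only; they are not taken from this PR or from scikit-learn's implementation.

```python
import math


def cholesky(matrix):
    """Return the lower-triangular Cholesky factor of a square matrix.

    Raises ValueError if a pivot is not strictly positive, i.e. the
    matrix is not (numerically) positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError(f"non-positive pivot {pivot} at row {i}")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def rbf_kernel_matrix(xs, alpha):
    """RBF kernel matrix (unit length scale) with alpha added to the diagonal."""
    n = len(xs)
    return [
        [
            math.exp(-0.5 * (xs[i] - xs[j]) ** 2) + (alpha if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def factorizes(xs, alpha):
    """Return True if K + alpha * I admits a Cholesky factorization."""
    try:
        cholesky(rbf_kernel_matrix(xs, alpha))
    except ValueError:
        return False
    return True


duplicated = [0.3, 0.3]  # the same point observed twice makes K singular
print(factorizes(duplicated, alpha=0.0))
print(factorizes(duplicated, alpha=1e-6))
```

With a duplicated observation the unregularized matrix cannot be factorized, while a small positive alpha restores positive definiteness. A larger alpha additionally tells the model to trust each observation less, which is the noise interpretation of $\sigma^2$.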

Summary by CodeRabbit

  • New Features
    • Gaussian Process regularization can now be customized when initializing the optimizer. The alpha parameter is now configurable, allowing users to fine-tune model behavior while maintaining backward compatibility with the default value.

coderabbitai Bot commented May 31, 2026

Walkthrough

The PR makes Gaussian Process regularization configurable by adding an alpha parameter to BayesianOptimization.__init__. The parameter defaults to 1e-6 and is passed to GaussianProcessRegressor instead of using a hardcoded constant, enabling users to adjust regularization behavior through the optimizer's public API.

Changes

Gaussian Process Regularization Configuration

| Layer / File(s) | Summary |
| --- | --- |
| GP alpha parameter configuration<br>bayes_opt/bayesian_optimization.py | Constructor signature accepts alpha parameter with default 1e-6, documented in the docstring, and passed to GaussianProcessRegressor(alpha=alpha) during GP initialization. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A rabbit hops through optimization's field,
Where alpha now bends to the user's will,
No longer hardcoded in stone,
But configurable, tuned, and fully grown! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title is vague and uses generic terminology that doesn't clearly convey the main change. 'Add alpha to args' lacks context about what alpha represents or why it matters to users. | Consider a more descriptive title like 'Make GP regularization parameter configurable' or 'Add alpha parameter to BayesianOptimization' to better reflect that this adds a user-facing parameter for Gaussian Process noise control. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
bayes_opt/bayesian_optimization.py (1)

72-75: 💤 Low value

Consider clarifying the docstring to match scikit-learn terminology.

The docstring could be more precise about what alpha does mathematically. Scikit-learn's documentation describes alpha as "Value added to the diagonal of the kernel matrix during fitting," which is clearer than "adjust to an error gaussian distribution with high variance."

📝 Suggested docstring improvement
-    alpha: float, optional(default=1e-6)
-        Regularization parameter to prevent numerical issues or to adjust to an
-        error gaussian distribution with high variance. The higher, the higher
-        the regularization.
+    alpha: float, optional(default=1e-6)
+        Value added to the diagonal of the kernel matrix during fitting for
+        numerical stability. Can also account for noise variance in observations.
+        Larger values increase regularization.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bayes_opt/bayesian_optimization.py` around lines 72 - 75, Update the alpha
parameter docstring in bayesian_optimization.py to use scikit-learn terminology:
state that alpha (float, default=1e-6) is the value added to the diagonal of the
kernel (covariance) matrix during fitting to provide numerical
regularization/stability (i.e., prevents ill-conditioned matrices), and
optionally mention it acts as a small Gaussian noise term rather than the
current vague "adjust to an error gaussian distribution" wording; locate and
edit the alpha description in the docstring block where alpha is defined.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b3ca308-98a8-470e-94bc-100ecf649a75

📥 Commits

Reviewing files that changed from the base of the PR and between 5b8fb73 and 2e2b164.

📒 Files selected for processing (1)
  • bayes_opt/bayesian_optimization.py

acquisition_function: AcquisitionFunction | None = None,
constraint: NonlinearConstraint | None = None,
random_state: int | RandomState | None = None,
alpha: float = 1e-6,
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does scikit-learn GaussianProcessRegressor validate that the alpha parameter is positive?

💡 Result:

No clear positive-only validation is enforced for GaussianProcessRegressor(alpha=...). In the public API docs, alpha is described as a nonnegative “value added to the diagonal” to help ensure a positive definite matrix, but the docs do not explicitly state that scikit-learn validates alpha > 0 at input time [1]. In the current scikit-learn implementation file for GaussianProcessRegressor (sklearn/gaussian_process/_gpr.py), the fitting code shown performs shape/compatibility checks (e.g., alpha must be a scalar or an array with the same number of entries as the training data), but the snippet does not include any explicit check rejecting alpha <= 0 [2]. Therefore, based on the evidence found in the official documentation and the examined implementation, scikit-learn does not explicitly validate that alpha is strictly positive; instead, incorrect (e.g., nonpositive) values may lead to numerical issues later (e.g., failure to produce a positive definite matrix), rather than being rejected by an immediate ValueError [1][2].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="bayes_opt/bayesian_optimization.py"
# Show a small window around the line mentioned in the review comment
nl -ba "$FILE" | sed -n '70,130p'
# Also search for any mention of alpha in the file
rg -n "alpha" "$FILE" || true

Repository: bayesian-optimization/BayesianOptimization

Length of output: 129


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="bayes_opt/bayesian_optimization.py"
nl -ba "$FILE" | sed -n '70,130p'
rg -n "alpha" "$FILE" || true

Repository: bayesian-optimization/BayesianOptimization

Length of output: 129


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="bayes_opt/bayesian_optimization.py"

awk 'NR>=70 && NR<=130 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo "----"
rg -n "alpha" "$FILE" || true

Repository: bayesian-optimization/BayesianOptimization

Length of output: 2876


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="bayes_opt/bayesian_optimization.py"
awk 'NR>=120 && NR<=170 {printf "%d\t%s\n", NR, $0}' "$FILE"

Repository: bayesian-optimization/BayesianOptimization

Length of output: 2137


Validate alpha is strictly positive in BayesianOptimization.__init__

alpha is passed directly to sklearn.gaussian_process.GaussianProcessRegressor(alpha=...), and scikit-learn doesn’t enforce alpha > 0 at input time—nonpositive values may fail later during GP fitting. Add an explicit check to fail fast with a clear ValueError.

🛡️ Suggested validation
 def __init__(
     self,
     f: Callable[..., float] | None,
     pbounds: BoundsMapping,
     acquisition_function: AcquisitionFunction | None = None,
     constraint: NonlinearConstraint | None = None,
     random_state: int | RandomState | None = None,
     alpha: float = 1e-6,
     verbose: int = 2,
     bounds_transformer: DomainTransformer | None = None,
     allow_duplicate_points: bool = False,
 ):
+    if alpha <= 0:
+        msg = "alpha must be positive"
+        raise ValueError(msg)
     self._random_state = ensure_rng(random_state)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    alpha: float = 1e-6,
+def __init__(
+    self,
+    f: Callable[..., float] | None,
+    pbounds: BoundsMapping,
+    acquisition_function: AcquisitionFunction | None = None,
+    constraint: NonlinearConstraint | None = None,
+    random_state: int | RandomState | None = None,
+    alpha: float = 1e-6,
+    verbose: int = 2,
+    bounds_transformer: DomainTransformer | None = None,
+    allow_duplicate_points: bool = False,
+):
+    if alpha <= 0:
+        msg = "alpha must be positive"
+        raise ValueError(msg)
+    self._random_state = ensure_rng(random_state)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bayes_opt/bayesian_optimization.py` at line 97, In
BayesianOptimization.__init__, add an explicit validation that the alpha
parameter is strictly positive (alpha > 0) before it’s passed to
sklearn.gaussian_process.GaussianProcessRegressor; if the check fails raise a
ValueError with a clear message (e.g., "alpha must be > 0, got {alpha}") so the
constructor fails fast and avoids later GP fitting errors. Ensure this check
sits at the top of __init__ (next to the alpha parameter handling) and covers
both int/float inputs and rejects non-positive values.
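
As a rough illustration of the fail-fast check described above, the validation could be factored into a small standalone helper. This is only a sketch: `validate_alpha` is a hypothetical name that does not exist in bayes_opt, and the sketch assumes alpha is a scalar, so it would reject the per-sample array form that scikit-learn's GaussianProcessRegressor also accepts. Rejecting booleans, NaN, and infinity goes slightly beyond what the review asks for and is likewise an assumption.

```python
import math
from numbers import Real


def validate_alpha(alpha):
    """Hypothetical helper: return alpha as a float if it is a finite,
    strictly positive real number; otherwise raise.

    Not part of bayes_opt. Scalar-only by assumption.
    """
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(alpha, bool) or not isinstance(alpha, Real):
        raise TypeError(f"alpha must be a real number, got {type(alpha).__name__}")
    # NaN compares False against everything, so test finiteness first.
    if not math.isfinite(alpha) or alpha <= 0:
        raise ValueError(f"alpha must be > 0, got {alpha}")
    return float(alpha)


print(validate_alpha(1e-6))
print(validate_alpha(1))
```

Calling such a helper at the top of the constructor would keep the error message in one place and make the check easy to unit test independently of GP fitting.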
