
fix: resolve evaluation metrics bugs in Classification_Transformers#229

Open
rixav77 wants to merge 1 commit into ML4SCI:main from rixav77:fix/classification-transformers-eval-metrics

Conversation


@rixav77 rixav77 commented May 7, 2026

Summary

Fixes #192

Three bugs in the evaluation pipeline of DeepLense_Classification_Transformers_Archil_Srivastava produce incorrect metrics:

Bug 1: micro_auroc always NaN

The micro_auroc list is initialized but never populated — np.mean([]) silently returns nan. Added the missing auroc_fn(..., average="micro") call.
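A minimal sketch of the failure mode and the shape of the fix. The variable names and the use of scikit-learn's `roc_auc_score` here are illustrative assumptions; the repository computes AUROC via its own `auroc_fn`, which this sketch only approximates.

```python
import warnings

import numpy as np
from sklearn.metrics import roc_auc_score

# The symptom: averaging an empty list yields NaN, with only a RuntimeWarning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    assert np.isnan(np.mean([]))

# The fix, in spirit: actually populate the list before averaging.
# Toy one-hot labels and predicted probabilities (illustrative data).
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.6, 0.3, 0.1]])

micro_auroc = []
micro_auroc.append(roc_auc_score(y_true, y_prob, average="micro"))
print(np.mean(micro_auroc))  # a real number now, not NaN
```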

Bug 2: Missing softmax dimension

torch.nn.functional.softmax(metrics["logits"]) on line 172 omits the dim argument, triggering a deprecation warning and potentially incorrect behavior. Line 158 already uses dim=-1 correctly — applied the same fix for consistency.
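The fix is a one-argument change; a self-contained sketch (the tensor values are made up for illustration):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 0.3]])

# Calling F.softmax(logits) without dim= triggers a deprecation warning and
# leaves PyTorch to guess the reduction dimension. Making the class
# dimension explicit removes the ambiguity.
probs = F.softmax(logits, dim=-1)

print(probs.sum(dim=-1))  # each row now sums to 1.0
```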

Bug 3: Hardcoded W&B entity

entity="_archil" is hardcoded in both train.py and eval.py, causing authentication errors for other contributors. Replaced with a configurable --entity CLI argument that defaults to the WANDB_ENTITY environment variable (or None if unset, which lets W&B use the logged-in user's default entity).

Changes

  • eval.py: Add micro_auroc computation, add dim=-1 to softmax, add --entity arg
  • train.py: Add --entity arg, replace hardcoded entity

Test plan

  • Verify micro_auroc is no longer NaN after evaluation
  • Verify no deprecation warning from softmax call
  • Verify training works without --entity flag (defaults to logged-in W&B user)
  • Verify --entity my_team overrides correctly

Fixes ML4SCI#192

- Add missing micro_auroc computation (was initialized but never
  populated, causing np.mean([]) to silently return NaN)
- Add explicit dim=-1 to softmax call in ROC curve plotting to
  match the correct usage elsewhere and suppress deprecation warning
- Replace hardcoded W&B entity "_archil" with configurable --entity
  CLI arg that falls back to WANDB_ENTITY env var, allowing other
  contributors to use their own W&B accounts
Copilot AI review requested due to automatic review settings May 7, 2026 08:31

Copilot AI left a comment


Pull request overview

Fixes incorrect and unusable evaluation logging in DeepLense_Classification_Transformers_Archil_Srivastava by addressing a missing metric computation, a PyTorch softmax API misuse, and a hardcoded W&B configuration.

Changes:

  • Compute and log micro_auroc during evaluation (previously always NaN due to an empty list).
  • Specify dim=-1 in the ROC softmax call to avoid deprecated/ambiguous behavior.
  • Replace hardcoded W&B entity with a configurable --entity CLI argument (defaulting to WANDB_ENTITY).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| DeepLense_Classification_Transformers_Archil_Srivastava/eval.py | Adds the missing micro_auroc computation, fixes softmax(..., dim=-1), and makes the W&B entity configurable for evaluation runs. |
| DeepLense_Classification_Transformers_Archil_Srivastava/train.py | Makes the W&B entity configurable for training runs (removes the hardcoded entity). |


```diff
@@ -84,6 +85,7 @@ def evaluate(model, data_loader, loss_fn, device):
     # Wandb-specific params
     parser.add_argument("--runid", type=str, help="ID of train run")
```


Development

Successfully merging this pull request may close these issues.

[Bug] DeepLense_Classification_Transformers: Evaluation metrics bugs — micro_auroc always NaN, missing softmax dim, hardcoded W&B entity

2 participants