Skip to content

Fix dynamic top-k index scores#98

Merged
gabrieletijunaityte merged 16 commits into
developfrom
aligment_experiments
May 27, 2026
Merged

Fix dynamic top-k index scores#98
gabrieletijunaityte merged 16 commits into
developfrom
aligment_experiments

Conversation

@vdplasthijs
Copy link
Copy Markdown
Collaborator

What does this PR do?

  • Calculate theta_k for train only, and reuse for val and test.
  • Catch edge cases max = min leading to zero div
  • Printing optional, off by default
  • Elbow method removes nans and skips leading zeros for tighter distribution.
  • Update default concept caption v

Before submitting

  • Did you make sure title is self-explanatory and the description concisely explains the PR?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you test your PR locally with pytest command?

- Calculate theta_k for train only, and reuse for val and test.
- Catch edge cases max = min leading to zero div
- Printing optional, off by default
- Elbow method removes nans and skips leading zeros for tighter distribution.
- Update default concept caption v
@vdplasthijs
Copy link
Copy Markdown
Collaborator Author

All index scores should now be bound between -1 and 1.

Copy link
Copy Markdown
Contributor

@gabrieletijunaityte gabrieletijunaityte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you for the changes.

@gabrieletijunaityte
Copy link
Copy Markdown
Contributor

Do you think we could cache these elbow values as the data split is consistent? Or is the computation very quick?

@vdplasthijs
Copy link
Copy Markdown
Collaborator Author

@gabrieletijunaityte thanks! What do you mean by cached here? The values are only computed for the train split; then cached (in the dictionary) and retrieved for the other splits. See ‎src/models/text_alignment_model.py‎ line 134.

Comment thread src/models/text_alignment_model.py Outdated
@vdplasthijs
Copy link
Copy Markdown
Collaborator Author

Should be OK now @gabrieletijunaityte !

Comment thread src/models/text_alignment_model.py Outdated
vdplasthijs and others added 8 commits May 20, 2026 11:42
- Calculate thresholds on train set.
- Option (default True) to use stored thresholds if available.
- Option (default True) to save newly calculated values.
- Option (default False) to also compute new thresholds if old thresholds are available for some concepts.
v2 (old) has hand picked theta values.
v3 uses the values from v2 but adds baseline values.
v4 has newly calculated thresholds (using elbow).
@vdplasthijs
Copy link
Copy Markdown
Collaborator Author

@gabrieletijunaityte could you please review again? I made the following changes:

  • Three things need to be stored: theta_k and is_max (same for all splits), and n_baseline (different per split). These can now be stored by the caption builder by passing the concept_config object.
  • I thought about moving all the calculations to a separate pre-processing file .. I see the upside of cleaner code, but downside is that you would still need to load the data via a datamodule (to get the splits, to get the separate n_baseline), which then needs config files etc. I think it's easier (for new UCs) to just do it automatically, but only if needed. This is similar to how we handle creating data splits (also in datamodule). Hope that makes sense!
  • Now; it will only calculate if values are not present for all concepts, or if requested specifically even if they are present (default False). Thresholds are calculated using the training data only. :)
  • By default, if new values are calculated, they are stored in a new file (1 version up).
  • It checks whether the is_max parameter matches the data (so is_max=True when the minority of points is greater than theta_k). If it does not match a pre-set is_max, it drops the caption. (If no is_max is set, it calculates and adds it).

I think that's all! Needed more features than I initially thought but all works well now. The only thing that I haven't implemented is to look for uni- vs bimodal distributions and add both max and min accordingly. Not sure if it's worth it, but it's possible ..

@vdplasthijs
Copy link
Copy Markdown
Collaborator Author

@gabrieletijunaityte could you please review again? I have now:

  • Moved all these functionalities to data module.
  • Storing baseline accuracy instead of n baseline.
  • Fixed metric logging.

Copy link
Copy Markdown
Contributor

@gabrieletijunaityte gabrieletijunaityte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for all the iterations!

@gabrieletijunaityte gabrieletijunaityte merged commit 4bd13de into develop May 27, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants