Software Gardening Part 2 — v0.5.1 maintenance release by gwaybio · Pull Request #368 · cytomining/DeepProfiler

gwaybio · 2026-06-12T17:35:45Z

Removed dead code. We deleted the crop_generators/ plugin dispatch system (replaced by a direct import of SingleImageCropGenerator), the dataset/sampling.py single-cell export module, training-related methods from CropGenerator, and the comet_ml/dead imports from main.py
Simplified profiling. Profile now directly instantiates SingleImageCropGenerator instead of going through dynamic importlib dispatch; checkpoint loading always renames the Dense head and loads backbone weights by name, removing a fragile try/except
Bolstered documentation. We added module docstrings and method docstrings to profiling.py, cropping.py, and all retained dataset/ and imaging/ modules
Updated README. We added sections covering how inference works, image preparation (illumination correction + compression), large-scale parallel profiling with split, and verifying the installation
Updated ROADMAP and AGENTS.md. We aligned version numbers (v0.5.1 current, v0.6.x PyTorch rewrite), described the agentic AI vision, and documented scope expectations for contributors
Reformatted all markdown files to one sentence per line
Dependency fixes. Very narrow, fragile range but it works. We pinned numpy<2.0 to fix TF 2.14 / ml-dtypes ABI incompatibility; added tensorflow-macos as an explicit platform-conditional dependency so uv sync --frozen works on macOS arm64 CI runners
Integration test suite. We added @pytest.mark.integration tests that download the Cell Painting CNN v1 checkpoint from Zenodo, verify the 1280-d pool5 embedding, and run the full Profile pipeline end-to-end; added a dedicated zenodo-integration-test CI job that runs these separately from the main matrix

ty was causing too many issues for this task

gwaybio · 2026-06-12T17:36:15Z

@d33bs - please ignore for now

ml-dtypes 0.2.0 (a TF 2.14 dep) was compiled against the numpy 1.x C ABI; numpy 2.x changed the ABI and broke bfloat16() at import time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

uv generates lockfiles at lock time on the resolving platform; the tensorflow meta-package's conditional dep on tensorflow-macos was never added because the lock was generated on Linux. Declaring it explicitly with a platform marker forces uv to include it in the cross-platform lockfile so uv sync --frozen works on macOS CI runners. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

build_model names the GlobalAveragePooling2D layer "pool5"; the test config still had the old name "features" from before the refactor. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Runs pytest -m integration on ubuntu-24.04/Python 3.11 only, after the main matrix passes. Downloads the Cell Painting CNN v1 checkpoint from Zenodo and validates the full profiling pipeline end-to-end. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ert 1280-d embedding - Build model with num_classes=490 to match the actual Zenodo checkpoint so load_weights genuinely validates all weights including the head - Add train.validation.batch_size to zenodo_config; required by SingleImageCropGenerator.start() - Assert features.shape == (2, 1280) to explicitly verify the pool5 embedding dimension Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

addopts in pyproject.toml applies globally; without overriding it the job was running with conflicting -m flags. Clear addopts so the job runs only the three @pytest.mark.integration tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The try/except approach assumed strict loading could succeed, but the Dense classifier head will virtually never match the user's class count. Always renaming it to 'classifier' before load_weights(by_name=True) means the checkpoint's Dense head is unconditionally skipped and only backbone weights are loaded — which is all profiling needs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

gwaybio · 2026-06-16T16:28:54Z

@d33bs - this is now ready for your eyes. This was a much heavier lift than I naively expected (see my first failed attempt #367 ) Essentially, for any DeepProfiler PyPI release to be valid, we needed to overhaul the project sooner than I had estimated. See above for updated PR description

gwaybio · 2026-06-16T16:31:04Z

@jccaicedo - tagging you for visibility. If you're able, we'd benefit from your comments as well. Thanks!

d33bs

Nice! Looking good to me.

d33bs · 2026-06-16T20:51:19Z


  python-type-checks:
-    name: Python type checks (ty)
+    name: Python type checks (mypy)


Looks like we slid back to mypy. I don't suggest this as a sustainable path forward, there are just too many edge cases which mypy forces.

i was unimpressed by ty! It was providing many false positives, which mypy avoided. I'll drop it now though, heeding your warnings.

d33bs · 2026-06-16T20:53:30Z

        if: (matrix.os != 'ubuntu-24.04') || (matrix.python-version != env.TARGET_PYTHON_VERSION)
        run: uv run pytest ${{ github.event.inputs.pytest_addopts }}

+  zenodo-integration-test:


Keeping this clear (Zenodo archive integration issues are top of mind for me and initially misread).

Suggested change

zenodo-integration-test:

zenodo-model-integration-test:

d33bs · 2026-06-16T20:56:06Z

- Python 3.10+ is required.
- Linux (Ubuntu 20.04+) or macOS.
- For GPU acceleration, a CUDA-compatible GPU is recommended.
+- Python 3.10 or 3.11


Later Python version compatibility should be a priority for us on this project soon, Python 3.10 will be EoL this year, meaning we only have one version of compatibility afterwards.

yes, agree... if we were a bit later in revitalizing this project, we would have had to use a different strategy given the narrowing window of dependency support

d33bs · 2026-06-16T21:01:20Z

 class IlluminationStatistics():
+    """Accumulate pixel statistics across all images in one plate.
+
+    Builds a per-channel pixel histogram and a mean image (at downscaled


This sounds useful!

yes! I meant to tag @jenna-tomkinson 👀

WHOA! I didn't realize DeepProfiler offered illumination correction. I think this is something we can test to see the performance compared to CellProfiler, especially since this looks to be fairly parameter-free. I wonder why this is kept under DeepProfiler though, feels like this could be an independent software for IC in cytomining, but just my first impression.

d33bs · 2026-06-16T21:03:57Z

-    # Read metadata and split dataset in training and validation
+
+def read_dataset(config, mode='train'):
+    """Build an :class:`ImageDataset` from a config dict.


Seeing the missing docstrings replaced: consider adding the pydocstring ruff linter config to the pyproject.toml.

thanks, added!

d33bs · 2026-06-16T21:10:10Z

 - **Do not modify deprecated code** unless fixing a bug that affects the current release.
  Deprecated paths are: `deepprofiler/learning/`, `deepprofiler/imaging/augmentations.py`, `deepprofiler/imaging/cropping.py`, `deepprofiler/dataset/sampling.py`, `plugins/`.
- **Do not anticipate the v0.4.x rewrite** in v0.3.x changes.
+- **Do not anticipate the v0.6.x PyTorch rewrite** in v0.5.1 changes.


I feel like we should avoid version specifics in the agents file. Perhaps this is better suited for the roadmap with strong guidance here to abide what's in the roadmap as per existing git tags and upstream repo status?

yep, i agree - i'll update this file

d33bs · 2026-06-16T21:11:07Z

+    useful for fine-tuning experiments.
+    """
+    num = config["train"]["model"]["params"]["conv_blocks"]
+    assert num in _EFFICIENTNET_MODELS, f"{num} not in supported EfficientNet variants: {list(_EFFICIENTNET_MODELS)}"


Consider removing these asserts and using if-cases with specific Exceptions instead.

Also, we can include another ruff linter rule to avoid these: https://docs.astral.sh/ruff/rules/assert

added, thanks!!

d33bs · 2026-06-16T21:13:17Z

+    skimage.io.imsave(tmp_path + "/rand_img_1.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16))
+    skimage.io.imsave(tmp_path + "/rand_img_2.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16))
+    skimage.io.imsave(tmp_path + "/rand_img_3.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16))
+    skimage.io.imsave(tmp_path + "/rand_outlines.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16))


Is there a way to test a real dataset here too? Perhaps something from the publication? It need not be a "full dataset" but having something "real" would probably give us better observation of the code changes. Maybe this is something to think about including in OME-IRIS too, eventually.

added in ae4db39 - thanks for this suggestion!

gwaybio · 2026-06-16T22:34:26Z

thanks for the review @d33bs ! I'll keep this PR open for ~1 week to give @jccaicedo a chance to comment. Juan, please let us know if you need more time!

gwaybio added 2 commits June 12, 2026 11:11

switch to mypy

cceb8c4

ty was causing too many issues for this task

move to dynamic versioning and add uv

fabf0a1

gwaybio marked this pull request as ready for review June 12, 2026 17:36

gwaybio requested a review from d33bs as a code owner June 12, 2026 17:36

gwaybio and others added 24 commits June 16, 2026 05:27

fix pandas bugs

3e1d24b

remove plugins

e53a831

drop learning module

966d227

refactor profiling and crop generation

aaf62bd

png

f8c6639

delete augmentations (leftover training)

b3925a9

remove training artifacts

53e5815

pin narrow

5b2b1ca

add test for zenodo checkpoint use

4e9abd9

specify integration test

4ac5364

specify new v0.5.1 details

aeaded0

bolster docs

2e4e8a7

update markdown files

5539d70

remove dedicated crop generator

c09ba52

remove dedicated sampling

6c5a429

bolster docs and update tests for refactor

66954b1

pin numpy<2.0 to fix TensorFlow 2.14 compatibility

eeb94a4

ml-dtypes 0.2.0 (a TF 2.14 dep) was compiled against the numpy 1.x C ABI; numpy 2.x changed the ABI and broke bfloat16() at import time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix tests

ce35c67

fix test config feature_layer name to match build_model

512b2ce

build_model names the GlobalAveragePooling2D layer "pool5"; the test config still had the old name "features" from before the refactor. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

gwaybio changed the title ~~More software gardening and rescoping~~ Software Gardening Part 2 — v0.5.1 maintenance release Jun 16, 2026

d33bs approved these changes Jun 16, 2026

View reviewed changes

gwaybio added 5 commits June 16, 2026 16:00

add pydocstyle

85c46f8

update docstrings

335525e

respond to review

a6303b1

drop mypy

41046aa

add real data test

ae4db39

Conversation

gwaybio commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gwaybio commented Jun 12, 2026

Uh oh!

gwaybio commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gwaybio commented Jun 16, 2026

Uh oh!

d33bs left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gwaybio commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gwaybio commented Jun 12, 2026 •

edited

Loading

gwaybio commented Jun 16, 2026 •

edited

Loading

gwaybio commented Jun 16, 2026 •

edited

Loading