Software Gardening Part 2 — v0.5.1 maintenance release#368
Conversation
ty was causing too many issues for this task
|
@d33bs - please ignore for now |
ml-dtypes 0.2.0 (a TF 2.14 dep) was compiled against the numpy 1.x C ABI; numpy 2.x changed the ABI and broke bfloat16() at import time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
uv generates lockfiles at lock time on the resolving platform; the tensorflow meta-package's conditional dep on tensorflow-macos was never added because the lock was generated on Linux. Declaring it explicitly with a platform marker forces uv to include it in the cross-platform lockfile so uv sync --frozen works on macOS CI runners. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
build_model names the GlobalAveragePooling2D layer "pool5"; the test config still had the old name "features" from before the refactor. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Runs pytest -m integration on ubuntu-24.04/Python 3.11 only, after the main matrix passes. Downloads the Cell Painting CNN v1 checkpoint from Zenodo and validates the full profiling pipeline end-to-end. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ert 1280-d embedding - Build model with num_classes=490 to match the actual Zenodo checkpoint so load_weights genuinely validates all weights including the head - Add train.validation.batch_size to zenodo_config; required by SingleImageCropGenerator.start() - Assert features.shape == (2, 1280) to explicitly verify the pool5 embedding dimension Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
addopts in pyproject.toml applies globally; without overriding it the job was running with conflicting -m flags. Clear addopts so the job runs only the three @pytest.mark.integration tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The try/except approach assumed strict loading could succeed, but the Dense classifier head will virtually never match the user's class count. Always renaming it to 'classifier' before load_weights(by_name=True) means the checkpoint's Dense head is unconditionally skipped and only backbone weights are loaded — which is all profiling needs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@jccaicedo - tagging you for visibility. If you're able, we'd benefit from your comments as well. Thanks! |
|
|
||
| python-type-checks: | ||
| name: Python type checks (ty) | ||
| name: Python type checks (mypy) |
There was a problem hiding this comment.
Looks like we slid back to mypy. I don't suggest this as a sustainable path forward, there are just too many edge cases which mypy forces.
There was a problem hiding this comment.
i was unimpressed by ty! It was providing many false positives, which mypy avoided. I'll drop it now though, heeding your warnings.
| if: (matrix.os != 'ubuntu-24.04') || (matrix.python-version != env.TARGET_PYTHON_VERSION) | ||
| run: uv run pytest ${{ github.event.inputs.pytest_addopts }} | ||
|
|
||
| zenodo-integration-test: |
There was a problem hiding this comment.
Keeping this clear (Zenodo archive integration issues are top of mind for me and initially misread).
| zenodo-integration-test: | |
| zenodo-model-integration-test: |
| - Python 3.10+ is required. | ||
| - Linux (Ubuntu 20.04+) or macOS. | ||
| - For GPU acceleration, a CUDA-compatible GPU is recommended. | ||
| - Python 3.10 or 3.11 |
There was a problem hiding this comment.
Later Python version compatibility should be a priority for us on this project soon, Python 3.10 will be EoL this year, meaning we only have one version of compatibility afterwards.
There was a problem hiding this comment.
yes, agree... if we were a bit later in revitalizing this project, we would have had to use a different strategy given the narrowing window of dependency support
| class IlluminationStatistics(): | ||
| """Accumulate pixel statistics across all images in one plate. | ||
|
|
||
| Builds a per-channel pixel histogram and a mean image (at downscaled |
There was a problem hiding this comment.
WHOA! I didn't realize DeepProfiler offered illumination correction. I think this is something we can test to see the performance compared to CellProfiler, especially since this looks to be fairly parameter-free. I wonder why this is kept under DeepProfiler though, feels like this could be an independent software for IC in cytomining, but just my first impression.
| # Read metadata and split dataset in training and validation | ||
|
|
||
| def read_dataset(config, mode='train'): | ||
| """Build an :class:`ImageDataset` from a config dict. |
There was a problem hiding this comment.
Seeing the missing docstrings replaced: consider adding the pydocstring ruff linter config to the pyproject.toml.
| - **Do not modify deprecated code** unless fixing a bug that affects the current release. | ||
| Deprecated paths are: `deepprofiler/learning/`, `deepprofiler/imaging/augmentations.py`, `deepprofiler/imaging/cropping.py`, `deepprofiler/dataset/sampling.py`, `plugins/`. | ||
| - **Do not anticipate the v0.4.x rewrite** in v0.3.x changes. | ||
| - **Do not anticipate the v0.6.x PyTorch rewrite** in v0.5.1 changes. |
There was a problem hiding this comment.
I feel like we should avoid version specifics in the agents file. Perhaps this is better suited for the roadmap with strong guidance here to abide what's in the roadmap as per existing git tags and upstream repo status?
There was a problem hiding this comment.
yep, i agree - i'll update this file
| useful for fine-tuning experiments. | ||
| """ | ||
| num = config["train"]["model"]["params"]["conv_blocks"] | ||
| assert num in _EFFICIENTNET_MODELS, f"{num} not in supported EfficientNet variants: {list(_EFFICIENTNET_MODELS)}" |
There was a problem hiding this comment.
Consider removing these asserts and using if-cases with specific Exceptions instead.
There was a problem hiding this comment.
Also, we can include another ruff linter rule to avoid these: https://docs.astral.sh/ruff/rules/assert
| skimage.io.imsave(tmp_path + "/rand_img_1.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16)) | ||
| skimage.io.imsave(tmp_path + "/rand_img_2.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16)) | ||
| skimage.io.imsave(tmp_path + "/rand_img_3.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16)) | ||
| skimage.io.imsave(tmp_path + "/rand_outlines.png", numpy.random.randint(256, size=(16, 16), dtype=numpy.uint16)) |
There was a problem hiding this comment.
Is there a way to test a real dataset here too? Perhaps something from the publication? It need not be a "full dataset" but having something "real" would probably give us better observation of the code changes. Maybe this is something to think about including in OME-IRIS too, eventually.
There was a problem hiding this comment.
added in ae4db39 - thanks for this suggestion!
|
thanks for the review @d33bs ! I'll keep this PR open for ~1 week to give @jccaicedo a chance to comment. Juan, please let us know if you need more time! |
Reformatted all markdown files to one sentence per line