[EXPERIMENTAL]: Integrate cp-measure #982
Open
timtreis wants to merge 50 commits into
…to feature/add_cpmeasure
for more information, see https://pre-commit.ci
Member
Author
Note to self:
Member
Looking forward to this PR 👀🥸
Introduces _tiling.py with build_tile_specs() and extract_tile() that split a label image into overlapping tiles where each cell is assigned to exactly one tile by centroid. Non-owned cells are zeroed out so downstream processing never double-counts.
Includes 31 tests: deterministic brick-pattern grid (touching and non-touching), coverage verification, and visual regression tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
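The centroid-ownership rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in _tiling.py: the helper names `centroids` and `extract_owned_tile`, the square tile grid, and the symmetric `overlap` padding are all assumptions made for the example.

```python
import numpy as np


def centroids(labels):
    """Map each nonzero label id to its (y, x) centroid."""
    out = {}
    for lab in np.unique(labels):
        if lab == 0:
            continue
        ys, xs = np.nonzero(labels == lab)
        out[int(lab)] = (float(ys.mean()), float(xs.mean()))
    return out


def extract_owned_tile(labels, row, col, tile, overlap):
    """Crop one overlapping tile and zero every cell it does not own.

    Hypothetical helper: a cell is owned by the tile whose non-overlapping
    core region contains the cell's centroid.
    """
    h, w = labels.shape
    # Core region: the grid cell that decides ownership.
    y0, x0 = row * tile, col * tile
    y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
    owned = [
        lab
        for lab, (cy, cx) in centroids(labels).items()
        if y0 <= cy < y1 and x0 <= cx < x1
    ]
    # Padded crop: the core plus `overlap` pixels of context on each side.
    crop = labels[
        max(y0 - overlap, 0) : min(y1 + overlap, h),
        max(x0 - overlap, 0) : min(x1 + overlap, w),
    ]
    # Keep owned labels, zero everything else.
    return np.where(np.isin(crop, owned), crop, 0), owned
```

Because the core regions partition the image, every centroid falls into exactly one of them, which is what prevents a cell that straddles a tile boundary from being measured twice.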
Member
Author
Refactoring in anticipation of afermg/cp_measure#38 being merged so we can upstream behaviour.
This was referenced Apr 10, 2026
Wires the in-progress cp_measure.featurizer + lazy-tiling refactor onto a
working _tiling.py and closes out the six open notes on the PR.
_tiling.py:
* build_tile_specs now takes (shape, cell_info), so it is agnostic to
whether labels are in memory, dask-backed, or multiscale.
* compute_cell_info is public; new compute_cell_info_multiscale (read
coarsest scale, rescale to target) and compute_cell_info_tiled
(stream tiles, merge boundary-spanning cells via additive accumulators).
* extract_tile_lazy slices an xr.DataArray and materializes only the crop;
extract_tile retained for in-memory callers.
* verify_coverage takes a label_ids set.
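As an illustration of the "additive accumulators" idea mentioned for compute_cell_info_tiled, here is a minimal sketch. It assumes non-overlapping streaming chunks and tracks only pixel count and coordinate sums per label; the function names are hypothetical and the per-pixel loop is deliberately simple rather than fast.

```python
from collections import defaultdict

import numpy as np


def accumulate_tile(acc, tile, y_off, x_off):
    """Add one tile's per-label pixel count and coordinate sums into `acc`."""
    ys, xs = np.nonzero(tile)
    for lab, y, x in zip(tile[ys, xs], ys + y_off, xs + x_off):
        entry = acc[int(lab)]
        entry[0] += 1       # area in pixels
        entry[1] += int(y)  # sum of global y coordinates
        entry[2] += int(x)  # sum of global x coordinates


def centroids_streamed(labels, chunk):
    """Centroid per label, visiting the array one chunk at a time."""
    acc = defaultdict(lambda: [0, 0, 0])
    h, w = labels.shape
    for y0 in range(0, h, chunk):
        for x0 in range(0, w, chunk):
            accumulate_tile(acc, labels[y0 : y0 + chunk, x0 : x0 + chunk], y0, x0)
    # Sums are additive, so a cell split across chunks merges for free.
    return {lab: (sy / n, sx / n) for lab, (n, sy, sx) in acc.items()}
```

Counts and coordinate sums can be added across chunks in any order, so a boundary-spanning cell ends up with the same centroid as if the whole array had been read at once.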
_feature.py:
* Channel names: read via spatialdata.models.get_channel_names so c_coords
set at parse time flow through to output column suffixes.
* Progress: tqdm wrapper around joblib.Parallel(return_as='generator_unordered')
+ periodic logg.info('Tile {n}/{total} done (elapsed ...)') so non-TTY
runs (CI, slurm) also see progress.
* Alignment: _align_to_image_grid replaces the dim-mismatch raise with a
coordinate-system aware crop. Identity-or-integer-pixel-translation is
honored as a 1-to-1 pixel alignment; the overlap rectangle is processed
and out-of-extent cells are counted, not crashed on. Non-pixel-aligned
transforms either raise with a spatialdata.rasterize hint
(align_mode='strict', default) or trigger materialization via
spatialdata.rasterize (align_mode='rasterize') with a warning.
* DropReport: per-run counter for cells dropped due to extent, partial
boundary intersection, cp_measure no-data, or empty tiles. Emitted via
logg.info(report.summary()) at the end of every run.
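The overlap-rectangle handling for integer pixel translations can be pictured with a small sketch. `overlap_rectangle` is a hypothetical helper, not _align_to_image_grid itself; it assumes the labels array sits at an integer (y, x) offset in image pixel coordinates, with the identity transform being offset (0, 0).

```python
def overlap_rectangle(image_shape, labels_shape, offset_yx):
    """Overlap of a labels array placed at an integer offset in image pixels.

    Returns (image_slices, labels_slices), or None when the two extents
    do not intersect. Hypothetical helper for illustration only.
    """
    image_slices, labels_slices = [], []
    for img_len, lab_len, off in zip(image_shape, labels_shape, offset_yx):
        start = max(off, 0)                # first image pixel covered by labels
        stop = min(off + lab_len, img_len)  # one past the last covered pixel
        if stop <= start:
            return None
        image_slices.append(slice(start, stop))
        # Same window expressed in the labels array's own indices.
        labels_slices.append(slice(start - off, stop - off))
    return tuple(image_slices), tuple(labels_slices)
```

For example, `overlap_rectangle((10, 10), (6, 6), (7, -2))` selects image rows 7..9 and columns 0..3, paired with labels rows 0..2 and columns 2..5; cells outside that window are the ones a run would count as out of extent.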
Tests: 39 in test_tiling.py (was 30; new coverage for the lazy/multiscale
helpers + verify_coverage edge cases), 35 in test_calculate_image_features
including a TestPR982Concerns class with one regression test per open note.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* compute_cell_info_tiled: replace per-id np.where with scipy.ndimage.find_objects and np.bincount sums. One vectorized pass per tile instead of O(n_cells) scans.
* _zero_non_owned: replace per-id rewrite loop with np.isin + np.where.
* _classify_dropped_cells: drop the full-array .values + per-cell np.where; use compute_cell_info_tiled bboxes for inside/partial/outside classification, so the full label array is no longer materialized.
* CellInfo: add bbox_y0/bbox_x0 fields so callers can do bbox math without reconstructing from the centroid (which is area-weighted, not bbox-centered).
* _relabel_contiguous: replaced by skimage.segmentation.relabel_sequential.
* _align_to_image_grid: flatten nested if/else with elif chain; extract _rasterize_to_image_grid so the shapes-key path and the align_mode='rasterize' path no longer duplicate the rasterize call.
* DropReport: empty_tile_drop -> empty_tiles (the counter increments per tile).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
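To show why np.bincount removes the per-id scans, here is a minimal sketch of computing area and centroid for every label in one vectorized pass. `label_stats` is a hypothetical name, and the sketch assumes a 2D array of non-negative integer labels with 0 as background; it is not the PR's implementation.

```python
import numpy as np


def label_stats(labels):
    """Per-label (area, centroid_y, centroid_x) from np.bincount sums."""
    flat = labels.ravel()
    yy, xx = np.indices(labels.shape)
    # One pass each: pixel counts and coordinate sums, indexed by label id.
    area = np.bincount(flat)
    sum_y = np.bincount(flat, weights=yy.ravel())
    sum_x = np.bincount(flat, weights=xx.ravel())
    ids = np.nonzero(area)[0]
    ids = ids[ids != 0]  # drop background
    return {
        int(i): (int(area[i]), sum_y[i] / area[i], sum_x[i] / area[i])
        for i in ids
    }
```

The cost is a fixed number of passes over the tile regardless of how many cells it contains, whereas a loop of `np.where(labels == i)` rescans the tile once per cell.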
timtreis added a commit that referenced this pull request May 26, 2026
Five Sphinx warnings (treated as errors by ``-W``) on the docs build:
- ``:mod:`spatialdata_plot``` and ``:meth:`spatialdata.SpatialData.pl.show``` have no intersphinx targets.
- ``:class:`dask.distributed.Client``` has no intersphinx target either.
- ``:func:`~squidpy.experimental.im.calculate_image_features``` pointed at a function that does not exist on this branch (planned for PR #982).
Downgrade all five to plain double-backtick literals. Re-phrase the calculate_image_features reference to describe the shared tiling infrastructure (``squidpy.experimental.im._tiling``) without claiming a public function that has not yet shipped.
* _featurize_tile: accept a pre-built cp_config and drop the per-tile _build_cp_config rebuild. Config is now constructed once in calculate_image_features and reused across every tile (matters on 100kx100k images with thousands of tiles).
* pyproject: cp-measure>=0.1.4 -> >=0.1.19 to pick up the granularity correctness fix (#44, #47), 3D-only feature filtering (#35), and static typing (#45). No upper cap left in place; bump when upstream ships a breaking release.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* pyproject: cp-measure>=0.1.19,<0.2 -- pre-1.0 dep, cap upper bound so a future 0.2.x release doesn't silently break installs.
* _featurize_tile: cp_config is now keyword-only with a default of None and falls back to _build_cp_config when not supplied. Preserves the pre-hoist call signature for direct (test/notebook) callers while the caller-built reuse path in calculate_image_features still skips the per-tile rebuild.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Resolves add/add conflict on src/squidpy/experimental/im/_tiling.py between our cp_measure-driven lazy tiling refactor and PR #1157's tiling QC additions. Unified into one module:
* Kept our superset definitions of CellInfo (with bbox_y0/bbox_x0 defaults), TileSpec, build_tile_specs(shape, cell_info, ...), compute_cell_info, compute_cell_info_multiscale, compute_cell_info_tiled, extract_tile, extract_tile_lazy, verify_coverage, and the array-returning _zero_non_owned.
* Added extract_labels_tile_lazy(labels_da, spec) -- the labels-only crop variant from main, needed by tl/_tiling_qc.py. Implemented on top of our _zero_non_owned return style.
* __all__ now exports the new symbol.
Auto-merge restored main's new files (tl/_tiling_qc.py, pl/_tiling_qc.py, conftest.py, tests/_images/TilingQCVisual_*.png, test_tiling_qc.py); our earlier deletion of the old tl/_tiling_qc.py no longer applies -- the new QC implementation supersedes it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Aligns with main's tl/_tiling_qc.py tests that already call ``compute_cell_info_tiled(labels_da, chunk_size=...)`` and with the numpy/dask convention. Internal body and our own test in tests/experimental/test_tiling.py updated accordingly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Convention in the scverse ecosystem is to address channels by name only. Passing an int now raises TypeError; passing a non-existent name still raises ValueError as before. A channel whose name happens to be the string "0" is still accepted -- the check discriminates on Python type (isinstance(ch, str)), not on the string contents.
* _prepare_lazy: type hint list[str] | list[int] | None -> list[str] | None and add an isinstance(ch, str) guard before lookup.
* calculate_image_features: same type hint update.
* Docstring clarified that integer indices are not accepted.
* test_channel_selection_by_index renamed to test_channel_selection_rejects_int and now asserts TypeError.
* test_concern4_channel_subset_by_index renamed to ..._by_name and passes ["c0", "c2"] instead of [0, 2].
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
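A minimal sketch of the name-only channel check, assuming a plain list of available channel names. `resolve_channels` is a hypothetical helper written for this example and is not the body of _prepare_lazy.

```python
def resolve_channels(requested, available):
    """Translate requested channel names into positions in `available`.

    Hypothetical helper: rejects non-string entries by type, then
    rejects unknown names by value.
    """
    if requested is None:
        return list(range(len(available)))
    positions = []
    for ch in requested:
        # Type check first: the int 0 is rejected, the string "0" is not.
        if not isinstance(ch, str):
            raise TypeError(
                f"Channels must be given by name (str), got {type(ch).__name__}."
            )
        if ch not in available:
            raise ValueError(f"Channel {ch!r} not found; available: {available}.")
        positions.append(available.index(ch))
    return positions
```

With `available = ["c0", "c1", "c2"]`, the call `resolve_channels(["c0", "c2"], available)` returns `[0, 2]`, while `resolve_channels([0, 2], available)` raises TypeError before any lookup happens.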
This was referenced May 27, 2026
timtreis pushed a commit to timtreis/squidpy that referenced this pull request May 27, 2026
- Trim DropReport to `empty_tiles` (the only field this code ever increments)
- Narrow `align_mode` Literal to "strict" and add a runtime guard that catches dynamic callers passing other values
- Exclude `cpmeasure:*` flag names from the "available" list in the unknown-feature error (they always raise NotImplementedError; listing them as available is misleading)
- Raise on ambiguous mixes of `skimage:label` with `skimage:label:<prop>` (and same for `skimage:label+image`); the previous order-dependent behaviour silently expanded the narrowing form
- Collapse `pd.concat -> replace([inf,-inf],0) -> fillna(0) -> .values.astype(float32)` into a single numpy pass via `np.nan_to_num`; saves two full-table copies
- Use `pd.Categorical.from_codes` for the region column to avoid allocating an N-element Python list for a one-level categorical
- Hoist the labels_key/shapes_key XOR pick to one local
- Add `experimental.im.calculate_image_features` to docs/api.md
Also: remove all references to the multi-PR split (follow-up PRs, PR-2, PR-4, "in this PR", "cp_measure-as-default behaviour from PR scverse#982", TestPR982Concerns, "Concern N" markers) from code and tests.
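The `np.nan_to_num` collapse can be illustrated with a small numpy-only sketch. It assumes the per-tile results are already array-like blocks with matching column counts; `to_finite_float32` is a hypothetical name and the PR's actual code path (which starts from pandas objects) may differ.

```python
import numpy as np


def to_finite_float32(blocks):
    """Stack per-tile feature blocks and zero out NaN and +/-inf.

    Hypothetical helper: one concatenate, then one in-place cleanup.
    """
    stacked = np.concatenate(
        [np.asarray(b, dtype=np.float32) for b in blocks], axis=0
    )
    # copy=False cleans the freshly stacked array without another copy.
    return np.nan_to_num(stacked, copy=False, nan=0.0, posinf=0.0, neginf=0.0)
```

Doing the replacement on the stacked float32 array avoids the intermediate tables that a chained `replace(...)` and `fillna(...)` would each allocate.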
timtreis pushed a commit to timtreis/squidpy that referenced this pull request May 27, 2026
Associated notebook: https://github.com/scverse/squidpy_notebooks/blob/add_cpmeasure_notebook/tutorials/tutorial_cpmeasure.ipynb