Skip to content

perf(monorepo): cache pkg_strategy + pkg_tags, batch ls-remote, prune dead walks #504

@BryanFRD

Description

@BryanFRD

Concrete perf followups from the gix migration audit. Cumulative ~1.5-6s on mono-large.

Items

  • Cache `pkg_strategy` + `pkg_tags` per package (`src/monorepo/run/mod.rs:182-186, 253-256, 314-317, 563-566`). `pkg.effective_versioning(&workspace, &tags_for_package(&all_tags, &prefix))` is called 3–4× per package. `tags_for_package` is a linear scan over `all_tags` allocating a `Vec<&str>` each time. For 200 pkg × 4 calls × ~1000 tags = 800k iterations. Compute once at the top of the per-package loop and reuse. ~50–150ms saved on mono-large.

  • `is_package_touched` quadratic over changed files × shared paths (`src/monorepo/util.rs:77-108`). Per-package iteration over `changed_files` with `f.starts_with(prefix)`. For 200 pkg × 500 changed_files × ~3 shared_paths = 300k string compares. Pre-sort `changed_files` once, then `binary_search` the prefix or use `BTreeSet::range(prefix..)`. ~20–100ms saved when `recover_missed_releases=true`.

  • Cache parsed semver in TagIndexEntry (`src/git/tags.rs:93-125`). `find_highest_semver_tag` parses `semver::Version` on every entry in every call. For 200 pkg × ~1000 tags this is 200k re-parses. Add `parsed_version: Optionsemver::Version` to `TagIndexEntry`, populated once at `TagIndex::build` time. ~30–80ms saved.

  • Replace per-tag `git rev-list -n 1` in push_tags (`src/git/push.rs:13-23`, called in loop at line 186). On Windows each `CreateProcess` is 10-20ms. For 200 tags pushed: 2-4s. Resolve via gix: `repo.find_reference(&format!("refs/tags/{tag}")).peel_to_id_in_place()` (already have a `Repository` handle). ~1-4s saved on push-heavy releases.

  • Skip `collect_dirty_files` when no hook defined for the package (`src/monorepo/run/mod.rs:490, 672`). `collect_dirty_files` shells out to `git status --porcelain` once per package per hook context (2 contexts: post-bump + pre-commit). Gate by `resolve_hook(...).is_some()` — line 489 already does it for post-bump, line 671 doesn't. ~100ms–4s saved depending on hook usage.

  • Move OrphanedTagStrategy::Warn early-return in find_matching_commit (`src/git/tags.rs:225-265`). When strategy is Warn (default), the function does the 1000-commit walk then returns None at line 259. Early-return before the walk. ~5-30ms per orphan, irrelevant in healthy repos but free win.

  • Resolve `push_url` once and pass down (`src/git/push.rs:97, 172, 261, 288, 404` + `fetch.rs:10`). gix re-opens remotes on each call. ~5-15ms saved, trivial refactor.

Methodology

After each item, re-run `bench/micro` and `bench/competitive` on mono-large. Target: cumulative 100-300ms shaved off scan + 1-4s off push step. Track regressions on the micro benches that exercise these paths (`git_collect_tags`, `git_find_tag`, `full_check_flow`).

Related #471 (root perf N+1 audit), #476 (rayon parallelization — should land after these caching wins so the parallel work itself is cheaper).

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementImprovement to existing feature

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions