
Implement Lazy Cachi2 Proxy for on-demand local builds #3351

@jiridanek

Description


Problem

While prefetch-all.sh guarantees offline reproducibility for hermetic builds, running it locally means downloading gigabytes of dependencies (pip wheels, RPMs, npm tarballs) before podman build can even start. Even with architecture filtering, this upfront cost degrades the local developer experience.

Proposed Solution (Phase 2: Long-Term Local Dev)

Implement a Lazy Cachi2 Proxy Daemon. Instead of downloading everything upfront to cachi2/output/, we run a lightweight local userspace server that intercepts the container's file requests and fetches dependencies from the internet on-demand.

Architectural Strategy

  1. The Transport Layer:
    • Option A (FUSE via Privileged Container): Run a FUSE daemon in a privileged container on the Podman Machine (Linux VM). This bypasses the need for developers to install macFUSE (and reboot into macOS Recovery Mode) while providing a 100% POSIX-compliant local mount to the build container.
    • Option B (Userspace NFS Server): Run an unprivileged Go NFS server (willscott/go-nfs) on the host. Podman mounts it directly via --mount type=volume,volume-opt=type=nfs.
  2. Language: Go (e.g., hanwen/go-fuse or willscott/go-nfs) or Rust (fuser) is recommended for handling high-concurrency FUSE/NFS translation and HTTP streaming safely.
  3. The VFS (Virtual File System): At startup, the daemon parses the lockfiles (pylock.toml, package-lock.json, rpms.lock.yaml) and builds an in-memory tree. stat() and readdir() calls return instantly using the pre-calculated sizes and checksums.
  4. Demand-Driven Fetching (Blocking Open): We do not need to predict the dependency graph. When the container executes RUN pip install and issues an open() syscall for numpy.whl, the daemon blocks the thread, downloads the file to a persistent host cache, and then returns success. The container build simply sees a file that took a few seconds to open.
  5. Concurrency (Multi-Build Support): To support concurrent podman build instances, each build gets its own lightweight proxy instance. They share a global host cache (e.g., /var/tmp/lazy_cachi2_store). We use advisory filesystem locking (flock) on the cache files to prevent redundant downloads across concurrent daemons, and write each download to a temporary file that is atomically renamed into place, so an interrupted download never leaves a corrupt cache entry.
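
A minimal sketch of how steps 4 and 5 could fit together: the handler for a blocking open() calls one function that takes an exclusive flock, checks the shared cache, downloads only on a miss, and publishes the file with an atomic rename. It is written in Python for brevity, although the real daemon would more likely be Go or Rust as noted above. The function name `fetch_into_cache`, the content-addressed layout (one file per SHA-256 digest), and the injected `download` callable are assumptions made for illustration, not part of the proposal.

```python
import fcntl
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def fetch_into_cache(store: Path, sha256: str, download: Callable[[], bytes]) -> Path:
    """Return the cached file for `sha256`, downloading it at most once.

    Hypothetical helper: `store` is the shared host cache directory and
    `download` is whatever performs the actual HTTP fetch.
    """
    store.mkdir(parents=True, exist_ok=True)
    final = store / sha256
    lock_path = store / (sha256 + ".lock")

    with open(lock_path, "w") as lock:
        # Advisory lock: a second daemon asking for the same file blocks here
        # until the first one has finished (or failed) its download.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            if final.exists():
                return final  # cache hit, possibly filled by another daemon

            data = download()
            if hashlib.sha256(data).hexdigest() != sha256:
                raise ValueError(f"checksum mismatch for {sha256}")

            # Write to a temp file in the same directory, then rename.
            # os.replace is atomic on POSIX within one filesystem, so readers
            # never observe a partially written cache entry.
            fd, tmp = tempfile.mkstemp(dir=store, prefix=sha256 + ".tmp")
            try:
                with os.fdopen(fd, "wb") as f:
                    f.write(data)
                    f.flush()
                    os.fsync(f.fileno())
                os.replace(tmp, final)
            finally:
                if os.path.exists(tmp):
                    os.unlink(tmp)  # clean up only if the rename did not happen
            return final
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

The sketch holds the whole payload in memory; a real implementation would stream the response to the temporary file and hash it incrementally, since wheels and tarballs can be large.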

Exceptions

RPMs: dnf requires repodata/ indices upfront and won't fetch lazily. The proxy will still download RPMs upfront to run createrepo_c, but it will cache the generated repodata/ directory under a key derived from a hash of the lockfile. For the vast majority of local rebuilds the lockfile hash won't change, so the build gets an instant cache hit and bypasses the RPM prefetch entirely.
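
The cache-key idea above can be sketched as follows, again in Python for brevity. The directory layout (`rpm-repos/<digest>/`) and both function names are hypothetical. The only fact relied on is that createrepo_c writes a `repodata/repomd.xml` index, which is used here as the marker for a completed repository.

```python
import hashlib
from pathlib import Path


def rpm_repo_cache_dir(lockfile: Path, cache_root: Path) -> Path:
    """Map the lockfile's content hash to a cache directory (hypothetical layout)."""
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    return cache_root / "rpm-repos" / digest


def needs_rpm_prefetch(lockfile: Path, cache_root: Path) -> bool:
    """True when no finished repository exists for this exact lockfile content."""
    repomd = rpm_repo_cache_dir(lockfile, cache_root) / "repodata" / "repomd.xml"
    return not repomd.exists()
```

Because the key is the hash of the lockfile content rather than its path or modification time, any edit to rpms.lock.yaml produces a different directory and forces a fresh prefetch, while an untouched lockfile maps back to the existing repository.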

cc @coderabbitai


    Status

    📋 Backlog
