[Klaud Cold] Update dsv4-fp4-b300-vllm vLLM image to v0.21.0#1456
Conversation
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006196249 |
There was a problem hiding this comment.
LGTM — straightforward vLLM image version bump following the established pattern from PRs #1404/#1394/#1416.
Extended reasoning...
Overview
This PR bumps the dsv4-fp4-b300-vllm recipe from vllm/vllm-openai:v0.20.0-cu130 to vllm/vllm-openai:v0.21.0 in .github/configs/nvidia-master.yaml, and adds the corresponding entry to perf-changelog.yaml. It is a single-line config change plus a 6-line changelog append.
Security risks
None. This is a container image tag update for a benchmarking recipe. No auth, crypto, permissions, or executable code paths are touched. The upgrade target (vllm/vllm-openai:v0.21.0) is the same image already in use by sibling recipes merged in #1404, #1394, and #1416.
Level of scrutiny
Low. This PR follows an established mechanical pattern — recent commits c08dcef, c07bf5d, 958c62d, and 8b89206 are all version bumps of the same shape against the same vLLM image series. The full-sweep-enabled label means CI will exercise the change end-to-end before any production impact.
Other factors
The single inline finding is a nit on the (18d old) parenthetical in the new changelog entry — purely stylistic and doesn't block correctness. The shape of the change matches the historical convention closely, and the full-sweep gate provides a safety net for any runtime incompatibility with the new image.
| - config-keys: | ||
| - dsv4-fp4-b300-vllm | ||
| description: | ||
| - "Update vLLM image from v0.20.0-cu130 (18d old) to v0.21.0" |
There was a problem hiding this comment.
🟡 The new perf-changelog entry contains the parenthetical "(18d old)" describing the prior image tag, which is anchored to the PR creation date and becomes meaningless when read later. All other entries in the file use the plain "Update X image from vA to vB" form — recommend dropping the "(18d old)" so this entry matches established style and remains useful as a permanent record.
Extended reasoning...
What's the issue
At perf-changelog.yaml:2636, the new entry reads:
- config-keys:
- dsv4-fp4-b300-vllm
description:
- "Update vLLM image from v0.20.0-cu130 (18d old) to v0.21.0"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1456The (18d old) parenthetical is an ephemeral relative-time annotation — it was true on the PR creation date (2026-05-17) but degrades immediately afterward. A reader looking at the changelog months from now has no implicit anchor for what "18d old" was measured from.
Why this conflicts with established style
Grepping for relative-time annotations across perf-changelog.yaml turns up exactly one match — the new entry from this PR. Every other entry, including the most recent neighbors, uses the bare form:
- Line 2624:
"Update vLLM image from v0.15.1 to v0.20.2" - Line 2630:
"Update vLLM ROCm image from v0.18.0 to v0.21.0"
None of them include a freshness hint for the prior tag. The perf-changelog is a permanent record of recipe changes; ephemeral commentary anchored to a moving "now" doesn't fit that purpose.
Addressing the refutation
One verifier argued the absolute date can always be recovered from the pr-link field, so the staleness framing is overblown. That's true — a reader can click through to GitHub to recover the date. But that's not really the point: the entry is inconsistent with every other entry in the file, and "18d old" provides no information that the rest of the description (specific version strings) doesn't already convey. The cost to drop it is zero, and the result matches the file's established convention.
Impact
None functionally — this is purely a documentation/style issue. The changelog is human-readable metadata; it doesn't affect config parsing, image resolution, or sweep behavior. Flagging as nit accordingly.
Step-by-step proof
- Open
perf-changelog.yamlat line 2636 — see"Update vLLM image from v0.20.0-cu130 (18d old) to v0.21.0". - Look at the immediately preceding entry at line 2630 —
"Update vLLM ROCm image from v0.18.0 to v0.21.0"(no relative-time hint). - Look at line 2624 —
"Update vLLM image from v0.15.1 to v0.20.2"(no relative-time hint). - Grep the entire file for
[0-9]+d (old|ago)— only the new entry matches. - Imagine reading this changelog in 2027: "18d old" relative to what? Without an explicit anchor, the reader must check the linked PR's creation date — at which point the annotation conveys nothing the version strings don't already say.
Suggested fix
Drop the parenthetical so the entry reads:
- "Update vLLM image from v0.20.0-cu130 to v0.21.0"|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006198426 |
|
/reuse-sweep-run |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26045980093 |
Summary
dsv4-fp4-b300-vllmfromvllm/vllm-openai:v0.20.0-cu130(18d old) tovllm/vllm-openai:v0.21.0.Test plan
full-sweep-enabledlabel.🤖 Generated with Claude Code