Skip to content

[Klaud Cold] Update dsv4-fp4-b300-vllm vLLM image to v0.21.0#1456

Merged
functionstackx merged 3 commits into
mainfrom
update-dsv4-fp4-b300-vllm-v0.21.0
May 18, 2026
Merged

[Klaud Cold] Update dsv4-fp4-b300-vllm vLLM image to v0.21.0#1456
functionstackx merged 3 commits into
mainfrom
update-dsv4-fp4-b300-vllm-v0.21.0

Conversation

@functionstackx
Copy link
Copy Markdown
Collaborator

Summary

  • Bumps dsv4-fp4-b300-vllm from vllm/vllm-openai:v0.20.0-cu130 (18d old) to vllm/vllm-openai:v0.21.0.

Test plan

  • Full sweep passes with full-sweep-enabled label.

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@github-actions
Copy link
Copy Markdown
Contributor

@functionstackx functionstackx changed the title Update dsv4-fp4-b300-vllm vLLM image to v0.21.0 [Klaud Cold] Update dsv4-fp4-b300-vllm vLLM image to v0.21.0 May 17, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — straightforward vLLM image version bump following the established pattern from PRs #1404/#1394/#1416.

Extended reasoning...

Overview

This PR bumps the dsv4-fp4-b300-vllm recipe from vllm/vllm-openai:v0.20.0-cu130 to vllm/vllm-openai:v0.21.0 in .github/configs/nvidia-master.yaml, and adds the corresponding entry to perf-changelog.yaml. It is a single-line config change plus a 6-line changelog append.

Security risks

None. This is a container image tag update for a benchmarking recipe. No auth, crypto, permissions, or executable code paths are touched. The upgrade target (vllm/vllm-openai:v0.21.0) is the same image already in use by sibling recipes merged in #1404, #1394, and #1416.

Level of scrutiny

Low. This PR follows an established mechanical pattern — recent commits c08dcef, c07bf5d, 958c62d, and 8b89206 are all version bumps of the same shape against the same vLLM image series. The full-sweep-enabled label means CI will exercise the change end-to-end before any production impact.

Other factors

The single inline finding is a nit on the (18d old) parenthetical in the new changelog entry — purely stylistic and doesn't block correctness. The shape of the change matches the historical convention closely, and the full-sweep gate provides a safety net for any runtime incompatibility with the new image.

Comment thread perf-changelog.yaml
- config-keys:
- dsv4-fp4-b300-vllm
description:
- "Update vLLM image from v0.20.0-cu130 (18d old) to v0.21.0"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The new perf-changelog entry contains the parenthetical "(18d old)" describing the prior image tag, which is anchored to the PR creation date and becomes meaningless when read later. All other entries in the file use the plain "Update X image from vA to vB" form — recommend dropping the "(18d old)" so this entry matches established style and remains useful as a permanent record.

Extended reasoning...

What's the issue

At perf-changelog.yaml:2636, the new entry reads:

- config-keys:
    - dsv4-fp4-b300-vllm
  description:
    - "Update vLLM image from v0.20.0-cu130 (18d old) to v0.21.0"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1456

The (18d old) parenthetical is an ephemeral relative-time annotation — it was true on the PR creation date (2026-05-17) but degrades immediately afterward. A reader looking at the changelog months from now has no implicit anchor for what "18d old" was measured from.

Why this conflicts with established style

Grepping for relative-time annotations across perf-changelog.yaml turns up exactly one match — the new entry from this PR. Every other entry, including the most recent neighbors, uses the bare form:

  • Line 2624: "Update vLLM image from v0.15.1 to v0.20.2"
  • Line 2630: "Update vLLM ROCm image from v0.18.0 to v0.21.0"

None of them include a freshness hint for the prior tag. The perf-changelog is a permanent record of recipe changes; ephemeral commentary anchored to a moving "now" doesn't fit that purpose.

Addressing the refutation

One verifier argued the absolute date can always be recovered from the pr-link field, so the staleness framing is overblown. That's true — a reader can click through to GitHub to recover the date. But that's not really the point: the entry is inconsistent with every other entry in the file, and "18d old" provides no information that the rest of the description (specific version strings) doesn't already convey. The cost to drop it is zero, and the result matches the file's established convention.

Impact

None functionally — this is purely a documentation/style issue. The changelog is human-readable metadata; it doesn't affect config parsing, image resolution, or sweep behavior. Flagging as nit accordingly.

Step-by-step proof

  1. Open perf-changelog.yaml at line 2636 — see "Update vLLM image from v0.20.0-cu130 (18d old) to v0.21.0".
  2. Look at the immediately preceding entry at line 2630 — "Update vLLM ROCm image from v0.18.0 to v0.21.0" (no relative-time hint).
  3. Look at line 2624 — "Update vLLM image from v0.15.1 to v0.20.2" (no relative-time hint).
  4. Grep the entire file for [0-9]+d (old|ago) — only the new entry matches.
  5. Imagine reading this changelog in 2027: "18d old" relative to what? Without an explicit anchor, the reader must check the linked PR's creation date — at which point the annotation conveys nothing the version strings don't already say.

Suggested fix

Drop the parenthetical so the entry reads:

- "Update vLLM image from v0.20.0-cu130 to v0.21.0"

@github-actions
Copy link
Copy Markdown
Contributor

@functionstackx
Copy link
Copy Markdown
Collaborator Author

/reuse-sweep-run

@functionstackx functionstackx merged commit 05e127b into main May 18, 2026
3 of 5 checks passed
@functionstackx functionstackx deleted the update-dsv4-fp4-b300-vllm-v0.21.0 branch May 18, 2026 16:21
@github-actions
Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

1 participant