Update dsr1-fp8-b300-sglang and -mtp SGLang image to v0.5.12-cu130 #1419
… to v0.5.12-cu130 Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you!
PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and re-running the failed jobs will fix them. If a re-run is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow
As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack.
- config-keys:
- dsr1-fp8-b300-sglang
- dsr1-fp8-b300-sglang-mtp
description:
- "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
pr-link: XXX
🔴 The new perf-changelog.yaml entry has two issues: (1) pr-link: XXX is left as a placeholder instead of https://github.com/SemiAnalysisAI/InferenceX/pull/1419, breaking the file's convention; (2) the description "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130" is only correct for dsr1-fp8-b300-sglang — dsr1-fp8-b300-sglang-mtp was on v0.5.10.post1-cu130 per the nvidia-master.yaml diff. Consider splitting into two entries or noting both source versions (e.g. "from v0.5.11-cu130 and v0.5.10.post1-cu130 respectively").
Extended reasoning...
Issue 1: Placeholder pr-link: XXX
Line 2557 of perf-changelog.yaml sets:
pr-link: XXX
Every other entry in the file uses the fully-qualified GitHub URL form, e.g. lines 2532, 2538, 2544, 2550 all use https://github.com/SemiAnalysisAI/InferenceX/pull/NNN. AGENTS.md line 123 even prescribes the template pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX where XXX is meant to be replaced with the actual PR number (here, 1419). The literal string XXX was clearly not substituted before commit.
Impact: any downstream tooling that follows pr-link as a URL will either error out or produce a garbage link, and human readers can no longer trace this changelog entry back to its source PR.
Fix: replace pr-link: XXX with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419.
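A check along these lines could catch an unsubstituted placeholder before merge. This is a minimal sketch, not existing repository tooling: the function name `find_bad_pr_links` is hypothetical, the URL pattern is inferred from the entries quoted above, and the indentation in the sample is assumed.

```python
import re

# Assumed convention, inferred from the other changelog entries quoted in this review.
PR_LINK_RE = re.compile(r"^https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+$")


def find_bad_pr_links(changelog_text):
    """Return (line_number, value) pairs for pr-link values that are not full PR URLs."""
    bad = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        match = re.match(r"^\s*pr-link:\s*(.*?)\s*$", line)
        if match and not PR_LINK_RE.match(match.group(1)):
            bad.append((lineno, match.group(1)))
    return bad


# Sample entry; the indentation here is an assumption, not copied from the file.
sample = """\
- config-keys:
    - dsr1-fp8-b300-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: XXX
"""

print(find_bad_pr_links(sample))  # → [(5, 'XXX')]
```

Scanning line by line with a regex avoids depending on a YAML parser, at the cost of not understanding the file's structure.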
Issue 2: Wrong source version for dsr1-fp8-b300-sglang-mtp
The new changelog entry groups two configs under a single description:
- config-keys:
- dsr1-fp8-b300-sglang
- dsr1-fp8-b300-sglang-mtp
description:
- "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
But the nvidia-master.yaml diff in this same PR shows two different source versions:
| config-key | old image | new image |
|---|---|---|
| dsr1-fp8-b300-sglang | v0.5.11-cu130 | v0.5.12-cu130 |
| dsr1-fp8-b300-sglang-mtp | v0.5.10.post1-cu130 | v0.5.12-cu130 |
The PR description itself confirms this asymmetry: it explicitly says "dsr1-fp8-b300-sglang (from v0.5.11-cu130) and dsr1-fp8-b300-sglang-mtp (from v0.5.10.post1-cu130)". So the changelog description is only correct for the first config and is misleading for the -mtp variant, which moved from v0.5.10.post1 past v0.5.11 to v0.5.12: two patch releases, not one.
Step-by-step proof
- Open .github/configs/nvidia-master.yaml at the dsr1-fp8-b300-sglang block (around line 1936). Diff: image: lmsysorg/sglang:v0.5.11-cu130 → image: lmsysorg/sglang:v0.5.12-cu130. Source version: v0.5.11-cu130.
- Open the same file at the dsr1-fp8-b300-sglang-mtp block (around line 2568). Diff: image: lmsysorg/sglang:v0.5.10.post1-cu130 → image: lmsysorg/sglang:v0.5.12-cu130. Source version: v0.5.10.post1-cu130, which differs from the first.
- Open perf-changelog.yaml lines 2552–2557. The single description string says "from v0.5.11-cu130 to v0.5.12-cu130" and applies to both config-keys above. For -mtp, this is wrong: the actual prior image was v0.5.10.post1-cu130.
- Same entry, line 2557: pr-link: XXX instead of https://github.com/SemiAnalysisAI/InferenceX/pull/1419.
Suggested fix
Either split into two entries:
- config-keys:
- dsr1-fp8-b300-sglang
description:
- "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419
- config-keys:
- dsr1-fp8-b300-sglang-mtp
description:
- "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419
…or keep a single entry and clarify both source versions in the description, e.g. "Update SGLang image to v0.5.12-cu130 (from v0.5.11-cu130 and v0.5.10.post1-cu130 respectively)", plus the real pr-link URL.
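The split-entry option can be sketched as grouping config keys by their (old, new) image pair, so that every description is accurate for each key it covers. This is an illustration only: `build_changelog_entries` is a hypothetical helper, and the emitted indentation is an assumption about the changelog layout rather than something taken from the file.

```python
from collections import defaultdict


def build_changelog_entries(bumps, pr_link):
    """Emit one changelog entry per distinct (old, new) image tag pair."""
    groups = defaultdict(list)
    for key, (old, new) in bumps.items():
        groups[(old, new)].append(key)

    lines = []
    for (old, new), keys in groups.items():
        lines.append("- config-keys:")
        lines.extend(f"    - {key}" for key in keys)  # indentation is assumed
        lines.append("  description:")
        lines.append(f'    - "Update SGLang image from {old} to {new}"')
        lines.append(f"  pr-link: {pr_link}")
    return "\n".join(lines)


# Old and new tags as shown in the nvidia-master.yaml diff for this PR.
bumps = {
    "dsr1-fp8-b300-sglang": ("v0.5.11-cu130", "v0.5.12-cu130"),
    "dsr1-fp8-b300-sglang-mtp": ("v0.5.10.post1-cu130", "v0.5.12-cu130"),
}

entries = build_changelog_entries(
    bumps, "https://github.com/SemiAnalysisAI/InferenceX/pull/1419"
)
print(entries)
```

Configs that share both source and target tags would still land in a single entry, so the grouping only splits where the description would otherwise be wrong.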
# B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
# to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
🟡 The NOTE comment above dsr1-fp8-b300-sglang-mtp at lines 2568-2569 still says "Image bumped to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs." but this PR bumps the image on the next line to v0.5.12-cu130, leaving the comment stale and self-contradictory. Either update the version to v0.5.12-cu130, or drop the specific version reference (the sister dsr1-fp8-b300-sglang block at lines 1934-1937 avoids naming a specific version).
Extended reasoning...
What the bug is
The block of comments at .github/configs/nvidia-master.yaml:2566-2569 documents why dsr1-fp8-b300-sglang-mtp reuses the B200 recipe and notes the specific image version it was pinned to:
# B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
# to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
dsr1-fp8-b300-sglang-mtp:
image: lmsysorg/sglang:v0.5.12-cu130
The comment explicitly states "Image bumped to v0.5.10.post1-cu130", but the very next line now pins the image to v0.5.12-cu130. The doc claim and the actual value disagree.
The code path that triggers it
This is purely a documentation/comment correctness issue — there is no runtime path. The contradiction is visible to anyone reading the YAML file: the rationale comment names a version that no longer appears anywhere in the file.
Why existing code doesn't prevent it
YAML comments are free text and aren't validated against the keys/values they describe. There's no linter or test that cross-checks comment contents against image tags, so a stale rationale comment slips through silently when an image is bumped.
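A cross-check of this kind could look roughly like the sketch below. It is not an existing linter: `find_stale_version_comments` is a hypothetical name, and the tag pattern is assumed from the image tags that appear in this PR.

```python
import re

# Tag shape assumed from the tags seen in this PR, e.g. v0.5.10.post1-cu130.
VERSION_RE = re.compile(r"v\d+\.\d+\.\d+(?:\.post\d+)?-cu\d+")
IMAGE_RE = re.compile(r"^\s*image:\s*\S+:(\S+)\s*$")


def find_stale_version_comments(yaml_text):
    """Flag comment lines naming an image tag that differs from the next image: value."""
    findings = []
    pending = []  # (line_number, tag) pairs seen in comments since the last image line
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            pending.extend((lineno, tag) for tag in VERSION_RE.findall(stripped))
            continue
        image = IMAGE_RE.match(line)
        if image:
            actual = image.group(1)
            findings.extend((n, tag, actual) for n, tag in pending if tag != actual)
            pending = []
    return findings


sample = """\
# B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
# to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
dsr1-fp8-b300-sglang-mtp:
  image: lmsysorg/sglang:v0.5.12-cu130
"""

print(find_stale_version_comments(sample))
# → [(2, 'v0.5.10.post1-cu130', 'v0.5.12-cu130')]
```

The heuristic is deliberately crude: a comment that legitimately mentions an older tag (for example "bumped from X to Y") would also be flagged, so the output is better treated as a prompt for review than as a hard failure.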
Step-by-step proof
- Open .github/configs/nvidia-master.yaml.
- Jump to line 2568. The comment reads: # to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
- Read the next non-comment line (2571): image: lmsysorg/sglang:v0.5.12-cu130.
- Grep the file for v0.5.10.post1-cu130: it appears only inside the now-stale comment; no image actually uses that version anymore.
- Compare with the sister block at lines 1934-1937 for dsr1-fp8-b300-sglang, whose comment avoids naming a specific version and therefore stays correct across image bumps.
Impact
No runtime impact. The harm is to readers/maintainers: anyone trying to understand why this MTP config uses a particular image version is led to a version that no longer exists in the file, undermining trust in the surrounding comments and making future bumps more error-prone.
How to fix
Either:
- Update lines 2568-2569 to reference v0.5.12-cu130, or
- Drop the version-specific phrasing (e.g., "Image bumped to match the standard B300 SGLang image used by other B300 configs.") so the comment doesn't need to be re-edited on every version bump, matching the style already used by dsr1-fp8-b300-sglang.
The second option is preferable because it prevents the same drift from recurring on the next bump.
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980015344
/reuse-sweep-run
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25982609731
Updates SGLang image for dsr1-fp8-b300-sglang (from v0.5.11-cu130) and dsr1-fp8-b300-sglang-mtp (from v0.5.10.post1-cu130) to v0.5.12-cu130.
Ref #1154
Generated with Claude Code