Update glm5-fp4-b200-sglang and -mtp SGLang image to v0.5.12-cu130 #1417
… to v0.5.12-cu130 Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and simply re-running the failed jobs will fix them. If re-running failed jobs is attempted, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack.
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX
🔴 The new perf-changelog.yaml entry at line 2557 has pr-link: XXX as an unfilled template placeholder. All neighboring entries use real GitHub PR URLs (e.g. lines 2532, 2538, 2544, 2550), so this should be replaced with https://github.com/SemiAnalysisAI/InferenceX/pull/1417 before merge.
Extended reasoning...
What the bug is
The new changelog entry added by this PR ends with:
- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX

The pr-link: XXX is clearly an unfilled template placeholder, not a real URL.
Why this is wrong
Every other entry in perf-changelog.yaml uses a real GitHub PR URL. For example, the four entries immediately preceding this one (lines 2532, 2538, 2544, 2550) all follow the pattern pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/NNNN. The literal string XXX is not a valid URL and does not point anywhere.
Impact
The pr-link field is metadata consumed by downstream tooling and humans browsing the changelog for traceability — given a perf regression or improvement, readers click the link to find the originating PR. Leaving XXX here breaks that traceability for the glm5-fp4-b200-sglang and glm5-fp4-b200-sglang-mtp SGLang v0.5.12 upgrade and would render as broken text in any tool that auto-linkifies these fields.
How to fix
Replace XXX with the URL of this PR:
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417

Step-by-step proof
- Open perf-changelog.yaml and jump to line 2557: the line reads literally pr-link: XXX.
- Compare with the preceding entry at line 2550: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402, a valid GitHub URL.
- Confirm the same pattern at lines 2532, 2538, 2544: all use real PR URLs.
- The current PR is #1417, "Update glm5-fp4-b200-sglang and -mtp SGLang image to v0.5.12-cu130" (per PR metadata), so the correct value is https://github.com/SemiAnalysisAI/InferenceX/pull/1417.
Note: a separate pr-link: XXX exists earlier in the file (around line 2502) from another PR's entry, but that pre-existing issue does not justify perpetuating the same mistake in this PR's new entry.
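A check like the one described above could be automated before merge. The following is only a minimal sketch, not part of the repository's tooling: the function name find_bad_pr_links and the URL pattern are hypothetical, and it scans the file text line by line rather than parsing YAML, so it assumes each pr-link value is unquoted and sits on a single line.

```python
import re

# Assumption: a valid pr-link looks like a GitHub pull-request URL.
PR_URL = re.compile(r"^https://github\.com/[^/\s]+/[^/\s]+/pull/\d+$")
# Matches a "pr-link:" field and captures its (unquoted) value.
PR_LINK_LINE = re.compile(r"^\s*pr-link:\s*(.*?)\s*$")


def find_bad_pr_links(text):
    """Return (line_number, value) pairs for pr-link fields that are not PR URLs."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = PR_LINK_LINE.match(line)
        if match and not PR_URL.match(match.group(1)):
            bad.append((lineno, match.group(1)))
    return bad


# Sample text modeled on the entries quoted in this review.
sample = """\
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX
- config-keys:
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417
"""

print(find_bad_pr_links(sample))  # [(5, 'XXX')]
```

Run against the whole file, a scan of this kind would also surface the pre-existing placeholder mentioned in the note above.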
- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX
🟡 The changelog entry at perf-changelog.yaml:2552-2557 lists both glm5-fp4-b200-sglang and glm5-fp4-b200-sglang-mtp under a single description 'Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130', but the -mtp variant was actually at v0.5.11-cu130 (not v0.5.10.post1-cu130) per the nvidia-master.yaml diff and the PR description itself. Either split the entry into two, or rewrite the description to call out both starting versions so the changelog accurately records history for the -mtp config-key.
Extended reasoning...
What the bug is
The new changelog entry added at perf-changelog.yaml lines 2552-2557 groups two config-keys under a single description:
- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX

The description claims both config-keys are moving from v0.5.10.post1-cu130 to v0.5.12-cu130. This is correct for the base glm5-fp4-b200-sglang variant but factually wrong for the -mtp variant.
Step-by-step proof
- In .github/configs/nvidia-master.yaml line 2210, the diff shows: glm5-fp4-b200-sglang image changes from lmsysorg/sglang:v0.5.10.post1-cu130 to lmsysorg/sglang:v0.5.12-cu130. So 'from v0.5.10.post1-cu130' is correct for this key.
- In .github/configs/nvidia-master.yaml line 2231, the diff shows: glm5-fp4-b200-sglang-mtp image changes from lmsysorg/sglang:v0.5.11-cu130 to lmsysorg/sglang:v0.5.12-cu130. So the previous version for the -mtp key was v0.5.11-cu130, not v0.5.10.post1-cu130.
- The PR description itself acknowledges these distinct source versions: 'glm5-fp4-b200-sglang (from v0.5.10.post1-cu130) and glm5-fp4-b200-sglang-mtp (from v0.5.11-cu130) to v0.5.12-cu130.'
- Therefore the changelog description, as written, mis-records the source version for glm5-fp4-b200-sglang-mtp.
Impact
perf-changelog.yaml appears to be an authoritative log of image-version transitions for each config-key. Anyone tracing the version history of glm5-fp4-b200-sglang-mtp via this changelog would conclude it jumped directly from v0.5.10.post1-cu130 to v0.5.12-cu130 and miss that it was already on v0.5.11-cu130. There is no runtime impact — this is documentation accuracy only — but the changelog is the kind of file that exists specifically to be correct.
How to fix
Either split the entry in two:
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417
- config-keys:
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417

…or keep one entry and reword the description to list both starting versions (e.g. 'Update SGLang image to v0.5.12-cu130 (from v0.5.10.post1-cu130 for glm5-fp4-b200-sglang, from v0.5.11-cu130 for glm5-fp4-b200-sglang-mtp)'). Splitting is cleaner since the existing schema is one description per transition.
Also note pr-link: XXX is a placeholder that should be filled in with the actual PR URL before merge.
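The mismatch between a shared description and per-key image transitions can be illustrated with a small check. This is a hedged sketch under stated assumptions: the helper mismatched_keys and the "from X to Y" phrasing it looks for are hypothetical conventions, and the transitions dictionary is filled in by hand from the version tags quoted in this review rather than read from nvidia-master.yaml.

```python
import re

# Assumption: descriptions state the transition as "from <old> to <new>".
FROM_TO = re.compile(r"from (\S+) to (\S+)")


def mismatched_keys(description, transitions):
    """Return config-keys whose (old_tag, new_tag) disagrees with the description.

    transitions maps config-key -> (old_tag, new_tag).
    If the description has no "from X to Y" phrase, every key is reported.
    """
    match = FROM_TO.search(description)
    if match is None:
        return sorted(transitions)
    claimed = (match.group(1), match.group(2))
    return sorted(key for key, tags in transitions.items() if tags != claimed)


description = "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
# Old and new tags as reported in the diff discussed above.
transitions = {
    "glm5-fp4-b200-sglang": ("v0.5.10.post1-cu130", "v0.5.12-cu130"),
    "glm5-fp4-b200-sglang-mtp": ("v0.5.11-cu130", "v0.5.12-cu130"),
}

print(mismatched_keys(description, transitions))  # ['glm5-fp4-b200-sglang-mtp']
```

With the entry split in two as suggested, each description would be checked only against its own key and the list would come back empty.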
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980013038

/reuse-sweep-run

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25983687346
Updates SGLang image for glm5-fp4-b200-sglang (from v0.5.10.post1-cu130) and glm5-fp4-b200-sglang-mtp (from v0.5.11-cu130) to v0.5.12-cu130.

Ref #1154
Generated with Claude Code