
Update glm5-fp4-b200-sglang and -mtp SGLang image to v0.5.12-cu130 #1417

Merged
functionstackx merged 2 commits into main from claude/issue-1154-glm5-fp4-b200-sglang on May 17, 2026


Conversation

@Klaud-Cold
Collaborator

Updates SGLang image for glm5-fp4-b200-sglang (from v0.5.10.post1-cu130) and glm5-fp4-b200-sglang-mtp (from v0.5.11-cu130) to v0.5.12-cu130.
Ref #1154

Generated with Claude Code

… to v0.5.12-cu130

Ref #1154

Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single-node PR into the master branch. Let's ensure that the documentation is first-class so that the entire ML community can benefit from your hard work. Thank you!

PR authors are responsible for ensuring that all GitHub Action jobs fully pass after merging. Failures are often just flakes, and simply re-running the failed jobs will fix them. PR authors who re-run failed jobs are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml Outdated
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX
Contributor


🔴 The new perf-changelog.yaml entry at line 2557 has pr-link: XXX as an unfilled template placeholder. All neighboring entries use real GitHub PR URLs (e.g. lines 2532, 2538, 2544, 2550), so this should be replaced with https://github.com/SemiAnalysisAI/InferenceX/pull/1417 before merge.

Extended reasoning...

What the bug is

The new changelog entry added by this PR ends with:

- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX

The pr-link: XXX is clearly an unfilled template placeholder, not a real URL.

Why this is wrong

Every other entry in perf-changelog.yaml uses a real GitHub PR URL. For example, the four entries immediately preceding this one (lines 2532, 2538, 2544, 2550) all follow the pattern pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/NNNN. The literal string XXX is not a valid URL and does not point anywhere.

Impact

The pr-link field is metadata consumed by downstream tooling and humans browsing the changelog for traceability — given a perf regression or improvement, readers click the link to find the originating PR. Leaving XXX here breaks that traceability for the glm5-fp4-b200-sglang and glm5-fp4-b200-sglang-mtp SGLang v0.5.12 upgrade and would render as broken text in any tool that auto-linkifies these fields.

How to fix

Replace XXX with the URL of this PR:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417

Step-by-step proof

  1. Open perf-changelog.yaml and jump to line 2557 — the line reads literally pr-link: XXX.
  2. Compare with the preceding entry at line 2550: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402 — a valid GitHub URL.
  3. Confirm the same pattern at lines 2532, 2538, 2544 — all use real PR URLs.
  4. The current PR is Update glm5-fp4-b200-sglang and -mtp SGLang image to v0.5.12-cu130 #1417 (per PR metadata), so the correct value is https://github.com/SemiAnalysisAI/InferenceX/pull/1417.

Note: a separate pr-link: XXX exists earlier in the file (around line 2502) from another PR's entry, but that pre-existing issue does not justify perpetuating the same mistake in this PR's new entry.
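A check along these lines could catch placeholder links before merge. This is a minimal sketch, not part of the repository's tooling: the helper name `find_placeholder_links` is hypothetical, and it assumes each `pr-link` value is an unquoted scalar on a single line, as in the entries quoted above.

```python
import re

# Matches a "pr-link:" line and captures its value (assumed unquoted, single-line).
PR_LINK = re.compile(r"^\s*pr-link:\s*(?P<value>.*?)\s*$")
# Shape of a GitHub pull-request URL, e.g. https://github.com/<org>/<repo>/pull/<number>.
PR_URL = re.compile(r"^https://github\.com/[^/\s]+/[^/\s]+/pull/\d+$")


def find_placeholder_links(text):
    """Return (line_number, value) pairs for pr-link values that are not PR URLs."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = PR_LINK.match(line)
        if match and not PR_URL.match(match.group("value")):
            problems.append((number, match.group("value")))
    return problems


# The entry quoted in this review comment, used as sample input.
sample = """\
- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX
"""

print(find_placeholder_links(sample))  # [(6, 'XXX')]
```

Run against the full perf-changelog.yaml, the same function would presumably also report the earlier placeholder mentioned in the note above.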

Comment thread perf-changelog.yaml Outdated
Comment on lines +2552 to +2557
- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX
Contributor


🟡 The changelog entry at perf-changelog.yaml:2552-2557 lists both glm5-fp4-b200-sglang and glm5-fp4-b200-sglang-mtp under a single description 'Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130', but the -mtp variant was actually at v0.5.11-cu130 (not v0.5.10.post1-cu130) per the nvidia-master.yaml diff and the PR description itself. Either split the entry into two, or rewrite the description to call out both starting versions so the changelog accurately records history for the -mtp config-key.

Extended reasoning...

What the bug is

The new changelog entry added at perf-changelog.yaml lines 2552-2557 groups two config-keys under a single description:

- config-keys:
    - glm5-fp4-b200-sglang
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: XXX

The description claims both config-keys are moving from v0.5.10.post1-cu130 to v0.5.12-cu130. This is correct for the base glm5-fp4-b200-sglang variant but factually wrong for the -mtp variant.

Step-by-step proof

  1. In .github/configs/nvidia-master.yaml line 2210, the diff shows: glm5-fp4-b200-sglang image changes from lmsysorg/sglang:v0.5.10.post1-cu130 to lmsysorg/sglang:v0.5.12-cu130. So 'from v0.5.10.post1-cu130' is correct for this key.
  2. In .github/configs/nvidia-master.yaml line 2231, the diff shows: glm5-fp4-b200-sglang-mtp image changes from lmsysorg/sglang:v0.5.11-cu130 to lmsysorg/sglang:v0.5.12-cu130. So the previous version for the -mtp key was v0.5.11-cu130, not v0.5.10.post1-cu130.
  3. The PR description itself acknowledges these distinct source versions: 'glm5-fp4-b200-sglang (from v0.5.10.post1-cu130) and glm5-fp4-b200-sglang-mtp (from v0.5.11-cu130) to v0.5.12-cu130.'
  4. Therefore the changelog description, as written, mis-records the source version for glm5-fp4-b200-sglang-mtp.

Impact

perf-changelog.yaml appears to be an authoritative log of image-version transitions for each config-key. Anyone tracing the version history of glm5-fp4-b200-sglang-mtp via this changelog would conclude it jumped directly from v0.5.10.post1-cu130 to v0.5.12-cu130 and miss that it was already on v0.5.11-cu130. There is no runtime impact — this is documentation accuracy only — but the changelog is the kind of file that exists specifically to be correct.

How to fix

Either split the entry in two:

- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417

- config-keys:
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1417

…or keep one entry and reword the description to list both starting versions (e.g. 'Update SGLang image to v0.5.12-cu130 (from v0.5.10.post1-cu130 for glm5-fp4-b200-sglang, from v0.5.11-cu130 for glm5-fp4-b200-sglang-mtp)'). Splitting is cleaner since the existing schema is one description per transition.

Also note pr-link: XXX is a placeholder that should be filled in with the actual PR URL before merge.
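The mismatch described in this comment can be illustrated with a small consistency check. This is a sketch under stated assumptions: the function `mismatched_keys` and the `previous_tags` mapping are hypothetical, the entry is written as an already-parsed dictionary, and the previous tags are copied from the nvidia-master.yaml diff cited in the proof above.

```python
import re

# Extracts the "from X to Y" version pair out of a changelog description.
FROM_TO = re.compile(r"from (?P<old>\S+) to (?P<new>\S+)")


def mismatched_keys(entry, previous_tags):
    """Return config-keys whose previous image tag differs from the
    'from' version stated in the entry's description."""
    mismatches = []
    for description in entry["description"]:
        match = FROM_TO.search(description)
        if not match:
            continue  # description does not state a transition
        for key in entry["config-keys"]:
            if previous_tags.get(key) != match.group("old"):
                mismatches.append(key)
    return mismatches


# The changelog entry as added by this PR (parsed form, assumed).
entry = {
    "config-keys": ["glm5-fp4-b200-sglang", "glm5-fp4-b200-sglang-mtp"],
    "description": ["Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"],
    "pr-link": "XXX",
}

# Image tags before this PR, per the diff quoted in the proof above.
previous_tags = {
    "glm5-fp4-b200-sglang": "v0.5.10.post1-cu130",
    "glm5-fp4-b200-sglang-mtp": "v0.5.11-cu130",
}

print(mismatched_keys(entry, previous_tags))  # ['glm5-fp4-b200-sglang-mtp']
```

With the entry split in two as suggested, each entry would carry the correct starting version for its single key and the check would return an empty list for both.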

@github-actions
Contributor

@functionstackx
Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit 28ba1fa into main May 17, 2026
12 of 18 checks passed
@functionstackx functionstackx deleted the claude/issue-1154-glm5-fp4-b200-sglang branch May 17, 2026 06:38
@github-actions
Contributor
