Update dsr1-fp8-b300-sglang and -mtp SGLang image to v0.5.12-cu130 by Klaud-Cold · Pull Request #1419 · SemiAnalysisAI/InferenceX

Klaud-Cold · 2026-05-17T03:16:09Z

Updates SGLang image for dsr1-fp8-b300-sglang (from v0.5.11-cu130) and dsr1-fp8-b300-sglang-mtp (from v0.5.10.post1-cu130) to v0.5.12-cu130.
Ref #1154

Generated with Claude Code


github-actions · 2026-05-17T03:16:16Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-05-17T03:19:59Z

+- config-keys:
+    - dsr1-fp8-b300-sglang
+    - dsr1-fp8-b300-sglang-mtp
+  description:
+    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
+  pr-link: XXX


🔴 The new perf-changelog.yaml entry has two issues: (1) pr-link: XXX is left as a placeholder instead of https://github.com/SemiAnalysisAI/InferenceX/pull/1419, breaking the file's convention; (2) the description "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130" is only correct for dsr1-fp8-b300-sglang — dsr1-fp8-b300-sglang-mtp was on v0.5.10.post1-cu130 per the nvidia-master.yaml diff. Consider splitting into two entries or noting both source versions (e.g. "from v0.5.11-cu130 and v0.5.10.post1-cu130 respectively").

Extended reasoning...

Issue 1: Placeholder pr-link: XXX

Line 2557 of perf-changelog.yaml sets:

pr-link: XXX

Every other entry in the file uses the fully-qualified GitHub URL form, e.g. lines 2532, 2538, 2544, 2550 all use https://github.com/SemiAnalysisAI/InferenceX/pull/NNN. AGENTS.md line 123 even prescribes the template pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX where XXX is meant to be replaced with the actual PR number — here, 1419. The literal string XXX was clearly not substituted before commit.

Impact: any downstream tooling that follows pr-link as a URL will either error out or produce a garbage link, and human readers can no longer trace this changelog entry back to its source PR.

Fix: replace pr-link: XXX with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419.
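A placeholder like this could be caught before merge with a small check. The following is only a minimal sketch: the function name `find_bad_pr_links` and the inline sample are hypothetical, and the URL pattern is assumed from the entries quoted above rather than taken from any existing validation in the repository.

```python
import re

# Assumed URL shape, based on the pr-link values quoted in this review.
PR_LINK_RE = re.compile(
    r"^https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+$"
)


def find_bad_pr_links(changelog_text):
    """Return (line_number, value) pairs for pr-link values that are not PR URLs."""
    bad = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("pr-link:"):
            continue
        value = stripped[len("pr-link:"):].strip()
        if not PR_LINK_RE.match(value):
            bad.append((lineno, value))
    return bad


# Hypothetical two-entry sample: the first entry still carries the placeholder.
sample = """\
- config-keys:
    - dsr1-fp8-b300-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: XXX
- config-keys:
    - dsr1-fp8-b300-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419
"""

print(find_bad_pr_links(sample))  # [(5, 'XXX')]
```

The check is line-based on purpose, so it needs no YAML parser; it would miss a pr-link value that is quoted or wrapped across lines.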

Issue 2: Wrong source version for dsr1-fp8-b300-sglang-mtp

The new changelog entry groups two configs under a single description:

- config-keys:
    - dsr1-fp8-b300-sglang
    - dsr1-fp8-b300-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

But the nvidia-master.yaml diff in this same PR shows two different source versions:

| config-key | old image | new image |
| --- | --- | --- |
| dsr1-fp8-b300-sglang | v0.5.11-cu130 | v0.5.12-cu130 |
| dsr1-fp8-b300-sglang-mtp | v0.5.10.post1-cu130 | v0.5.12-cu130 |

The PR description itself confirms this asymmetry: it explicitly says "dsr1-fp8-b300-sglang (from v0.5.11-cu130) and dsr1-fp8-b300-sglang-mtp (from v0.5.10.post1-cu130)". So the changelog description is only correct for the first config and is misleading for the -mtp variant, which jumped two patch releases (0.5.10.post1 to 0.5.12), not one.

Step-by-step proof

1. Open .github/configs/nvidia-master.yaml at the dsr1-fp8-b300-sglang block (around line 1936). Diff: image: lmsysorg/sglang:v0.5.11-cu130 → image: lmsysorg/sglang:v0.5.12-cu130. Source version: v0.5.11-cu130.

2. Open the same file at the dsr1-fp8-b300-sglang-mtp block (around line 2568). Diff: image: lmsysorg/sglang:v0.5.10.post1-cu130 → image: lmsysorg/sglang:v0.5.12-cu130. Source version: v0.5.10.post1-cu130, which differs from the first.

3. Open perf-changelog.yaml lines 2552–2557. The single description string says "from v0.5.11-cu130 to v0.5.12-cu130" and applies to both config-keys above. For -mtp, this is wrong: the actual prior image was v0.5.10.post1-cu130.

4. Same entry, line 2557: pr-link: XXX instead of https://github.com/SemiAnalysisAI/InferenceX/pull/1419.
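The comparison in the steps above can also be expressed as a small script. This is a rough sketch, not tooling that exists in the repository: `configs_missing_from_description` is a hypothetical helper, the `old_images` mapping is assumed to be collected by hand from the nvidia-master.yaml diff, and the tag pattern is inferred from the three tags that appear in this PR.

```python
import re

# Matches tags shaped like v0.5.11-cu130 or v0.5.10.post1-cu130 (pattern assumed
# from the tags seen in this PR).
TAG_RE = re.compile(r"v\d+\.\d+\.\d+(?:\.post\d+)?-cu\d+")


def configs_missing_from_description(description, old_images):
    """Return config keys whose previous image tag is not named in the description.

    old_images maps config key -> previous image reference (hypothetical input).
    """
    mentioned = set(TAG_RE.findall(description))
    missing = []
    for key, image in old_images.items():
        old_tag = image.rsplit(":", 1)[-1]  # keep only the tag after the last colon
        if old_tag not in mentioned:
            missing.append(key)
    return missing


old_images = {
    "dsr1-fp8-b300-sglang": "lmsysorg/sglang:v0.5.11-cu130",
    "dsr1-fp8-b300-sglang-mtp": "lmsysorg/sglang:v0.5.10.post1-cu130",
}
description = "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

print(configs_missing_from_description(description, old_images))
# ['dsr1-fp8-b300-sglang-mtp']
```

With a description that names both source versions, the same call returns an empty list.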

Suggested fix

Either split into two entries:

- config-keys:
    - dsr1-fp8-b300-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419
- config-keys:
    - dsr1-fp8-b300-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419

…or keep a single entry and clarify both source versions in the description, e.g. "Update SGLang image to v0.5.12-cu130 (from v0.5.11-cu130 and v0.5.10.post1-cu130 respectively)", plus the real pr-link URL.

claude · 2026-05-17T03:19:59Z

  # B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
  # to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.


🟡 The NOTE comment above dsr1-fp8-b300-sglang-mtp at lines 2568-2569 still says "Image bumped to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs." but this PR bumps the image on the next line to v0.5.12-cu130, leaving the comment stale and self-contradictory. Either update the version to v0.5.12-cu130, or drop the specific version reference (the sister dsr1-fp8-b300-sglang block at lines 1934-1937 avoids naming a specific version).

Extended reasoning...

What the bug is

The block of comments at .github/configs/nvidia-master.yaml:2566-2569 documents why dsr1-fp8-b300-sglang-mtp reuses the B200 recipe and notes the specific image version it was pinned to:

# B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
# to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
dsr1-fp8-b300-sglang-mtp:
  image: lmsysorg/sglang:v0.5.12-cu130

The comment explicitly states "Image bumped to v0.5.10.post1-cu130", but the very next line now pins the image to v0.5.12-cu130. The doc claim and the actual value disagree.

The code path that triggers it

This is purely a documentation/comment correctness issue — there is no runtime path. The contradiction is visible to anyone reading the YAML file: the rationale comment names a version that no longer appears anywhere in the file.

Why existing code doesn't prevent it

YAML comments are free text and aren't validated against the keys/values they describe. There's no linter or test that cross-checks comment contents against image tags, so a stale rationale comment slips through silently when an image is bumped.
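One way such a cross-check might look is sketched below. It is an illustration under assumptions, not an existing linter: `find_stale_comment_tags` is a hypothetical name, the tag pattern is inferred from the tags in this PR, and the heuristic simply compares version tags found in `#` comment lines with the next `image:` value.

```python
import re

# Tag shape assumed from the tags seen in this PR, e.g. v0.5.10.post1-cu130.
TAG_RE = re.compile(r"v\d+\.\d+\.\d+(?:\.post\d+)?-cu\d+")


def find_stale_comment_tags(yaml_text):
    """Return (comment_line_number, comment_tag, image_tag) for each mismatch.

    Tags seen in '#' comment lines are held until the next 'image:' line,
    then compared with the tag that line actually pins.
    """
    stale = []
    pending = []  # (line number, tag) pairs from comments since the last image line
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            pending.extend((lineno, tag) for tag in TAG_RE.findall(stripped))
        elif stripped.startswith("image:"):
            image_tag = stripped.rsplit(":", 1)[-1].strip()
            stale.extend((n, tag, image_tag) for n, tag in pending if tag != image_tag)
            pending = []
    return stale


sample = """\
# B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
# to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
dsr1-fp8-b300-sglang-mtp:
  image: lmsysorg/sglang:v0.5.12-cu130
"""

print(find_stale_comment_tags(sample))
# [(2, 'v0.5.10.post1-cu130', 'v0.5.12-cu130')]
```

A heuristic like this would also flag comments that mention an older version deliberately (for example, as history), so it is better suited to a warning than a hard failure.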

Step-by-step proof

1. Open .github/configs/nvidia-master.yaml.

2. Jump to line 2568. The comment reads: # to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.

3. Read the next non-comment line (2571): image: lmsysorg/sglang:v0.5.12-cu130.

4. Grep the file for v0.5.10.post1-cu130. It appears only inside the now-stale comment; no image actually uses that version anymore.

5. Compare with the sister block at lines 1934-1937 for dsr1-fp8-b300-sglang, whose comment avoids naming a specific version and therefore stays correct across image bumps.

Impact

No runtime impact. The harm is to readers/maintainers: anyone trying to understand why this MTP config uses a particular image version is led to a version that no longer exists in the file, undermining trust in the surrounding comments and making future bumps more error-prone.

How to fix

Either:

- Update lines 2568-2569 to reference v0.5.12-cu130, or

- Drop the version-specific phrasing (e.g., "Image bumped to match the standard B300 SGLang image used by other B300 configs.") so the comment doesn't need to be re-edited on every version bump, matching the style already used by dsr1-fp8-b300-sglang.

The second option is preferable because it prevents the same drift from recurring on the next bump.

github-actions · 2026-05-17T05:02:14Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980015344
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25980015344

functionstackx · 2026-05-17T05:40:31Z

/reuse-sweep-run

github-actions · 2026-05-17T05:41:07Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25982609731
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25982609731

Update dsr1-fp8-b300-sglang and dsr1-fp8-b300-sglang-mtp SGLang image…

0c52441

… to v0.5.12-cu130 Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

Klaud-Cold requested a review from a team May 17, 2026 03:16

Klaud-Cold added the full-sweep-enabled label May 17, 2026

Klaud-Cold requested review from jgangani and kedarpotdar-nv as code owners May 17, 2026 03:16

github-project-automation Bot added this to InferenceMAX Board May 17, 2026

Klaud-Cold mentioned this pull request May 17, 2026

[Auto] Docker Image Updates Available - 2026-04-25 #1154

Open

claude Bot reviewed May 17, 2026


Merge branch 'main' into claude/issue-1154-dsr1-fp8-b300-sglang

0970086

functionstackx merged commit 506258a into main May 17, 2026
8 of 10 checks passed

functionstackx deleted the claude/issue-1154-dsr1-fp8-b300-sglang branch May 17, 2026 05:40

github-project-automation Bot moved this to Done in InferenceMAX Board May 17, 2026

functionstackx merged 2 commits into main from claude/issue-1154-dsr1-fp8-b300-sglang

2 participants
