Update dsr1-fp8-b300-sglang and -mtp SGLang image to v0.5.12-cu130 #1419

Merged
functionstackx merged 2 commits into main from claude/issue-1154-dsr1-fp8-b300-sglang
May 17, 2026

Conversation

@Klaud-Cold
Collaborator

Updates SGLang image for dsr1-fp8-b300-sglang (from v0.5.11-cu130) and dsr1-fp8-b300-sglang-mtp (from v0.5.10.post1-cu130) to v0.5.12-cu130.
Ref #1154

Generated with Claude Code

… to v0.5.12-cu130

Ref #1154

Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml Outdated
Comment on lines +2552 to +2557
- config-keys:
- dsr1-fp8-b300-sglang
- dsr1-fp8-b300-sglang-mtp
description:
- "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
pr-link: XXX
Contributor

🔴 The new perf-changelog.yaml entry has two issues: (1) pr-link: XXX is left as a placeholder instead of https://github.com/SemiAnalysisAI/InferenceX/pull/1419, breaking the file's convention; (2) the description "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130" is only correct for dsr1-fp8-b300-sglang; dsr1-fp8-b300-sglang-mtp was on v0.5.10.post1-cu130 per the nvidia-master.yaml diff. Consider splitting into two entries or noting both source versions (e.g. "from v0.5.11-cu130 and v0.5.10.post1-cu130 respectively").

Extended reasoning...

Issue 1: Placeholder pr-link: XXX

Line 2557 of perf-changelog.yaml sets:

pr-link: XXX

Every other entry in the file uses the fully-qualified GitHub URL form, e.g. lines 2532, 2538, 2544, 2550 all use https://github.com/SemiAnalysisAI/InferenceX/pull/NNN. AGENTS.md line 123 even prescribes the template pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX where XXX is meant to be replaced with the actual PR number — here, 1419. The literal string XXX was clearly not substituted before commit.

Impact: any downstream tooling that follows pr-link as a URL will either error out or produce a garbage link, and human readers can no longer trace this changelog entry back to its source PR.

Fix: replace pr-link: XXX with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419.
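A placeholder like this could be caught before merge with a small check. The following is a minimal sketch, not part of the repository: the function name `find_bad_pr_links` is hypothetical, and the URL pattern is an assumption based on the convention described above. It scans lines with the standard library rather than using a YAML parser.

```python
import re

# Assumed convention, taken from the review text above: every pr-link is a
# full InferenceX pull-request URL ending in a numeric PR id.
PR_LINK_RE = re.compile(
    r"^https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+$"
)


def find_bad_pr_links(changelog_text):
    """Return (line_number, value) pairs for pr-link values that are not full PR URLs."""
    bad = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("pr-link:"):
            continue
        # Take everything after the key and drop optional surrounding quotes.
        value = stripped[len("pr-link:"):].strip().strip("\"'")
        if not PR_LINK_RE.match(value):
            bad.append((lineno, value))
    return bad


sample = """\
- config-keys:
    - dsr1-fp8-b300-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: XXX
- config-keys:
    - dsr1-fp8-b300-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419
"""

print(find_bad_pr_links(sample))  # [(5, 'XXX')]
```

Running such a check in CI would turn a forgotten substitution into a failing job instead of a review comment.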

Issue 2: Wrong source version for dsr1-fp8-b300-sglang-mtp

The new changelog entry groups two configs under a single description:

- config-keys:
    - dsr1-fp8-b300-sglang
    - dsr1-fp8-b300-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

But the nvidia-master.yaml diff in this same PR shows two different source versions:

| config-key | old image | new image |
| --- | --- | --- |
| dsr1-fp8-b300-sglang | v0.5.11-cu130 | v0.5.12-cu130 |
| dsr1-fp8-b300-sglang-mtp | v0.5.10.post1-cu130 | v0.5.12-cu130 |

The PR description itself confirms this asymmetry: it explicitly says "dsr1-fp8-b300-sglang (from v0.5.11-cu130) and dsr1-fp8-b300-sglang-mtp (from v0.5.10.post1-cu130)". So the changelog description is only correct for the first config and is misleading for the -mtp variant, which jumped two patch releases (0.5.10 to 0.5.12), not one.

Step-by-step proof

  1. Open .github/configs/nvidia-master.yaml at the dsr1-fp8-b300-sglang block (around line 1936). Diff: image: lmsysorg/sglang:v0.5.11-cu130 → image: lmsysorg/sglang:v0.5.12-cu130. Source version: v0.5.11-cu130.
  2. Open the same file at the dsr1-fp8-b300-sglang-mtp block (around line 2568). Diff: image: lmsysorg/sglang:v0.5.10.post1-cu130 → image: lmsysorg/sglang:v0.5.12-cu130. Source version: v0.5.10.post1-cu130, which differs from the first.
  3. Open perf-changelog.yaml lines 2552–2557. The single description string says "from v0.5.11-cu130 to v0.5.12-cu130" and applies to both config-keys above. For -mtp, this is wrong — the actual prior image was v0.5.10.post1-cu130.
  4. Same entry, line 2557: pr-link: XXX instead of https://github.com/SemiAnalysisAI/InferenceX/pull/1419.

Suggested fix

Either split into two entries:

- config-keys:
    - dsr1-fp8-b300-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419

- config-keys:
    - dsr1-fp8-b300-sglang-mtp
  description:
    - "Update SGLang image from v0.5.10.post1-cu130 to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1419

…or keep a single entry and clarify both source versions in the description, e.g. "Update SGLang image to v0.5.12-cu130 (from v0.5.11-cu130 and v0.5.10.post1-cu130 respectively)", plus the real pr-link URL.
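The split-entry option can also be produced mechanically, so that each description always matches its config's real source version. This is a rough sketch under stated assumptions: `build_changelog_entries` and `render_yaml` are hypothetical helpers, not repository code, and the renderer only handles this one entry shape.

```python
def build_changelog_entries(bumps, new_tag, pr_link):
    """Group config keys by previous image tag so every description is accurate."""
    by_old = {}
    for key, old_tag in bumps.items():
        by_old.setdefault(old_tag, []).append(key)
    return [
        {
            "config-keys": keys,
            "description": [f"Update SGLang image from {old_tag} to {new_tag}"],
            "pr-link": pr_link,
        }
        for old_tag, keys in by_old.items()
    ]


def render_yaml(entries):
    """Hand-rolled renderer for this one entry shape; not a general YAML emitter."""
    lines = []
    for entry in entries:
        lines.append("- config-keys:")
        lines.extend(f"    - {key}" for key in entry["config-keys"])
        lines.append("  description:")
        lines.extend(f'    - "{text}"' for text in entry["description"])
        lines.append(f"  pr-link: {entry['pr-link']}")
        lines.append("")
    return "\n".join(lines)


# Old tags as listed in the nvidia-master.yaml diff for this PR.
bumps = {
    "dsr1-fp8-b300-sglang": "v0.5.11-cu130",
    "dsr1-fp8-b300-sglang-mtp": "v0.5.10.post1-cu130",
}
entries = build_changelog_entries(
    bumps,
    new_tag="v0.5.12-cu130",
    pr_link="https://github.com/SemiAnalysisAI/InferenceX/pull/1419",
)
print(render_yaml(entries))
```

Configs that share a source version land in one entry, while configs with different source versions are split automatically, which is the behavior the first suggested fix asks for.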

Comment on lines 2568 to 2569
# B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
# to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
Contributor

🟡 The NOTE comment above dsr1-fp8-b300-sglang-mtp at lines 2568-2569 still says "Image bumped to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs." but this PR bumps the image on the next line to v0.5.12-cu130, leaving the comment stale and self-contradictory. Either update the version to v0.5.12-cu130, or drop the specific version reference (the sister dsr1-fp8-b300-sglang block at lines 1934-1937 avoids naming a specific version).

Extended reasoning...

What the bug is

The block of comments at .github/configs/nvidia-master.yaml:2566-2569 documents why dsr1-fp8-b300-sglang-mtp reuses the B200 recipe and notes the specific image version it was pinned to:

  # B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
  # to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
dsr1-fp8-b300-sglang-mtp:
  image: lmsysorg/sglang:v0.5.12-cu130

The comment explicitly states "Image bumped to v0.5.10.post1-cu130", but the very next line now pins the image to v0.5.12-cu130. The doc claim and the actual value disagree.

The code path that triggers it

This is purely a documentation/comment correctness issue — there is no runtime path. The contradiction is visible to anyone reading the YAML file: the rationale comment names a version that no longer appears anywhere in the file.

Why existing code doesn't prevent it

YAML comments are free text and aren't validated against the keys/values they describe. There's no linter or test that cross-checks comment contents against image tags, so a stale rationale comment slips through silently when an image is bumped.
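A cross-check of this kind is not hard to approximate. The sketch below is illustrative only and assumes a simple layout: a comment block directly above a config key whose `image:` line follows, with blank lines separating blocks. The name `find_stale_version_comments` and the tag pattern are assumptions, and other layouts could produce false positives or misses.

```python
import re

# Loose pattern for tags such as v0.5.12-cu130 or v0.5.10.post1-cu130.
TAG_RE = re.compile(r"v\d+\.\d+\.\d+(?:\.post\d+)?-cu\d+")


def find_stale_version_comments(yaml_text):
    """Flag comment lines that name an image tag differing from the next `image:` value."""
    findings = []
    pending = []  # (line_number, tag) pairs seen in comments above the next image line
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            pending.extend((lineno, tag) for tag in TAG_RE.findall(stripped))
        elif stripped.startswith("image:"):
            match = TAG_RE.search(stripped)
            actual = match.group(0) if match else None
            findings.extend((n, tag, actual) for n, tag in pending if tag != actual)
            pending = []
        elif not stripped:
            pending = []  # a blank line ends the comment block
    return findings


snippet = """\
  # B200 SGLang MTP recipe as-is until B300-specific tuning is available. Image bumped
  # to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
dsr1-fp8-b300-sglang-mtp:
  image: lmsysorg/sglang:v0.5.12-cu130
"""

print(find_stale_version_comments(snippet))
# [(2, 'v0.5.10.post1-cu130', 'v0.5.12-cu130')]
```

Line numbers in the result are relative to the text passed in, so running it over the whole file would report positions in that file.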

Step-by-step proof

  1. Open .github/configs/nvidia-master.yaml.
  2. Jump to line 2569. The comment reads: # to v0.5.10.post1-cu130 to match the standard B300 SGLang image used by other B300 configs.
  3. Read the next non-comment line (2571): image: lmsysorg/sglang:v0.5.12-cu130.
  4. Grep the file for v0.5.10.post1-cu130 — it appears only inside the now-stale comment; no image actually uses that version anymore.
  5. Compare with the sister block at lines 1934-1937 for dsr1-fp8-b300-sglang, whose comment avoids naming a specific version and therefore stays correct across image bumps.

Impact

No runtime impact. The harm is to readers/maintainers: anyone trying to understand why this MTP config uses a particular image version is led to a version that no longer exists in the file, undermining trust in the surrounding comments and making future bumps more error-prone.

How to fix

Either:

  • Update lines 2568-2569 to reference v0.5.12-cu130, or
  • Drop the version-specific phrasing (e.g., "Image bumped to match the standard B300 SGLang image used by other B300 configs.") so the comment doesn't need to be re-edited on every version bump — matching the style already used by dsr1-fp8-b300-sglang. The second option is preferable because it prevents the same drift from recurring on the next bump.

@github-actions
Contributor

@functionstackx
Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit 506258a into main May 17, 2026
8 of 10 checks passed
@functionstackx functionstackx deleted the claude/issue-1154-dsr1-fp8-b300-sglang branch May 17, 2026 05:40
@github-actions
Contributor
