Update dsr1-fp4-mi355x-sglang SGLang image to v0.5.12-rocm700-mi35x by Klaud-Cold · Pull Request #1431 · SemiAnalysisAI/InferenceX

Klaud-Cold · 2026-05-17T03:16:43Z

Updates SGLang image for dsr1-fp4-mi355x-sglang from v0.5.9-rocm700-mi35x to v0.5.12-rocm700-mi35x.
\nRef #1154

Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

github-actions · 2026-05-17T03:16:50Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-05-17T03:16:51Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-05-17T03:19:47Z

+- config-keys:
+    - dsr1-fp4-mi355x-sglang
+  description:
+    - "Update SGLang image from v0.5.9-rocm700-mi35x to v0.5.12-rocm700-mi35x"
+  pr-link: XXX


🟡 The new entry added to perf-changelog.yaml (line 2556) uses pr-link: XXX as a placeholder instead of an actual PR URL. All preceding entries follow the convention https://github.com/SemiAnalysisAI/InferenceX/pull/<num>, so this should be updated to https://github.com/SemiAnalysisAI/InferenceX/pull/1431 before merge.

Extended reasoning...

What's wrong

The diff adds a new entry to perf-changelog.yaml:

- config-keys: - dsr1-fp4-mi355x-sglang description: - "Update SGLang image from v0.5.9-rocm700-mi35x to v0.5.12-rocm700-mi35x" pr-link: XXX

The pr-link: XXX value is clearly a placeholder that was never filled in.

Why this breaks the file convention

Every preceding entry in perf-changelog.yaml follows a consistent pattern of using a full GitHub PR URL. Looking at the immediately preceding entries:

Line 2526: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>

Line 2532: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>

Line 2538: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>

Line 2544: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>

Line 2550: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402 (the entry directly above the new one)

The placeholder XXX is not a valid URL and provides no traceability back to the originating PR.

Step-by-step proof

Open perf-changelog.yaml and scroll to the bottom of the file.

Observe lines 2552-2556 which are newly added by this PR.

At line 2556, the value of pr-link is the literal string XXX.

Compare against any prior entry (e.g., line 2550): the value is a fully-qualified GitHub PR URL.

Anyone (or any tool) attempting to follow this link or programmatically resolve PR numbers from the changelog will fail on this entry.

Impact

This is a documentation/changelog correctness issue only — it does not affect runtime behavior of the framework, deployment, or sweeps. However, the changelog is used to trace which PR caused which performance change, so a missing/placeholder link defeats the purpose of the entry.

Fix

Replace pr-link: XXX on line 2556 with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1431 before merging.

github-actions · 2026-05-17T05:10:01Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980026044
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25980026044

…Tokenizer.all_special_tokens_extended crash) The sglang v0.5.12 container bundles a newer transformers library where LlamaTokenizer.all_special_tokens_extended was removed. vLLM's get_cached_tokenizer (called from vllm.transformers_utils.tokenizer.get_tokenizer) probes that attribute and crashes the benchmark client before any request is sent. Swap the import preference order so backend_request_func.get_tokenizer (this repo's HuggingFace AutoTokenizer wrapper, with its own _fix_tokenizer_for_sglang post-hoc patcher) is tried first. The vLLM import remains as a fallback for environments without backend_request_func.

) Root-caused via the failed sweeps on #1431, #1432, #1440, #1441, #1443 — every failure landed on either: mia1-p01-g09 pyxis: failed to create container filesystem (extended attributes not supported on the destination filesystem; pyxis can't mount the squashfs) mia1-p01-g11 permission denied while trying to connect to docker.sock (cluster-cleanup `docker stop` step fails; cascading into pyxis-init failure) Both are already known-bad per KLAUD_DEBUG.md §5.1 / §5.2, but the launcher wasn't excluding them. This mirrors the existing pattern in runners/launch_mi300x-amds.sh (#1462 — pin to known-good nodes) and runners/launch_mi325x-amds.sh (#1477 — exclude chi-mi325x-pod1-121). Once this lands the 5 affected mi355x PRs can be rebased to pick it up and the failed jobs will land on healthy nodes only. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # perf-changelog.yaml

github-actions · 2026-05-18T16:30:47Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26008680533
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26008680533

The agentic-coding matrix for dsr1-fp4-mi355x-sglang fans out into 10 matrix jobs (conc-list spans 1..256), which roughly doubles this image bump PR's sweep cost without adding signal — the bump only needs the fixed-seq-len throughput shape to validate. Re-enable after the PR merges; the next agentic cron PR will pick it up.

github-actions · 2026-05-18T16:47:55Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26046376775
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26046376775

github-actions · 2026-05-19T02:38:54Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26047312829
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26047312829

functionstackx · 2026-05-19T02:51:19Z

/reuse-sweep-run

github-actions · 2026-05-19T02:51:58Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26073210292
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26073210292

Update dsr1-fp4-mi355x-sglang SGLang image to v0.5.12-rocm700-mi35x

ec26a46

Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

Klaud-Cold requested a review from a team May 17, 2026 03:16

Klaud-Cold added the full-sweep-enabled label May 17, 2026

Klaud-Cold requested review from 1am9trash, billishyahao, chunfangamd, seungrokj and yctseng0211 as code owners May 17, 2026 03:16

github-project-automation Bot added this to InferenceMAX Board May 17, 2026

Klaud-Cold mentioned this pull request May 17, 2026

[Auto] Docker Image Updates Available - 2026-04-25 #1154

Open

claude Bot reviewed May 17, 2026

View reviewed changes

functionstackx mentioned this pull request May 18, 2026

[AMD] Upgrade DeepSeek-R1 MI35x docker to the latest SGLang version 0.5.10 #1013

Closed

functionstackx mentioned this pull request May 18, 2026

[Klaud Cold] runners(mi355x): exclude broken nodes mia1-p01-g09 + mia1-p01-g11 #1498

Merged

2 tasks

Merge remote-tracking branch 'origin/main' into HEAD

32a25cf

# Conflicts: # perf-changelog.yaml

Merge branch 'main' into claude/issue-1154-dsr1-fp4-mi355x-sglang

7801243

functionstackx merged commit 8a20938 into main May 19, 2026
4 of 5 checks passed

functionstackx deleted the claude/issue-1154-dsr1-fp4-mi355x-sglang branch May 19, 2026 02:51

github-project-automation Bot moved this to Done in InferenceMAX Board May 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update dsr1-fp4-mi355x-sglang SGLang image to v0.5.12-rocm700-mi35x#1431

Update dsr1-fp4-mi355x-sglang SGLang image to v0.5.12-rocm700-mi35x#1431
functionstackx merged 5 commits into
mainfrom
claude/issue-1154-dsr1-fp4-mi355x-sglang

Klaud-Cold commented May 17, 2026

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

claude Bot May 17, 2026

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

functionstackx commented May 19, 2026

Uh oh!

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Klaud-Cold commented May 17, 2026

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

claude Bot May 17, 2026

Choose a reason for hiding this comment

What's wrong

Why this breaks the file convention

Step-by-step proof

Impact

Fix

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

functionstackx commented May 19, 2026

Uh oh!

Uh oh!

github-actions Bot commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants