Skip to content

Update dsr1-fp4-mi355x-sglang SGLang image to v0.5.12-rocm700-mi35x#1431

Merged
functionstackx merged 5 commits into
mainfrom
claude/issue-1154-dsr1-fp4-mi355x-sglang
May 19, 2026
Merged

Update dsr1-fp4-mi355x-sglang SGLang image to v0.5.12-rocm700-mi35x#1431
functionstackx merged 5 commits into
mainfrom
claude/issue-1154-dsr1-fp4-mi355x-sglang

Conversation

@Klaud-Cold
Copy link
Copy Markdown
Collaborator

Updates SGLang image for dsr1-fp4-mi355x-sglang from v0.5.9-rocm700-mi35x to v0.5.12-rocm700-mi35x.
\nRef #1154

Generated with Claude Code

Ref #1154

Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml
Comment on lines +2552 to +2556
- config-keys:
- dsr1-fp4-mi355x-sglang
description:
- "Update SGLang image from v0.5.9-rocm700-mi35x to v0.5.12-rocm700-mi35x"
pr-link: XXX
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The new entry added to perf-changelog.yaml (line 2556) uses pr-link: XXX as a placeholder instead of an actual PR URL. All preceding entries follow the convention https://github.com/SemiAnalysisAI/InferenceX/pull/<num>, so this should be updated to https://github.com/SemiAnalysisAI/InferenceX/pull/1431 before merge.

Extended reasoning...

What's wrong

The diff adds a new entry to perf-changelog.yaml:

- config-keys:
    - dsr1-fp4-mi355x-sglang
  description:
    - "Update SGLang image from v0.5.9-rocm700-mi35x to v0.5.12-rocm700-mi35x"
  pr-link: XXX

The pr-link: XXX value is clearly a placeholder that was never filled in.

Why this breaks the file convention

Every preceding entry in perf-changelog.yaml follows a consistent pattern of using a full GitHub PR URL. Looking at the immediately preceding entries:

  • Line 2526: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>
  • Line 2532: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>
  • Line 2538: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>
  • Line 2544: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<num>
  • Line 2550: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402 (the entry directly above the new one)

The placeholder XXX is not a valid URL and provides no traceability back to the originating PR.

Step-by-step proof

  1. Open perf-changelog.yaml and scroll to the bottom of the file.
  2. Observe lines 2552-2556 which are newly added by this PR.
  3. At line 2556, the value of pr-link is the literal string XXX.
  4. Compare against any prior entry (e.g., line 2550): the value is a fully-qualified GitHub PR URL.
  5. Anyone (or any tool) attempting to follow this link or programmatically resolve PR numbers from the changelog will fail on this entry.

Impact

This is a documentation/changelog correctness issue only — it does not affect runtime behavior of the framework, deployment, or sweeps. However, the changelog is used to trace which PR caused which performance change, so a missing/placeholder link defeats the purpose of the entry.

Fix

Replace pr-link: XXX on line 2556 with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1431 before merging.

@github-actions
Copy link
Copy Markdown
Contributor

…Tokenizer.all_special_tokens_extended crash)

The sglang v0.5.12 container bundles a newer transformers library where
LlamaTokenizer.all_special_tokens_extended was removed. vLLM's
get_cached_tokenizer (called from vllm.transformers_utils.tokenizer.get_tokenizer)
probes that attribute and crashes the benchmark client before any request
is sent.

Swap the import preference order so backend_request_func.get_tokenizer
(this repo's HuggingFace AutoTokenizer wrapper, with its own
_fix_tokenizer_for_sglang post-hoc patcher) is tried first. The vLLM
import remains as a fallback for environments without backend_request_func.
functionstackx added a commit that referenced this pull request May 18, 2026
)

Root-caused via the failed sweeps on #1431, #1432, #1440, #1441,
#1443 — every failure landed on either:

  mia1-p01-g09  pyxis: failed to create container filesystem
                (extended attributes not supported on the destination
                filesystem; pyxis can't mount the squashfs)
  mia1-p01-g11  permission denied while trying to connect to docker.sock
                (cluster-cleanup `docker stop` step fails; cascading
                into pyxis-init failure)

Both are already known-bad per KLAUD_DEBUG.md §5.1 / §5.2, but the
launcher wasn't excluding them. This mirrors the existing pattern in
runners/launch_mi300x-amds.sh (#1462 — pin to known-good nodes) and
runners/launch_mi325x-amds.sh (#1477 — exclude chi-mi325x-pod1-121).

Once this lands the 5 affected mi355x PRs can be rebased to pick it up
and the failed jobs will land on healthy nodes only.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	perf-changelog.yaml
@github-actions
Copy link
Copy Markdown
Contributor

The agentic-coding matrix for dsr1-fp4-mi355x-sglang fans out into 10
matrix jobs (conc-list spans 1..256), which roughly doubles this image
bump PR's sweep cost without adding signal — the bump only needs the
fixed-seq-len throughput shape to validate. Re-enable after the PR
merges; the next agentic cron PR will pick it up.
@github-actions
Copy link
Copy Markdown
Contributor

@github-actions
Copy link
Copy Markdown
Contributor

@functionstackx
Copy link
Copy Markdown
Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit 8a20938 into main May 19, 2026
4 of 5 checks passed
@functionstackx functionstackx deleted the claude/issue-1154-dsr1-fp4-mi355x-sglang branch May 19, 2026 02:51
@github-actions
Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

2 participants