Skip to content

Update dsr1-fp8-mi325x-sglang SGLang image to v0.5.12-rocm700-mi30x#1428

Merged
functionstackx merged 3 commits into
mainfrom
claude/issue-1154-dsr1-fp8-mi325x-sglang
May 18, 2026
Merged

Update dsr1-fp8-mi325x-sglang SGLang image to v0.5.12-rocm700-mi30x#1428
functionstackx merged 3 commits into
mainfrom
claude/issue-1154-dsr1-fp8-mi325x-sglang

Conversation

@Klaud-Cold
Copy link
Copy Markdown
Collaborator

Updates SGLang image for dsr1-fp8-mi325x-sglang from v0.5.9-rocm700-mi30x to v0.5.12-rocm700-mi30x.
\nRef #1154

Generated with Claude Code

Ref #1154

Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

2 similar comments
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml Outdated
- dsr1-fp8-mi325x-sglang
description:
- "Update SGLang image from v0.5.9-rocm700-mi30x to v0.5.12-rocm700-mi30x"
pr-link: XXX
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The new perf-changelog entry has pr-link: XXX at line 2556 instead of the actual PR URL. Every other entry in this file uses a real GitHub PR URL (e.g., line 2553 uses #1402). Please update this to https://github.com/SemiAnalysisAI/InferenceX/pull/1428 before merging so the changelog has a working link.

Extended reasoning...

What the bug is

The diff at perf-changelog.yaml:2556 adds a new changelog entry whose pr-link field is the literal string XXX rather than a valid URL:

- config-keys:
    - dsr1-fp8-mi325x-sglang
  description:
    - "Update SGLang image from v0.5.9-rocm700-mi30x to v0.5.12-rocm700-mi30x"
  pr-link: XXX

This is clearly an unfilled placeholder from a PR generation template — it was never replaced with the actual PR URL before the diff was committed.

How it manifests

Every other entry in perf-changelog.yaml uses the canonical GitHub URL format. The immediately preceding entries (lines 2526, 2532, 2538, 2544, 2550) all follow the pattern pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<N> for PRs #1406, #1407, #1408, #1409, and #1402 respectively. The new entry is the only one with a placeholder, breaking this consistency.

Why existing checks don't prevent it

YAML treats XXX as a valid plain scalar string, so the file still parses cleanly and any schema validation that only checks the field is present (rather than that it's a URL) will pass. Without a URL-format validator on pr-link, the placeholder slips through.

Impact

The changelog is user-facing metadata. Once merged, the broken pointer becomes permanent in repo history and any tooling/UI that surfaces pr-link (e.g., to link back from a changelog viewer to the PR discussion) will produce a dead link or render XXX as raw text. This makes the entry less traceable than every other entry in the file.

Step-by-step proof

  1. Read the diff: the only addition to perf-changelog.yaml is a 6-line block at lines 2551–2556.
  2. Line 2556 is pr-link: XXX.
  3. Compare with the preceding entry at lines 2548–2550: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402.
  4. Imagine a downstream tool rendering the changelog: for the new entry it would attempt to hyperlink XXX, which is not a URL — the link is broken or omitted.
  5. The PR's own number is Update dsr1-fp8-mi325x-sglang SGLang image to v0.5.12-rocm700-mi30x #1428, so the correct value is https://github.com/SemiAnalysisAI/InferenceX/pull/1428.

How to fix

Replace line 2556 with:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1428

Optionally, add a schema check or pre-merge linter that requires pr-link to match ^https://github\.com/.+/pull/\d+$ to prevent the same template-fill mistake in the future.

@github-actions
Copy link
Copy Markdown
Contributor

…Tokenizer.all_special_tokens_extended crash)

The sglang v0.5.12 container bundles a newer transformers library where
LlamaTokenizer.all_special_tokens_extended was removed. vLLM's
get_cached_tokenizer (called from vllm.transformers_utils.tokenizer.get_tokenizer)
probes that attribute and crashes the benchmark client before any request
is sent.

Swap the import preference order so backend_request_func.get_tokenizer
(this repo's HuggingFace AutoTokenizer wrapper, with its own
_fix_tokenizer_for_sglang post-hoc patcher) is tried first. The vLLM
import remains as a fallback for environments without backend_request_func.
@github-actions
Copy link
Copy Markdown
Contributor

@functionstackx
Copy link
Copy Markdown
Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit 57b5dbd into main May 18, 2026
3 of 5 checks passed
@functionstackx functionstackx deleted the claude/issue-1154-dsr1-fp8-mi325x-sglang branch May 18, 2026 02:45
@github-actions
Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

2 participants