Update dsr1-fp8-mi325x-sglang SGLang image to v0.5.12-rocm700-mi30x#1428
Conversation
Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
2 similar comments
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
| - dsr1-fp8-mi325x-sglang | ||
| description: | ||
| - "Update SGLang image from v0.5.9-rocm700-mi30x to v0.5.12-rocm700-mi30x" | ||
| pr-link: XXX |
There was a problem hiding this comment.
🔴 The new perf-changelog entry has pr-link: XXX at line 2556 instead of the actual PR URL. Every other entry in this file uses a real GitHub PR URL (e.g., line 2553 uses #1402). Please update this to https://github.com/SemiAnalysisAI/InferenceX/pull/1428 before merging so the changelog has a working link.
Extended reasoning...
What the bug is
The diff at perf-changelog.yaml:2556 adds a new changelog entry whose pr-link field is the literal string XXX rather than a valid URL:
- config-keys:
- dsr1-fp8-mi325x-sglang
description:
- "Update SGLang image from v0.5.9-rocm700-mi30x to v0.5.12-rocm700-mi30x"
pr-link: XXXThis is clearly an unfilled placeholder from a PR generation template — it was never replaced with the actual PR URL before the diff was committed.
How it manifests
Every other entry in perf-changelog.yaml uses the canonical GitHub URL format. The immediately preceding entries (lines 2526, 2532, 2538, 2544, 2550) all follow the pattern pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<N> for PRs #1406, #1407, #1408, #1409, and #1402 respectively. The new entry is the only one with a placeholder, breaking this consistency.
Why existing checks don't prevent it
YAML treats XXX as a valid plain scalar string, so the file still parses cleanly and any schema validation that only checks the field is present (rather than that it's a URL) will pass. Without a URL-format validator on pr-link, the placeholder slips through.
Impact
The changelog is user-facing metadata. Once merged, the broken pointer becomes permanent in repo history and any tooling/UI that surfaces pr-link (e.g., to link back from a changelog viewer to the PR discussion) will produce a dead link or render XXX as raw text. This makes the entry less traceable than every other entry in the file.
Step-by-step proof
- Read the diff: the only addition to
perf-changelog.yamlis a 6-line block at lines 2551–2556. - Line 2556 is
pr-link: XXX. - Compare with the preceding entry at lines 2548–2550:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402. - Imagine a downstream tool rendering the changelog: for the new entry it would attempt to hyperlink
XXX, which is not a URL — the link is broken or omitted. - The PR's own number is Update dsr1-fp8-mi325x-sglang SGLang image to v0.5.12-rocm700-mi30x #1428, so the correct value is
https://github.com/SemiAnalysisAI/InferenceX/pull/1428.
How to fix
Replace line 2556 with:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1428Optionally, add a schema check or pre-merge linter that requires pr-link to match ^https://github\.com/.+/pull/\d+$ to prevent the same template-fill mistake in the future.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980023461 |
…Tokenizer.all_special_tokens_extended crash) The sglang v0.5.12 container bundles a newer transformers library where LlamaTokenizer.all_special_tokens_extended was removed. vLLM's get_cached_tokenizer (called from vllm.transformers_utils.tokenizer.get_tokenizer) probes that attribute and crashes the benchmark client before any request is sent. Swap the import preference order so backend_request_func.get_tokenizer (this repo's HuggingFace AutoTokenizer wrapper, with its own _fix_tokenizer_for_sglang post-hoc patcher) is tried first. The vLLM import remains as a fallback for environments without backend_request_func.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26008679544 |
|
/reuse-sweep-run |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26010758681 |
Updates SGLang image for
dsr1-fp8-mi325x-sglangfrom v0.5.9-rocm700-mi30x to v0.5.12-rocm700-mi30x.\nRef #1154
Generated with Claude Code