Update qwen3.5-fp8-h200-sglang-mtp SGLang image to v0.5.12 #1424
Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you. PR authors are responsible for ensuring that, after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and re-running the failed jobs will fix them. If a re-run is attempted, PR authors are responsible for ensuring that it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980020002 |
- config-keys:
    - qwen3.5-fp8-h200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11 to v0.5.12"
  pr-link: XXX
🟡 The new entry at perf-changelog.yaml:2556 has pr-link: XXX instead of an actual URL. Replace the placeholder with https://github.com/SemiAnalysisAI/InferenceX/pull/1424 before merge so the changelog can be traced back to this PR, matching the format used by every other entry in the file (e.g., lines 2526, 2532, 2538, 2544, 2550).
Extended reasoning...
What the bug is
The newly added entry in perf-changelog.yaml (lines 2552-2556) ends with:
- config-keys:
    - qwen3.5-fp8-h200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11 to v0.5.12"
  pr-link: XXX
The pr-link field contains the literal placeholder XXX rather than a real PR URL. The intended value is https://github.com/SemiAnalysisAI/InferenceX/pull/1424 (the URL of this PR).
Why this matters
perf-changelog.yaml is consumed as an index that maps each performance-relevant change to the PR that introduced it. Every other entry in the file follows the convention pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/NNNN, for example:
- line 2526: pull/1414
- line 2532: pull/1408
- line 2538: pull/1407
- line 2544: pull/1409
- line 2550: pull/1402
With XXX left in place, anyone reading the changelog later cannot trace the SGLang v0.5.11 → v0.5.12 bump back to its source PR, which defeats the purpose of the field.
Why existing code/process doesn't catch this
There is no schema validation that enforces pr-link to be a URL — XXX is accepted as a valid YAML string. The placeholder is presumably an artifact of the entry being drafted before the PR number was known, and it was never substituted back in.
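A check along the following lines could close that gap. This is only a minimal sketch, not something that exists in the repository: the function name find_invalid_pr_links and the exact URL pattern are assumptions inferred from the entries quoted in this review, and the entries are assumed to be already parsed from YAML into a list of dicts.

```python
import re

# Assumed convention, inferred from the pr-link values quoted in this review.
PR_LINK_PATTERN = re.compile(
    r"https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+"
)


def find_invalid_pr_links(entries):
    """Return (index, value) pairs for entries whose pr-link is not a PR URL.

    `entries` is the changelog already parsed into a list of dicts; this
    hypothetical helper does not read or parse the YAML file itself.
    """
    invalid = []
    for index, entry in enumerate(entries):
        value = entry.get("pr-link")
        # A missing field, a non-string, or a placeholder such as "XXX" all fail.
        if not isinstance(value, str) or not PR_LINK_PATTERN.fullmatch(value):
            invalid.append((index, value))
    return invalid


if __name__ == "__main__":
    entries = [
        {
            "config-keys": ["qwen3.5-fp8-h200-sglang-mtp"],
            "description": ["Update SGLang image from v0.5.11 to v0.5.12"],
            "pr-link": "XXX",
        },
        {
            # Hypothetical entry: only the pr-link value is taken from the file.
            "config-keys": ["example-config-key"],
            "description": ["Example description"],
            "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1402",
        },
    ]
    for index, value in find_invalid_pr_links(entries):
        print(f"entry {index}: invalid pr-link {value!r}")
```

Run as a CI step over the parsed file, a check like this would flag the placeholder before merge instead of relying on review.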
Step-by-step proof
- Check out this PR locally and open perf-changelog.yaml.
- Jump to line 2556 and observe pr-link: XXX.
- Scroll up to any other recent entry (e.g., line 2550) and observe pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1402.
- The diff confirms the same: the only change to perf-changelog.yaml appends a 6-line entry whose final line is pr-link: XXX.
Fix
Replace XXX with https://github.com/SemiAnalysisAI/InferenceX/pull/1424:
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1424
This is a small documentation/hygiene fix with no runtime impact, but it should be done before merge so the changelog continues to serve as a useful index.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980020107 |
|
/reuse-sweep-run |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25984286545 |
Updates SGLang image for qwen3.5-fp8-h200-sglang-mtp from v0.5.11 to v0.5.12.
Ref #1154
Generated with Claude Code