Update gptoss-fp4-h100-vllm vLLM image to v0.21.0 #1398
Ref #1154
Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and re-running the failed jobs will fix them. If a re-run is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack.
    - gptoss-fp4-h100-vllm
  description:
    - "Update vLLM image from v0.20.2 to v0.21.0"
  pr-link: XXX
🟡 The new changelog entry at perf-changelog.yaml:2501 has pr-link: XXX as a placeholder rather than a real PR URL. Every other entry in this file uses a fully qualified GitHub PR URL — please replace XXX with https://github.com/SemiAnalysisAI/InferenceX/pull/1398 before merging so the changelog retains traceability back to this PR.
Extended reasoning...
What the bug is

The diff appends a new entry to perf-changelog.yaml for the gptoss-fp4-h100-vllm image bump. The entry ends with:

    - config-keys:
        - gptoss-fp4-h100-vllm
      description:
        - "Update vLLM image from v0.20.2 to v0.21.0"
      pr-link: XXX

The literal string XXX is a placeholder that was never substituted with the actual PR URL.

Why this is inconsistent

Every other entry in perf-changelog.yaml uses a fully qualified GitHub PR URL. For example, the immediately preceding entry at line 2495 uses pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1271, and the same pattern holds for the entries at lines 2476, 2482, and 2488. The AGENTS.md template documents the expected format as pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX. Note that only the trailing number is a placeholder, not the entire URL.

Why existing code does not catch this

The Pydantic validator for changelog entries (utils/matrix_logic/validation.py) only requires pr_link to be a string; there is no regex or URL-shape check. So XXX validates fine and CI does not flag it.

Impact

This does not affect runtime behaviour; it is a changelog/metadata hygiene issue. The cost is that, once merged, the entry loses its traceability back to the originating PR: a reader of the changelog cannot follow XXX to discover the context, discussion, or commit history for this image bump.

How to fix

Replace line 2501 with:

      pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1398

Step-by-step proof

1. The PR description identifies this as PR #1398.
2. The diff adds a single new changelog entry whose final line is pr-link: XXX.
3. Grepping perf-changelog.yaml shows all preceding pr-link: values are https://github.com/SemiAnalysisAI/InferenceX/pull/<num>-shaped; the new XXX value is the only one that is not a URL.
4. The AGENTS.md template (line 123) confirms the intended format is pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX, where only XXX is meant to be replaced with the PR number.
5. Since the new entry uses the bare token XXX (no URL prefix at all), the author clearly forgot to fill it in. The fix is to write https://github.com/SemiAnalysisAI/InferenceX/pull/1398.
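The reasoning above notes that the validator accepts any string for pr_link. As a rough illustration of the missing URL-shape check, here is a minimal standalone sketch. It is not the code in utils/matrix_logic/validation.py; the function names and the regex are hypothetical, the URL shape is inferred from the entries quoted in this comment, and it scans the file text line by line rather than parsing YAML.

```python
import re

# Assumed URL shape, inferred from the pr-link values quoted in the review
# comment. The trailing component must be digits, so the template
# placeholder ".../pull/XXX" is rejected as well.
PR_LINK_RE = re.compile(r"https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+")

# Matches a "pr-link: <value>" line and captures the value without
# surrounding whitespace. Quoted YAML values are not handled in this sketch.
PR_LINK_LINE_RE = re.compile(r"^\s*pr-link:\s*(.*?)\s*$")


def is_valid_pr_link(value: str) -> bool:
    """Return True if value is a fully qualified PR URL with a numeric id."""
    return PR_LINK_RE.fullmatch(value) is not None


def find_bad_pr_links(changelog_text: str) -> list[tuple[int, str]]:
    """Return (line_number, value) for every pr-link that is not URL-shaped."""
    bad = []
    for line_no, line in enumerate(changelog_text.splitlines(), start=1):
        match = PR_LINK_LINE_RE.match(line)
        if match and not is_valid_pr_link(match.group(1)):
            bad.append((line_no, match.group(1)))
    return bad
```

Run against the contents of perf-changelog.yaml, a check like this would report the placeholder entry together with its line number. The same pattern could in principle be attached to the existing Pydantic model as a field validator, though how that model is structured is not shown in this thread.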
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25956497716 |
|
/reuse-sweep-run |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25984664563 |
Summary
Update the gptoss-fp4-h100-vllm vLLM image from v0.20.2 to v0.21.0. Ref #1154
Generated with Claude Code