Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.21.0 by Klaud-Cold · Pull Request #1403 · SemiAnalysisAI/InferenceX

Klaud-Cold · 2026-05-16T07:45:41Z

Summary

Updates the vLLM ROCm image tag for gptoss-fp4-mi300x-vllm from v0.17.0 to v0.21.0.

…Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

claude · 2026-05-16T07:49:26Z

+- config-keys:
+    - gptoss-fp4-mi300x-vllm
+  description:
+    - "Update vLLM ROCm image from v0.17.0 to v0.20.2"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX


🔴 The new perf-changelog entry at line 2351 uses pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX with a literal XXX placeholder that was never replaced with the actual PR number. After merge this link will 404 and the changelog entry will be permanently broken. Please update to /pull/1403 before merging.

Extended reasoning...

The bug

The new perf-changelog entry added in this PR uses a literal XXX placeholder in its pr-link URL:

- config-keys: - gptoss-fp4-mi300x-vllm description: - "Update vLLM ROCm image from v0.17.0 to v0.20.2" pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

The XXX is clearly the unfilled template placeholder from AGENTS.md (which uses pull/XXX as a fill-in-the-blank example). The author copied the template and forgot to substitute the real PR number.

Why this matters

Every other entry in perf-changelog.yaml uses a real numeric PR id. The five immediately preceding entries (lines 2318, 2325, 2332, 2338, 2345) reference /pull/1303, /pull/1304, /pull/1305, /pull/1308, and /pull/1310 respectively. The changelog is meant to provide traceability from each config-change entry back to the PR that introduced it, and that traceability is broken here.

Impact after merge

https://github.com/SemiAnalysisAI/InferenceX/pull/XXX is not a valid GitHub URL — GitHub returns a 404 for non-numeric PR ids. Once this is merged to main, the changelog entry will permanently link to a dead page, and a follow-up PR will be required to repair it.

Step-by-step proof

Look at the diff hunk for perf-changelog.yaml: the new line literally reads pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX.

Check the immediately preceding entries in the file — every pr-link value is a numeric URL like .../pull/1310.

The PR metadata says this PR is #1403. So the correct value is https://github.com/SemiAnalysisAI/InferenceX/pull/1403.

Visit https://github.com/SemiAnalysisAI/InferenceX/pull/XXX — GitHub returns 404, since PR ids must be integers.

Fix

Replace the XXX with 1403:

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1403

This is a trivial pre-merge fix; doing it post-merge requires a follow-up PR.

claude · 2026-05-16T07:49:26Z


 gptoss-fp4-mi300x-vllm:
-  image: vllm/vllm-openai-rocm:v0.17.0
+  image: vllm/vllm-openai-rocm:v0.20.2


🟡 The PR title says "Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.21.0" and the description says the update is from v0.17.0 to v0.21.0, but the diff (this line and the perf-changelog entry) actually pins the image to v0.20.2. The yaml, changelog, and commit message (91b6c49) are internally consistent on v0.20.2, so this looks like a stale PR title/description — please either amend the title/description to v0.20.2, or bump the image tag to v0.21.0 if that was the intent.

Extended reasoning...

What's inconsistent

The PR metadata and the merged content disagree about which vLLM ROCm image tag is being adopted for gptoss-fp4-mi300x-vllm:

PR title: "Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.21.0"

PR description: "Updates the vLLM ROCm image tag for gptoss-fp4-mi300x-vllm from v0.17.0 to v0.21.0."

Diff at .github/configs/amd-master.yaml:705: image: vllm/vllm-openai-rocm:v0.20.2

Diff at perf-changelog.yaml: "Update vLLM ROCm image from v0.17.0 to v0.20.2"

Commit message (91b6c49): "Update gptoss-fp4-mi300x-vllm vLLM image to v0.20.2"

The yaml change, the changelog entry, and the commit message all agree on v0.20.2. Only the PR title and PR description mention v0.21.0.

Step-by-step proof

Open .github/configs/amd-master.yaml at the gptoss-fp4-mi300x-vllm block (line 704). Before this PR the image was vllm/vllm-openai-rocm:v0.17.0; the diff replaces it with vllm/vllm-openai-rocm:v0.20.2.

Open perf-changelog.yaml at the new entry (line 2347+). The description string says "Update vLLM ROCm image from v0.17.0 to v0.20.2" — again v0.20.2.

Compare against the PR title and description on GitHub — both say v0.21.0. They do not match the diff.

Impact

This is a metadata-only inconsistency. Once merged, the source of truth is the yaml, so runtime behavior will use v0.20.2 (as intended by the diff). However, reviewers/release-note readers scanning PR titles will be misled about which version is being pinned, and anyone bisecting later via PR title search will not find v0.20.2.

How to fix

Either:

Amend the PR title to "Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.20.2" and update the description to from v0.17.0 to v0.20.2 (matches the diff and commit), or

If v0.21.0 was actually intended, change line 705 to vllm/vllm-openai-rocm:v0.21.0 and update the changelog entry accordingly.

Filing as nit because the merged code is self-consistent and only the PR metadata is wrong — but worth clarifying intent before merge in case v0.21.0 was the real target.

vLLM 0.20.2's CUDA-graph memory profiler is more aggressive than 0.17.0's, and on MI300X tp=8 the previous 0.95 setting left no headroom: warmup passes, then HSA_STATUS_ERROR_OUT_OF_RESOURCES ('Available Free mem: 0 MB') fires on every queue once the main benchmark starts, killing the EngineCore and causing all 40 prompts to fail. 0.90 matches the H100/H200/B200 scripts and gives ~20 GB headroom on MI300X. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # perf-changelog.yaml

github-actions · 2026-05-17T08:42:37Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25984515703
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25984515703

$Update gptoss-fp4-mi300x-vllm vLLM image to v0.20.2\n\nRef #1154\n\n…

91b6c49

…Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

Klaud-Cold added the full-sweep-enabled label May 16, 2026

Klaud-Cold requested a review from a team May 16, 2026 07:45

Klaud-Cold requested a review from billishyahao as a code owner May 16, 2026 07:45

Klaud-Cold added the full-sweep-enabled label May 16, 2026

Klaud-Cold requested review from 1am9trash, chunfangamd, seungrokj and yctseng0211 as code owners May 16, 2026 07:45

github-project-automation Bot added this to InferenceMAX Board May 16, 2026

Klaud-Cold mentioned this pull request May 16, 2026

[Auto] Docker Image Updates Available - 2026-04-25 #1154

Open

claude Bot reviewed May 16, 2026

View reviewed changes

claude-fix-bot and others added 2 commits May 17, 2026 03:04

Merge remote-tracking branch 'origin/main' into HEAD

d31733a

# Conflicts: # perf-changelog.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.21.0#1403

Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.21.0#1403
Klaud-Cold wants to merge 3 commits into
mainfrom
claude/issue-1154-gptoss-fp4-mi300x-vllm

Klaud-Cold commented May 16, 2026

Uh oh!

claude Bot May 16, 2026

Uh oh!

claude Bot May 16, 2026

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Klaud-Cold commented May 16, 2026

Summary

Uh oh!

claude Bot May 16, 2026

Choose a reason for hiding this comment

The bug

Why this matters

Impact after merge

Step-by-step proof

Fix

Uh oh!

claude Bot May 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant