[AMD][ROCM] gptoss-fp4-mi355x-atom: Bump image to rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511 #1412

Open

seungrokj wants to merge 5 commits into main from srok/atom_gptoss_fp4_mi355x

Conversation

@seungrokj
Collaborator

Summary

  • Bump ATOM image from rocm/atom:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom0.1.2.post to rocm/atom-dev:nightly_202605111702
  • Add perf-changelog entry for gptoss-fp4-mi355x-atom

Performance (ATOM Upstream vs InferenceX baseline, TP=1, 1 GPU, fp4, MI355X)

| ISL  | OSL  | Conc | InferenceX (tok/s) | ATOM Upstream (tok/s) | Diff % |
|------|------|------|--------------------|-----------------------|--------|
| 1024 | 1024 | 16   | 4808.49            | 5757.55               | +19.7% |
| 1024 | 1024 | 32   | 7537.19            | 8869.83               | +17.7% |
| 1024 | 1024 | 64   | 11737.08           | 13566.85              | +15.6% |
| 1024 | 1024 | 128  | 17577.22           | 19234.78              | +9.4%  |
| 8192 | 1024 | 4    | 7563.76            | 8183.21               | +8.2%  |
| 8192 | 1024 | 8    | 12188.56           | 13409.95              | +10.0% |
| 8192 | 1024 | 16   | 18128.47           | 20788.08              | +14.7% |
| 8192 | 1024 | 32   | 25815.80           | 29588.12              | +14.6% |
| 8192 | 1024 | 64   | 35913.41           | 40886.65              | +13.8% |
| 8192 | 1024 | 128  | 44999.29           | 50888.02              | +13.1% |

InferenceX baseline: rocm/atom:rocm7.1.1-ubuntu24.04-pytorch2.9-atom0.1.1-MI350x (2026-01-14)
ATOM upstream run: https://github.com/ROCm/ATOM/actions/runs/25686894636 (2026-05-11)
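
For reference, the Diff % column is just the relative throughput gain. A minimal sketch recomputing it from two of the rows above, assuming Diff % = (ATOM - InferenceX) / InferenceX:

```python
# Recompute Diff % = (atom - baseline) / baseline for two sample rows
# from the table above (values copied verbatim).
rows = [
    # (ISL, OSL, Conc, InferenceX tok/s, ATOM tok/s)
    (1024, 1024, 16, 4808.49, 5757.55),
    (8192, 1024, 128, 44999.29, 50888.02),
]

for isl, osl, conc, baseline, atom in rows:
    diff_pct = (atom - baseline) / baseline * 100
    print(f"ISL={isl} OSL={osl} Conc={conc}: {diff_pct:+.1f}%")

# ISL=1024 OSL=1024 Conc=16: +19.7%
# ISL=8192 OSL=1024 Conc=128: +13.1%
```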

Test plan

  • Verify benchmark runs with new image on mi355x runner
  • Verify throughput improvement vs baseline

🤖 Generated with Claude Code

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If they are not, please create a PR there first before we can merge your single-node PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work. Thank you!

PR authors are responsible for ensuring that, after merging, all GitHub Actions jobs fully pass. Often, failures are just flakes, and simply re-running the failed jobs will fix them; if re-running is attempted, PR authors remain responsible for ensuring the jobs pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

3 similar comments

seungrokj changed the title from "[AMD][ROCM] gptoss-fp4-mi355x-atom: bump ATOM image to nightly_202605111702" to "[AMD][ROCM] gptoss-fp4-mi355x-atom: Bump image to rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511" on May 16, 2026
Comment thread on perf-changelog.yaml (Outdated)

    description:
    - "Bump ATOM image from rocm/atom:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom0.1.2.post to rocm/atom-dev:nightly_202605111702"
    - "ATOM upstream benchmark shows +8% to +20% throughput improvement vs InferenceX baselines (1 GPU, ISL=1024/8192, OSL=1024)"
    pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER
Contributor


🔴 The new perf-changelog entry's pr-link is https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER — the template token was never substituted with the actual PR number (1412). The link 404s as written and breaks any tooling that parses these entries to map config-keys to PRs. Fix by replacing PLACEHOLDER with 1412 before merge.

Extended reasoning...

What the bug is

perf-changelog.yaml:2502 contains:

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER

The literal string PLACEHOLDER was clearly meant to be substituted with this PR's number (1412) before opening the PR, but the substitution never happened. Per AGENTS.md, the template token for this field is XXX (e.g. pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX); the author appears to have hand-edited the template to use PLACEHOLDER instead, then forgot to swap in the real number.

Why existing code doesn't prevent it

There is no schema validator or pre-merge check on perf-changelog.yaml that asserts pr-link ends in a numeric path component. All 295 other entries in the file use real PR IDs (e.g. the immediately preceding entry uses /pull/1271), but that's convention, not enforcement.
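
A minimal sketch of such a check (hypothetical, not existing repo tooling; it assumes perf-changelog.yaml parses as a top-level YAML list of entry mappings, which the real schema may not match):

```python
#!/usr/bin/env python3
"""Hypothetical pre-merge check: every pr-link in perf-changelog.yaml
must end in an integer PR number. Assumes the file is a top-level YAML
list of entry mappings; adjust if the real schema nests entries."""
import re
import sys

import yaml  # PyYAML

PR_LINK = re.compile(r"https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+")


def main(path="perf-changelog.yaml"):
    with open(path) as f:
        entries = yaml.safe_load(f)
    # Collect every entry whose pr-link is missing or non-numeric.
    bad = [e.get("pr-link", "<missing>") for e in entries
           if not PR_LINK.fullmatch(e.get("pr-link", ""))]
    for link in bad:
        print(f"invalid pr-link: {link}", file=sys.stderr)
    return 1 if bad else 0


if __name__ == "__main__":
    sys.exit(main())
```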

Impact

  1. The URL https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER 404s — anyone clicking through from the changelog entry to find the PR/discussion that introduced the gptoss-fp4-mi355x-atom image bump won't reach it.
  2. Any tooling that parses these entries to compute a config-key → PR-number mapping (e.g. for changelog rendering, regression bisection, attribution) will either crash on the non-integer suffix or silently record a bogus mapping.
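
To make the second failure mode concrete, here is a sketch of what a naive config-key to PR-number mapper would do with this entry (pr_number is a hypothetical helper, not actual repo tooling):

```python
from urllib.parse import urlparse


def pr_number(link: str) -> int:
    """Hypothetical helper: extract the PR id from a pr-link URL."""
    return int(urlparse(link).path.rsplit("/", 1)[-1])


print(pr_number("https://github.com/SemiAnalysisAI/InferenceX/pull/1271"))  # 1271
pr_number("https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER")
# ValueError: invalid literal for int() with base 10: 'PLACEHOLDER'
```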

How to fix

Replace the literal PLACEHOLDER on line 2502 with 1412 (this PR's number):

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1412

Step-by-step proof

  1. PR metadata shows this is PR #1412 (<pr number="1412"> in the PR header).
  2. perf-changelog.yaml:2502 in the diff contains pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER.
  3. Constructing the URL: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER — GitHub's pull-request URL parser expects an integer at that path position; PLACEHOLDER is not an integer, so the route returns 404.
  4. Compare to the immediately preceding entry at line 2493–2496 which correctly uses pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1271 — that link resolves to the actual PR.
  5. The correct value is therefore /pull/1412, and the fix is a single-token replacement (PLACEHOLDER → 1412) on line 2502.

@seungrokj seungrokj added the AMD label May 16, 2026