[Klaud Cold] Update dsr1-fp4-b200-trt (+mtp) TRT-LLM image to v1.3.0rc14 by functionstackx · Pull Request #1489 · SemiAnalysisAI/InferenceX

functionstackx · 2026-05-18T06:16:27Z

Summary

Update TensorRT-LLM image (off: v1.2.0rc6.post2 104d / mtp: v1.2.0rc6.post3 101d) to v1.3.0rc14 (latest pre-release)

dsr1-fp4-b200-trt: nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post2 → nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14
dsr1-fp4-b200-trt-mtp: nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post3 → nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14
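
Each change above replaces only the part of the image reference after the final colon. A minimal Python sketch of that swap, assuming the image strings look exactly as listed; `bump_image_tag` is a hypothetical helper for illustration, not a function in this repository:

```python
def bump_image_tag(image: str, new_tag: str) -> str:
    """Return `image` with its tag (the text after the last ':') replaced.

    Hypothetical helper for illustration only. It assumes the reference
    carries a tag and no digest (no '@sha256:...').
    """
    repo, sep, old_tag = image.rpartition(":")
    # A '/' after the last ':' means the colon belonged to a host:port, not a tag.
    if not sep or "/" in old_tag:
        raise ValueError(f"no tag found in image reference: {image!r}")
    return f"{repo}:{new_tag}"


# Image references as listed in the PR summary.
OLD_IMAGES = {
    "dsr1-fp4-b200-trt": "nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post2",
    "dsr1-fp4-b200-trt-mtp": "nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post3",
}

NEW_IMAGES = {
    name: bump_image_tag(image, "1.3.0rc14") for name, image in OLD_IMAGES.items()
}
```

Splitting on the last colon rather than the first leaves the registry and repository path untouched, including the `#` separator used in these references.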

Test plan

full-sweep-enabled sweep passes.

🤖 Generated with Claude Code

Update TensorRT-LLM image (off: v1.2.0rc6.post2 104d / mtp: v1.2.0rc6.post3 101d) to v1.3.0rc14 (latest pre-release)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-18T06:16:36Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single-node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-05-18T06:17:02Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26016878043
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26016878043

claude

LGTM — mechanical image version bump with matching changelog entry.

Extended reasoning...

Overview

This PR bumps the TensorRT-LLM image for two recipes (dsr1-fp4-b200-trt and dsr1-fp4-b200-trt-mtp) from v1.2.0rc6.post2/post3 to v1.3.0rc14 in .github/configs/nvidia-master.yaml, plus a corresponding perf-changelog.yaml entry referencing this PR. Total changeset is 4 line edits.

Security risks

None. These are container image tags in a benchmark configuration file pointing to the official NVIDIA NGC registry (nvcr.io#nvidia/tensorrt-llm/release). No auth, crypto, permission, or user-input handling is touched.

Level of scrutiny

Low. This is a routine [Klaud Cold] image version bump — the same pattern as the immediately preceding merged commits (e.g., #1426, #1395, #1448). The full-sweep-enabled label indicates the performance sweep will validate the new image end-to-end, which is the meaningful gate here, not code review.

Other factors

The bug hunting system reported no findings. Both image references are updated consistently to the same target tag, and the changelog entry accurately describes the change and links back to this PR.
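
The consistency point above (both recipes ending up on the same target tag) can be checked mechanically. A minimal sketch, assuming the image strings from the PR summary; `images_share_tag` is a hypothetical name and not part of the repository's tooling:

```python
def images_share_tag(images: dict[str, str]) -> bool:
    """Return True when every image reference ends in the same tag.

    Hypothetical check for illustration; the tag is taken to be the
    text after the last ':' in each reference.
    """
    tags = {image.rpartition(":")[2] for image in images.values()}
    return len(tags) == 1


# References before and after this PR, as listed in the summary.
before = {
    "dsr1-fp4-b200-trt": "nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post2",
    "dsr1-fp4-b200-trt-mtp": "nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post3",
}
after = {
    "dsr1-fp4-b200-trt": "nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14",
    "dsr1-fp4-b200-trt-mtp": "nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14",
}

before_consistent = images_share_tag(before)
after_consistent = images_share_tag(after)
```

Here the two recipes were on different post-releases before the change and converge on one tag afterwards.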

Update dsr1-fp4-b200-trt (+mtp) TRT-LLM image to v1.3.0rc14

0064b80

functionstackx requested a review from a team May 18, 2026 06:16

functionstackx added the full-sweep-enabled label May 18, 2026

functionstackx requested review from jgangani and kedarpotdar-nv as code owners May 18, 2026 06:16

github-project-automation Bot added this to InferenceMAX Board May 18, 2026

chore: fill pr-link for #1489

8339f64

claude Bot reviewed May 18, 2026

functionstackx wants to merge 2 commits into main from update-dsr1-fp4-b200-trt-v1.3.0rc14