[Cherry-Pick][Benchmark] Add inner benchmark metrics component (#7881) #7831

Merged
Jiang-Jia-Jun merged 5 commits into PaddlePaddle:release/2.6 from Deleter-D:2.6_inner_benchmark
May 22, 2026

Conversation

@Deleter-D (Collaborator) commented May 15, 2026

Motivation

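Add an in-process performance monitoring module (Benchmark Metrics Logger) to FastDeploy. It collects per-request latency data inside the inference process in real time, computes aggregate statistics (TTFT, TPOT, E2EL, throughput, and so on) over a sliding or tumbling window, and writes them to a JSONL file for live monitoring and post-hoc analysis, with no dependency on external benchmark tools.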
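As a rough illustration of the latency metrics named above, the sketch below derives TTFT (time to first token), E2EL (end-to-end latency), TPOT (time per output token), and ITL (inter-token latency) from per-token timestamps. It uses the definitions common in LLM serving benchmarks; the PR's exact formulas are not shown in this conversation, and `RequestTimeline` is a hypothetical stand-in, not the PR's `CompletedRequestRecord`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestTimeline:
    """Hypothetical per-request timestamps in seconds (not the PR's record type)."""

    arrival: float            # when the request arrived
    token_times: List[float]  # when each output token was produced


def ttft(r: RequestTimeline) -> float:
    """Time to first token: first token time minus arrival time."""
    return r.token_times[0] - r.arrival


def e2el(r: RequestTimeline) -> float:
    """End-to-end latency: last token time minus arrival time."""
    return r.token_times[-1] - r.arrival


def tpot(r: RequestTimeline) -> Optional[float]:
    """Mean time per output token after the first; None for single-token outputs."""
    n = len(r.token_times)
    if n < 2:
        return None
    return (r.token_times[-1] - r.token_times[0]) / (n - 1)


def itl(r: RequestTimeline) -> List[float]:
    """Inter-token latencies: gaps between consecutive output tokens."""
    return [b - a for a, b in zip(r.token_times, r.token_times[1:])]
```

Under these assumed definitions, TPOT is the mean of the ITL samples for a single request, which is why a single-token response has no TPOT.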

Modifications

  • fastdeploy/config.py: add the BenchmarkMetricsConfig class; add a benchmark_metrics_config field to FDConfig and extend FDConfig.check() with parameter validation
  • fastdeploy/engine/args_utils.py: add the --benchmark-metrics-config CLI argument and the create_benchmark_metrics_config() method
  • fastdeploy/engine/common_engine.py: at engine initialization, instantiate BenchmarkMetricsLogger according to the config and inject it into token_processor
  • fastdeploy/metrics/benchmark_metrics_logger.py: add BenchmarkMetricsLogger, built on a background daemon thread plus a deque window, supporting sliding and tumbling modes
  • fastdeploy/output/token_processor.py: collect ITL samples in _record_metrics; assemble a CompletedRequestRecord in _record_completion_metrics and call back into the logger
  • docs/benchmark.md / docs/zh/benchmark.md: add English and Chinese documentation for the in-process monitoring module
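To make the difference between the two window modes concrete, here is a minimal sketch built on `collections.deque`. It is not the PR's `BenchmarkMetricsLogger`: the class name `WindowedMean` is hypothetical, the background thread is omitted, and the tumbling behaviour (emit once the window is full, then clear) is an assumption about how such a mode is usually defined.

```python
from collections import deque
from statistics import mean
from typing import Deque, Optional


class WindowedMean:
    """Hypothetical helper: mean over a sliding or tumbling window of samples."""

    def __init__(self, window_size: int, mode: str = "sliding") -> None:
        if mode not in ("sliding", "tumbling"):
            raise ValueError(f"mode must be 'sliding' or 'tumbling', got {mode!r}")
        if window_size <= 0:
            # Assumption of this sketch: only positive sizes are meaningful here.
            raise ValueError(f"window_size must be positive, got {window_size!r}")
        self.window_size = window_size
        self.mode = mode
        # A bounded deque evicts the oldest sample automatically (sliding mode).
        maxlen = window_size if mode == "sliding" else None
        self._window: Deque[float] = deque(maxlen=maxlen)

    def add(self, value: float) -> Optional[float]:
        """Record a sample; return a statistic when one is available."""
        self._window.append(value)
        if self.mode == "sliding":
            # Every new sample yields a statistic over the latest samples.
            return mean(self._window)
        if len(self._window) == self.window_size:
            # Tumbling: emit once per full window, then start a fresh one.
            result = mean(self._window)
            self._window.clear()
            return result
        return None
```

With `window_size=2`, the sliding variant reports after every sample, while the tumbling variant reports only after every second sample and never reuses a sample across windows.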

Usage or Command

python -m fastdeploy.entrypoints.openai.api_server \
       --model baidu/ERNIE-4.5-0.3B-Base-Paddle \
       --benchmark-metrics-config '{"enable": true, "window_size": 64, "window_mode": "sliding"}'

View the latest statistics snapshot: tail -1 log/benchmark_metrics.jsonl | python -m json.tool
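The same snapshot can be read from Python when the result feeds a script rather than a terminal. This is a minimal sketch: `latest_snapshot` is a hypothetical helper, and the only thing it assumes about the file is that each non-empty line is one JSON object.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Union


def latest_snapshot(path: Union[str, Path]) -> Optional[Dict[str, Any]]:
    """Return the last non-empty JSONL record in the file, or None if there is none."""
    last_line = None
    with Path(path).open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                last_line = line
    return json.loads(last_line) if last_line is not None else None
```

For example, `latest_snapshot("log/benchmark_metrics.jsonl")` would return the most recent record as a dict. The function scans the whole file, which is simple but may be slow on very large logs.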

Accuracy Tests

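N/A (this PR adds a performance monitoring feature; it does not change the model's forward inference logic and does not affect output accuracy)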

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

PaddlePaddle-bot

This comment was marked as outdated.


PaddlePaddle-bot commented May 15, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-21 06:21:21

CI report generated from the following code (updated every 30 minutes):


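1 Task overview

Not all Required CI tasks for this PR have passed: of 10 Required tasks, 9 passed and 1 failed. The blocker is Approval, which needs manual approval. In addition, 3 Optional tasks failed and 1 Optional task is waiting; these are for reference only.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Waiting | Skipped |
| --- | --- | --- | --- | --- | --- | --- |
| 37 (0) | 37 | 32 | 4 | 0 | 1 | 0 |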

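2 Task status summary

Log column: failed tasks link directly to the log; running tasks link to the Job.

2.1 Required tasks: 9/10 passed

Required tasks block merging; failures should be handled first.

| Status | Task | Duration | Root cause | Suggested fix | Log | Rerun |
| --- | --- | --- | --- | --- | --- | --- |
| ❌ | Approval | 8s | Approval required | Obtain manual approval | Job | - |
| ✅ | Remaining 9 required tasks passed | - | - | - | - | - |

2.2 Optional tasks: 23/27 passed

Optional tasks do not block merging; failures are for reference only.

| Status | Task | Duration | Log | Rerun |
| --- | --- | --- | --- | --- |
| ❌ | Check PR Template | 17s | Job | - |
| ❌ | CI_HPU | 1h3m | Job | - |
| ❌ | Trigger Jenkins for PR | 1m5s | Job | - |
| ⏸️ | Run iluvatar Tests / run_iluvatar_cases | - | - | - |
| ✅ | Remaining 23 optional tasks passed | - | - | - |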

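3 Failure details (required only)

Approval: manual approval required (confidence: high)

This Job requires manual Approval; CI continues only after the approval is granted.

  • Root cause summary: the Required Approval workflow has not passed; its current state indicates that manual approval is needed.
  • Suggested fix summary: a maintainer with permission should approve it on the GitHub Actions/Checks page, then wait for the remaining CI to run.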


paddle-bot Bot commented May 15, 2026

Thanks for your contribution!


codecov-commenter commented May 15, 2026

Codecov Report

❌ Patch coverage is 88.72549% with 23 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release/2.6@8c4f5a6). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| fastdeploy/metrics/benchmark_metrics_logger.py | 93.15% | 4 Missing and 6 partials ⚠️ |
| fastdeploy/output/token_processor.py | 72.22% | 3 Missing and 2 partials ⚠️ |
| fastdeploy/engine/common_engine.py | 0.00% | 3 Missing and 1 partial ⚠️ |
| fastdeploy/engine/args_utils.py | 66.66% | 2 Missing and 1 partial ⚠️ |
| fastdeploy/config.py | 96.29% | 0 Missing and 1 partial ⚠️ |
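The per-file figures can be cross-checked against the headline number with a few lines of arithmetic. The sketch below sums the Missing and partial counts from the table and, assuming patch coverage is computed as covered lines over total patch lines with partials counted as uncovered, backs out the approximate size of the patch. The inferred total is a derived estimate, not a number stated in the report.

```python
# (missing, partials) per file, copied from the Codecov table above.
uncovered = {
    "fastdeploy/metrics/benchmark_metrics_logger.py": (4, 6),
    "fastdeploy/output/token_processor.py": (3, 2),
    "fastdeploy/engine/common_engine.py": (3, 1),
    "fastdeploy/engine/args_utils.py": (2, 1),
    "fastdeploy/config.py": (0, 1),
}

# Total lines lacking full coverage; should match the "23 lines" in the headline.
uncovered_total = sum(missing + partial for missing, partial in uncovered.values())

# Headline patch coverage, in percent.
patch_coverage_pct = 88.72549

# Assumption: coverage = 1 - uncovered / total, so total = uncovered / (1 - coverage).
inferred_patch_lines = round(uncovered_total / (1 - patch_coverage_pct / 100))

print(uncovered_total, inferred_patch_lines)
```

The sum agrees with the 23 lines quoted in the headline, which supports the reading that partially covered lines count as uncovered in the patch percentage.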
Additional details and impacted files
@@              Coverage Diff               @@
##             release/2.6    #7831   +/-   ##
==============================================
  Coverage               ?   72.26%           
==============================================
  Files                  ?      382           
  Lines                  ?    54419           
  Branches               ?     8513           
==============================================
  Hits                   ?    39328           
  Misses                 ?    12309           
  Partials               ?     2782           
| Flag | Coverage Δ |
| --- | --- |
| GPU | 72.26% <88.72%> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.



@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 Paddle-CI-Agent | pr_review | 2026-05-20 19:39:57

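📋 Review summary

PR overview: adds an in-process Benchmark Metrics Logger that collects per-request latency metrics on a background thread, aggregates them over a sliding or tumbling window, and writes them to a JSONL file
Scope of changes: fastdeploy/config.py, fastdeploy/metrics/, fastdeploy/output/token_processor.py, fastdeploy/engine/, docs/
Affected tags: [FDConfig] [Engine] [Benchmark] [DataProcessor] [Docs]

Issues

| Severity | File | Summary |
| --- | --- | --- |
| 🔴 Bug | fastdeploy/config.py:2461 | assert is used for runtime config validation and is silently skipped under python -O |
| 🟡 Suggestion | fastdeploy/metrics/benchmark_metrics_logger.py:116 | datetime.now() carries no time zone, so timestamps are ambiguous in distributed or cross-time-zone environments |
| 📝 PR convention | - | The title lacks an official tag; the target branch is release/2.6, so the Cherry-Pick format is required; the description sections are empty |

📝 PR convention check

The PR title Add inner benchmark metrics component lacks an official tag. The target branch is release/2.6 (not develop), so the convention requires the Cherry-Pick format with the original PR number appended. The Motivation / Modifications / Usage or Command / Accuracy Tests sections of the description are all empty, and the Checklist is unchecked.

Suggested title (can be copied as-is):

  • [Cherry-Pick][Benchmark] Add inner benchmark metrics component (#<develop_PR_number>)

Suggested PR description (can be copied as-is; it must reproduce the full structure of the checklist §D2 template):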

## Motivation
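Add an in-process performance monitoring module (Benchmark Metrics Logger) to FastDeploy. It collects per-request latency data inside the inference process in real time, computes aggregate statistics (TTFT, TPOT, E2EL, throughput, and so on) over a sliding or tumbling window, and writes them to a JSONL file for live monitoring and post-hoc analysis, with no dependency on external benchmark tools.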

## Modifications
- `fastdeploy/config.py`: add the `BenchmarkMetricsConfig` class; add a `benchmark_metrics_config` field to `FDConfig` and extend `FDConfig.check()` with parameter validation
- `fastdeploy/engine/args_utils.py`: add the `--benchmark-metrics-config` CLI argument and the `create_benchmark_metrics_config()` method
- `fastdeploy/engine/common_engine.py`: at engine initialization, instantiate `BenchmarkMetricsLogger` according to the config and inject it into `token_processor`
- `fastdeploy/metrics/benchmark_metrics_logger.py`: add `BenchmarkMetricsLogger`, built on a background daemon thread plus a deque window, supporting sliding and tumbling modes
- `fastdeploy/output/token_processor.py`: collect ITL samples in `_record_metrics`; assemble a `CompletedRequestRecord` in `_record_completion_metrics` and call back into the logger
- `docs/benchmark.md` / `docs/zh/benchmark.md`: add English and Chinese documentation for the in-process monitoring module

## Usage or Command
python -m fastdeploy.entrypoints.openai.api_server \
       --model baidu/ERNIE-4.5-0.3B-Base-Paddle \
       --benchmark-metrics-config '{"enable": true, "window_size": 64, "window_mode": "sliding"}'

View the latest statistics snapshot: tail -1 log/benchmark_metrics.jsonl | python -m json.tool

## Accuracy Tests
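N/A (this PR adds a performance monitoring feature; it does not change the model's forward inference logic and does not affect output accuracy)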

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [x] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

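Overall assessment

The feature is complete, the logic is clear, and unit-test coverage is adequate. The assert-based checks in FDConfig.check() are disabled under python -O and need to be fixed, and the PR title format and description should be completed.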

Comment thread fastdeploy/config.py

if self.benchmark_metrics_config is not None:
    cfg = self.benchmark_metrics_config
    assert isinstance(

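🔴 Bug: assert is used for runtime configuration validation. Under python -O (optimized mode) all assert statements are silently dropped, so invalid configuration values bypass validation and reach the inference path.

Suggest replacing every assert here with an explicit raise ValueError: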

if not isinstance(cfg.enable, bool):
    raise ValueError(f"BenchmarkMetricsConfig: 'enable' must be a bool, got {type(cfg.enable).__name__}")
if not (isinstance(cfg.window_size, int) and cfg.window_size >= 0):
    raise ValueError(f"BenchmarkMetricsConfig: 'window_size' must be a non-negative integer, got {cfg.window_size!r}")
if cfg.window_mode not in ("sliding", "tumbling"):
    raise ValueError(f"BenchmarkMetricsConfig: 'window_mode' must be 'sliding' or 'tumbling', got {cfg.window_mode!r}")
# ... the remaining checks follow the same pattern
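A self-contained version of the reviewer's suggestion might look like the sketch below. `BenchmarkMetricsConfigSketch` and `validate_benchmark_metrics_config` are hypothetical names; only the three fields shown in the PR's usage example are modelled, and the real `BenchmarkMetricsConfig` may carry more. Because the checks use `raise` rather than `assert`, they still run when Python is started with `-O`.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkMetricsConfigSketch:
    """Hypothetical stand-in for the PR's BenchmarkMetricsConfig (three fields only)."""

    enable: bool
    window_size: int
    window_mode: str


def validate_benchmark_metrics_config(cfg: BenchmarkMetricsConfigSketch) -> None:
    """Raise ValueError on invalid values; unlike assert, this survives python -O."""
    if not isinstance(cfg.enable, bool):
        raise ValueError(
            f"BenchmarkMetricsConfig: 'enable' must be a bool, got {type(cfg.enable).__name__}"
        )
    if not (isinstance(cfg.window_size, int) and cfg.window_size >= 0):
        raise ValueError(
            f"BenchmarkMetricsConfig: 'window_size' must be a non-negative integer, "
            f"got {cfg.window_size!r}"
        )
    if cfg.window_mode not in ("sliding", "tumbling"):
        raise ValueError(
            f"BenchmarkMetricsConfig: 'window_mode' must be 'sliding' or 'tumbling', "
            f"got {cfg.window_mode!r}"
        )
```

One caveat worth knowing: `bool` is a subclass of `int` in Python, so `window_size=True` would pass the integer check above; whether that should be rejected is a design choice this sketch leaves open.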

records = list(self._window)
n = len(records)
if n == 0:
    return {"timestamp": datetime.now().isoformat(), "completed": 0}

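🟡 Suggestion: datetime.now() outputs local time, so in multi-node or cross-time-zone distributed environments the meaning of the timestamp is unclear. Suggest switching to UTC:

from datetime import datetime, timezone
# Replace both occurrences of datetime.now().isoformat() with:
datetime.now(timezone.utc).isoformat()

Both places need the change: the early return for n == 0 (line 116) and the construction of the result dict (line 164).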
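The difference between the two calls is visible in the serialized string. The short sketch below compares a naive local timestamp with a time-zone-aware UTC one; `utc_timestamp` is a hypothetical helper name used only for illustration.

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Return the current time as an ISO 8601 string with an explicit UTC offset."""
    return datetime.now(timezone.utc).isoformat()


naive = datetime.now()             # local wall-clock time, tzinfo is None
aware = datetime.now(timezone.utc)  # same instant, tagged as UTC

# The naive string carries no offset, so a reader cannot tell which zone it means.
print(naive.isoformat())
# The aware string ends with "+00:00" and can be compared across machines.
print(aware.isoformat())
```

Strings produced this way can be parsed back with `datetime.fromisoformat`, which preserves the UTC offset.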


PaddlePaddle-bot commented May 20, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-21 12:05:12

CI report generated from the following code (updated every 30 minutes):


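1 Task overview

Of 10 Required tasks, 9 passed and 1 failed. The failed task is Approval; CI can meet the merge requirements only after manual approval. In addition, 4 Optional tasks failed; these are for reference only and do not affect the Required merge decision.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Waiting | Skipped |
| --- | --- | --- | --- | --- | --- | --- |
| 37 (0) | 37 | 32 | 5 | 0 | 0 | 0 |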

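2 Task status summary

Log column: failed tasks link directly to the pre-generated log; running tasks link to the Job.

2.1 Required tasks: 9/10 passed

Required tasks block merging; failures should be handled first.

| Status | Task | Duration | Root cause | Suggested fix | Log | Rerun |
| --- | --- | --- | --- | --- | --- | --- |
| ❌ | Approval | 8s | Approval required | Obtain manual approval | Job | - |
| ✅ | Remaining 9 required tasks passed | - | - | - | - | - |

2.2 Optional tasks: 23/27 passed

Optional tasks do not block merging; failures are for reference only.

| Status | Task | Duration | Log | Rerun |
| --- | --- | --- | --- | --- |
| ❌ | Run iluvatar Tests / run_iluvatar_cases | 15m22s | Job | - |
| ❌ | Check PR Template | 17s | Job | - |
| ❌ | CI_HPU | 1h3m | Job | - |
| ❌ | Trigger Jenkins for PR | 1m5s | Job | - |
| ✅ | Remaining 23 optional tasks passed | - | - | - |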

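3 Failure details (required only)

Approval: approval required (confidence: high)

This Job requires manual Approval; CI continues only after the approval is granted.

  • Root cause summary: the workflow is waiting for manual Approval.
  • Suggested fix summary: a maintainer with permission should grant the manual approval.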

@Deleter-D Deleter-D changed the title Add inner benchmark metrics component [Cherry-Pick][Benchmark] Add inner benchmark metrics component (#7881) May 21, 2026
@Jiang-Jia-Jun Jiang-Jia-Jun merged commit e7815be into PaddlePaddle:release/2.6 May 22, 2026
33 of 38 checks passed
5 participants