[Cherry-Pick] [FDConfig] Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility (#7509) #7517
Thanks for your contribution!
PaddlePaddle-bot left a comment
🤖 AI Code Review | 2026-04-21 12:16:15
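📋 Review Summary
PR overview: Maps num_experts_per_tok in MoE model configs onto moe_k, so that models that use moe_k (such as ERNIE 4.5 MoE) stay compatible with model configs that use num_experts_per_tok (such as Qwen3VLMOE and DeepSeek V3).
Scope of change: fastdeploy/config.py, in the ModelConfig.override_name_from_config() method
Impact tag: [FDConfig]
Issues
No blocking issues found.
Overall assessment
The change is concise and safe, and it follows the existing field-mapping pattern (num_experts → moe_num_experts, n_routed_experts → moe_num_experts, and so on). The double hasattr check preserves backward compatibility: the mapping is applied only when config.json contains num_experts_per_tok and lacks moe_k, so the loading behavior of existing models is unaffected.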
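The double hasattr check described in the review can be sketched roughly as follows. This is a minimal illustration, not the code from fastdeploy/config.py: the helper name `unify_moe_k` and the use of `SimpleNamespace` as a stand-in for the loaded config are assumptions made for the example.

```python
from types import SimpleNamespace


def unify_moe_k(config):
    """Illustrative helper (hypothetical name): alias num_experts_per_tok to moe_k."""
    # Map only when the source field exists and the target is absent,
    # so a config that already defines moe_k is left untouched.
    if hasattr(config, "num_experts_per_tok") and not hasattr(config, "moe_k"):
        config.moe_k = config.num_experts_per_tok
    return config


# Config that only defines num_experts_per_tok: gains moe_k.
hf_style = unify_moe_k(SimpleNamespace(num_experts_per_tok=8))

# Config that already defines moe_k: unchanged.
native = unify_moe_k(SimpleNamespace(moe_k=6))

# Config that defines both: the existing moe_k wins.
both = unify_moe_k(SimpleNamespace(num_experts_per_tok=8, moe_k=2))
```

Because the mapping never overwrites an existing moe_k, downstream code can read moe_k unconditionally without changing how previously supported models load.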
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## release/2.6 #7517 +/- ##
==============================================
Coverage ? 73.57%
==============================================
Files ? 376
Lines ? 53000
Branches ? 8280
==============================================
Hits ? 38995
Misses ? 11260
Partials ? 2745
95261f0 into PaddlePaddle:release/2.6
Motivation
Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility. This enables R3 support for models like Qwen3VLMOE, DeepSeek V3, and other MoE models that use num_experts_per_tok instead of moe_k in their config.
Modifications
Usage or Command
N/A
Accuracy Tests
N/A
Checklist
Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
Run pre-commit before commit.
For a PR targeting the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.