fix(api): forward extra_body.chat_template_kwargs on /v1/messages by sufubao · Pull Request #1276 · ModelTC/LightLLM

sufubao · 2026-04-18T06:41:35Z

Summary

The /v1/messages request translator strips extra_body before building the ChatCompletionRequest, so clients cannot reach LightLLM fields that Anthropic's schema does not expose — most notably chat_template_kwargs. On engines where thinking defaults to on (Qwen3, DeepSeek), this means Anthropic-protocol callers have no way to disable thinking; with short max_tokens the response degenerates to content: [] because every output token went into the <think> block.
Merge any dict-valued extra_body into the translated openai_dict before the _UNKNOWN_FIELDS cleanup, with setdefault so fields produced by the Anthropic→OpenAI translation keep precedence. Unknown keys fall off naturally in ChatCompletionRequest(**chat_dict) since the Pydantic default is extra='ignore'.
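The merge step described above can be sketched as follows. This is a minimal illustration and not the code from the PR: the function name merge_extra_body and the contents of _UNKNOWN_FIELDS are assumptions made for the example, since the description names the cleanup list but does not show it.

```python
from typing import Any, Dict

# Hypothetical stand-in for the translator's cleanup list; the real
# _UNKNOWN_FIELDS contents are not shown in the PR description.
_UNKNOWN_FIELDS = ("extra_body", "metadata")


def merge_extra_body(anthropic_dict: Dict[str, Any], openai_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Copy dict-valued extra_body entries into openai_dict without overriding it."""
    extra_body = anthropic_dict.get("extra_body")
    # A non-dict extra_body (string, list, None) is ignored entirely.
    if isinstance(extra_body, dict):
        for key, value in extra_body.items():
            # setdefault keeps any value the translation already produced.
            openai_dict.setdefault(key, value)
    # Cleanup runs after the merge, so extra_body itself never reaches the model.
    for field in _UNKNOWN_FIELDS:
        openai_dict.pop(field, None)
    return openai_dict
```

Because setdefault only writes missing keys, a max_tokens inside extra_body cannot replace the max_tokens that came from the top level of the Anthropic request.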

Usage example for clients:

{
  "model": "...",
  "max_tokens": 512,
  "messages": [...],
  "extra_body": { "chat_template_kwargs": { "enable_thinking": false } }
}
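For callers who build the request by hand, the same body could be assembled and posted with the Python standard library. This is a sketch under assumptions: the helper names, the base URL passed by the caller, and the absence of authentication headers are all illustrative, and a real deployment may require additional headers.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_messages_payload(
    model: str,
    messages: List[Dict[str, Any]],
    max_tokens: int,
    enable_thinking: bool,
) -> Dict[str, Any]:
    """Build a /v1/messages body that carries chat_template_kwargs in extra_body."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": messages,
        "extra_body": {"chat_template_kwargs": {"enable_thinking": enable_thinking}},
    }


def post_messages(base_url: str, payload: Dict[str, Any], timeout: float = 60.0) -> Dict[str, Any]:
    """POST the payload to <base_url>/v1/messages and return the decoded JSON reply.

    base_url is supplied by the caller (assumption: no auth headers are needed).
    """
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```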

Test plan

pytest test/test_api/test_anthropic_extra_body.py — 5 new unit tests covering: single field, multi-field, top-level field beats extra_body duplicate, missing extra_body is no-op, non-dict extra_body is ignored.
black + flake8 pre-commit hooks pass.
Manual: send an Anthropic request with extra_body.chat_template_kwargs.enable_thinking=false to a Qwen3 deployment and verify content is non-empty.
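The manual check in the last item could be scripted with a small helper like the one below. It is a sketch only: has_text_content is a hypothetical name, and it assumes the reply follows the Anthropic Messages shape, where content is a list of blocks and text blocks carry type "text".

```python
from typing import Any, Dict


def has_text_content(response: Dict[str, Any]) -> bool:
    """Return True if the reply contains at least one non-empty text block."""
    blocks = response.get("content") or []
    return any(
        isinstance(block, dict)
        and block.get("type") == "text"
        and bool((block.get("text") or "").strip())
        for block in blocks
    )
```

A reply with content: [] (the failure mode described in the summary) makes this return False.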

The Anthropic Messages request translator in _anthropic_to_chat_request unconditionally strips extra_body, so clients had no way to pass LightLLM-specific ChatCompletionRequest options through the /v1/messages endpoint. Models that default thinking on (Qwen3, DeepSeek) could not be told to skip thinking, and with short max_tokens this yielded empty content: [] responses because all output tokens went to the <think> block. Merge any dict-valued extra_body into the translated openai_dict before the _UNKNOWN_FIELDS cleanup, with setdefault so fields produced by the Anthropic->OpenAI translation keep precedence. Unknown keys are dropped by Pydantic (extra='ignore' default), so this does not risk validation errors for arbitrary SDK pass-through. A unit test locks in the five shapes: single field, multiple fields, top-level override wins, missing extra_body is a no-op, and non-dict extra_body is ignored.

gemini-code-assist

Code Review

This pull request introduces support for the extra_body field in the Anthropic-to-OpenAI request translation, enabling the forwarding of LightLLM-specific parameters such as chat_template_kwargs. The logic correctly prioritizes existing translated fields and ignores non-dictionary extra_body inputs. Comprehensive unit tests have been included to validate field forwarding, precedence rules, and edge cases. I have no feedback to provide.

gemini-code-assist bot reviewed Apr 18, 2026

sufubao wants to merge 1 commit into ModelTC:main from sufubao:fix_anthropic_extra_body
