[Rust] Enforce 10 MiB client-side payload limit on ingest calls#331

Open
elenagaljak-db wants to merge 2 commits into main from add_max_message_limit

@elenagaljak-db

Contributor

What changes are proposed in this pull request?

Enforce the Zerobus server's 10 MiB payload limit client-side on ingest_record_offset and ingest_records_offset. Without this, an oversized payload travels through the landing zone and gRPC layer before being rejected by the server. With this change, the SDK returns ZerobusError::InvalidArgument immediately, before any network I/O.

The limit applies to the total encoded byte size of a single call (sum of all record bytes in the batch). It does not apply to Arrow Flight (ingest_batch / ingest_ipc_batch), which has a separate code path.
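
The fail-fast check described above could look roughly like the sketch below. It is not the SDK's actual code: `check_payload_size` is a hypothetical helper, `ZerobusError` is reduced to the one variant named in this PR, the batch is modelled as a plain slice of byte vectors, and treating a payload of exactly 10 MiB as accepted is an assumption.

```rust
/// Assumed to mirror the server-side limit described above (10 MiB).
const MAX_INGEST_PAYLOAD_BYTES: usize = 10 * 1024 * 1024;

/// Hypothetical stand-in for the SDK's error type; only the variant
/// named in this PR is modelled.
#[derive(Debug, PartialEq)]
enum ZerobusError {
    InvalidArgument(String),
}

/// Rejects a call whose records add up to more than the limit.
/// `records` is a simplified stand-in for the SDK's encoded batch.
/// Assumption: a payload of exactly the limit is still accepted.
fn check_payload_size(records: &[Vec<u8>]) -> Result<(), ZerobusError> {
    // Sum the encoded length of every record in this single call.
    let total: usize = records.iter().map(|r| r.len()).sum();
    if total > MAX_INGEST_PAYLOAD_BYTES {
        return Err(ZerobusError::InvalidArgument(format!(
            "payload is {} bytes, which exceeds the {}-byte limit",
            total, MAX_INGEST_PAYLOAD_BYTES
        )));
    }
    Ok(())
}
```

Because the check runs before anything is queued or sent, the caller sees the error synchronously and can split the batch and retry.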

Perf impact: measured with Criterion, the check costs ~3 ns for a single record and ~20 ns for a 100-record batch (it only reads .len() from a SmallVec, with no allocation). That is well within the noise of the mutex + gRPC overhead.

Changes

  • sdk/src/lib.rs — module-level MAX_INGEST_PAYLOAD_BYTES constant (10 MiB) with a note that it mirrors the server limit; check at the top of ingest_internal_v2
  • sdk/src/record_types.rs — EncodedBatch::total_byte_size() (pub) + unit tests
  • tests/src/rust_tests.rs — integration tests for single-record and batch rejection
  • README.md — new "Payload Size Limit" subsection under Error Handling
  • NEXT_CHANGELOG.md — entry under New Features and Improvements
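
For readers who have not opened the diff, a helper like `EncodedBatch::total_byte_size()` can be sketched as follows. The struct layout here is hypothetical: the description mentions a SmallVec, while this sketch uses a plain `Vec<Vec<u8>>` so it compiles without extra crates, and the real type in `record_types.rs` may differ.

```rust
/// Hypothetical, simplified shape of an encoded batch. The real
/// `EncodedBatch` reportedly stores records in a SmallVec; a plain
/// Vec is used here to avoid external dependencies.
struct EncodedBatch {
    records: Vec<Vec<u8>>,
}

impl EncodedBatch {
    /// Total encoded size of the batch in bytes.
    /// Only reads each record's length, so it does not allocate.
    pub fn total_byte_size(&self) -> usize {
        self.records.iter().map(Vec::len).sum()
    }
}
```

The cost grows linearly with the number of records, which is in line with the single-record versus 100-record timings quoted in the description.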

How is this tested?

cargo test --workspace

Signed-off-by: elenagaljak-db <elena.galjak@databricks.com>

@teodordelibasic-db teodordelibasic-db left a comment

Contributor

LGTM, I just think this should be configurable.

Comment thread rust/sdk/src/lib.rs

/// Maximum encoded byte size allowed per `ingest_record_offset` / `ingest_records_offset` call.
/// Matches the server limit so oversize payloads fail fast client-side.
const MAX_INGEST_PAYLOAD_BYTES: usize = 10 * 1024 * 1024; // 10 MiB

Contributor

I think we should expose this as configurable for the client, for forward compatibility. For example, if in the future we allow messages larger than 10 MiB on the server and someone is using the next version of the SDK containing this change, they would have to wait for a new SDK version that also bumps the message size limit. Perhaps that version contains breaking changes and they don't want to upgrade. If we expose the limit as configurable and default it to the server-side limit, then raising it above the server limit is on them.
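
One possible shape for that, as a sketch only: the option struct, its field, and the builder method below are hypothetical names, not existing SDK API. The default stays at the server-side limit and the caller can override it.

```rust
/// Default mirrors the current server-side limit (10 MiB).
const DEFAULT_MAX_INGEST_PAYLOAD_BYTES: usize = 10 * 1024 * 1024;

/// Hypothetical options struct; the SDK's real configuration type
/// and field names may differ.
#[derive(Debug, Clone)]
struct StreamOptions {
    max_ingest_payload_bytes: usize,
}

impl Default for StreamOptions {
    fn default() -> Self {
        Self {
            max_ingest_payload_bytes: DEFAULT_MAX_INGEST_PAYLOAD_BYTES,
        }
    }
}

impl StreamOptions {
    /// Overrides the client-side limit. Raising it above what the
    /// server accepts only moves the rejection back to the server.
    fn with_max_ingest_payload_bytes(mut self, limit: usize) -> Self {
        self.max_ingest_payload_bytes = limit;
        self
    }

    /// True when a call of `total_bytes` passes the client-side check.
    fn allows(&self, total_bytes: usize) -> bool {
        total_bytes <= self.max_ingest_payload_bytes
    }
}
```

With something like this, an older SDK could keep working after a server-side increase by calling `StreamOptions::default().with_max_ingest_payload_bytes(...)` instead of waiting for a release.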

Development

Successfully merging this pull request may close these issues.

[All] Introduce gRPC message size limit

2 participants