feat: add tokenizer module with sync and async support #2012
amourao commented on Apr 14, 2026
- Global tokenizer and per-property
- Tests
Orca Security Scan Summary
| Status | Check | Issues by priority | |
|---|---|---|---|
| Secrets | View in Orca |
Pull request overview
Adds a new tokenize client module (sync + async) to call Weaviate’s tokenization endpoints and returns a typed TokenizeResult, with integration tests covering serialization/deserialization and both client variants.
Changes:
- Introduce weaviate.tokenize module with shared executor logic plus sync/async wrappers.
- Add TokenizeResult return type and response parsing for analyzer/stopword configs.
- Wire client.tokenize into WeaviateClient/WeaviateAsyncClient and add integration tests.
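The "shared logic plus sync/async wrappers" layout described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the field names on `TokenizeResult`, the helpers `build_request`, `parse_response`, `fake_send`, and `fake_send_async`, and the `text()` method are all hypothetical, and the transport is a stub rather than a real Weaviate connection.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class TokenizeResult:
    # Hypothetical fields; the real dataclass in the PR may differ.
    tokenization: str
    tokens: List[str]


def build_request(text: str, tokenization: str) -> dict:
    # Shared by both wrappers so the request shape is defined once.
    return {"text": text, "tokenization": tokenization}


def parse_response(payload: dict) -> TokenizeResult:
    # Shared by both wrappers so sync and async return the same type.
    return TokenizeResult(
        tokenization=payload["tokenization"],
        tokens=list(payload["tokens"]),
    )


class Tokenize:
    """Sync wrapper: calls a blocking transport."""

    def __init__(self, send: Callable[[dict], dict]) -> None:
        self._send = send

    def text(self, text: str, tokenization: str = "word") -> TokenizeResult:
        return parse_response(self._send(build_request(text, tokenization)))


class TokenizeAsync:
    """Async wrapper: awaits a coroutine transport."""

    def __init__(self, send: Callable[[dict], Awaitable[dict]]) -> None:
        self._send = send

    async def text(self, text: str, tokenization: str = "word") -> TokenizeResult:
        return parse_response(await self._send(build_request(text, tokenization)))


def fake_send(body: dict) -> dict:
    # Stand-in for the HTTP call: lowercases and splits on whitespace.
    return {"tokenization": body["tokenization"], "tokens": body["text"].lower().split()}


async def fake_send_async(body: dict) -> dict:
    return fake_send(body)


sync_result = Tokenize(fake_send).text("Hello World")
async_result = asyncio.run(TokenizeAsync(fake_send_async).text("Hello World"))
```

Keeping request building and response parsing in plain functions means the two client variants differ only in whether they `await` the transport, which is what makes it practical to test serialization once and both wrappers against the same expectations.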
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| weaviate/tokenize/types.py | Defines TokenizeResult dataclass returned by tokenization calls. |
| weaviate/tokenize/executor.py | Implements /tokenize and property tokenization requests + response parsing. |
| weaviate/tokenize/sync.py | Sync wrapper for the tokenize executor via @executor.wrap("sync"). |
| weaviate/tokenize/async_.py | Async wrapper for the tokenize executor via @executor.wrap("async"). |
| weaviate/tokenize/__init__.py | Exposes tokenize module symbols. |
| weaviate/client.py | Adds tokenize namespace to both sync and async clients. |
| weaviate/client.pyi | Updates type stubs to include tokenize attributes on clients. |
| weaviate/__init__.py | Exposes the tokenize module at the package root. |
| integration/test_tokenize.py | Integration coverage for sync/async tokenize calls and config handling. |
…zeResult and related tests
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## dev/1.37 #2012 +/- ##
===========================================
Coverage ? 86.65%
===========================================
Files ? 293
Lines ? 22563
Branches ? 0
===========================================
Hits ? 19552
Misses ? 3011
Partials ? 0
☔ View full report in Codecov by Sentry.