SOLR-18198 Add support for "missing" stats in rollup for streaming expressions#4286
SOLR-18198 Add support for "missing" stats in rollup for streaming expressions#4286KhushJain wants to merge 4 commits intoapache:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new missing(field) metric to SolrJ streaming expressions and ensures it can be correctly rolled up when metrics are computed in parallel/tiered contexts.
Changes:
- Introduces
MissingMetric(new metric functionmissing) for counting null/missing values. - Registers the new metric in
Langand adds rollup support inParallelMetricsRollup. - Updates streaming rollup tests and adds a changelog entry.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| solr/solrj-streaming/src/test/org/apache/solr/client/solrj/io/stream/StreamingTest.java | Extends rollup/parallel rollup tests and fixtures to validate missing(b_f) behavior. |
| solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/stream/metrics/MissingMetric.java | Adds the new MissingMetric implementation and streaming-expression parsing/toExpression support. |
| solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/stream/ParallelMetricsRollup.java | Adds rollup aggregation logic for MissingMetric (sum across tiers). |
| solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/Lang.java | Registers missing as a streaming expression metric function. |
| changelog/unreleased/SOLR-18198-support-missing-stats-count-in-rollup.yml | Documents the feature addition in the changelog. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Can you update the docs as well??? |
Yes, updated the doc. |
epugh
left a comment
There was a problem hiding this comment.
LGTM pending the checks all running to completion!
|
Thanks @epugh! Can you backport to 10.x and 9.x? |
|
Looks like one test failed? Can you investigate? |
Looks like the Lang test needed to include the new metric that was added. Let's try now @epugh |
https://issues.apache.org/jira/browse/SOLR-18198
Description
Rollupfunction to supportmissingstatistics in/streamhandler.Solution
Add a new
MissingMetricto the streaming expressions framework:MissingMetric.java: New metric class modeled afterCountMetric. Increments a counter whentuple.get(columnName) == null. Registered as function namemissing.Lang.java: Registermissingfunction name.ParallelMetricsRollup.java: Added SumMetric aggregation.Tests
Updated existing tests in
StreamingTest.java:b_fto only 5 of the 10 documents inhelloDocsUpdateRequest(ids 0, 4, 5, 6, 9), leaving the other 5 without it. This avoids changing existinga_fassertions.
testRollupStreamandtestParallelRollupStream: AddedMissingMetric("b_f")to the metrics array and asserted.Checklist
Please review the following and check all that apply:
mainbranch../gradlew check.