[core] DataEvolution table supports concurrent updates to different columns #7867

Open
steFaiz wants to merge 8 commits into apache:master from steFaiz:de_conflict_improve

@steFaiz steFaiz commented May 15, 2026

Purpose

Currently, if column update jobs on a data evolution table are triggered concurrently, the later jobs fail:
[image]

This happens even when the jobs update different columns.
However, only concurrent updates to the same columns need to be forbidden. This PR introduces a RowIdColumnConflictChecker and uses binary search to speed up the conflict check.
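
To illustrate the idea, here is a minimal sketch of a row-range and column conflict check that uses binary search. It is not the PR's actual RowIdColumnConflictChecker: the names ColumnConflictSketch, FileRange and hasConflict are hypothetical, and the sketch assumes that committed files have pairwise disjoint, inclusive row ID ranges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch, not the PR's RowIdColumnConflictChecker. */
final class ColumnConflictSketch {

    /** A file covering row IDs [firstRowId, lastRowId] (inclusive) that writes some columns. */
    static final class FileRange {
        final long firstRowId;
        final long lastRowId;
        final Set<String> columns;

        FileRange(long firstRowId, long lastRowId, String... columns) {
            this.firstRowId = firstRowId;
            this.lastRowId = lastRowId;
            this.columns = new HashSet<>(Arrays.asList(columns));
        }
    }

    /**
     * Returns true if some incoming file overlaps a committed file in row range
     * AND shares at least one column with it.
     * Assumption: committed ranges are pairwise disjoint, so sorting by firstRowId
     * also orders them by lastRowId, which makes the binary search valid.
     */
    static boolean hasConflict(List<FileRange> committed, List<FileRange> incoming) {
        List<FileRange> sorted = new ArrayList<>(committed);
        sorted.sort(Comparator.comparingLong(f -> f.firstRowId));
        for (FileRange in : incoming) {
            // Binary search for the first committed range with lastRowId >= in.firstRowId.
            int lo = 0;
            int hi = sorted.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (sorted.get(mid).lastRowId < in.firstRowId) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            // Walk forward over the committed ranges that still overlap the incoming range.
            for (int i = lo; i < sorted.size() && sorted.get(i).firstRowId <= in.lastRowId; i++) {
                for (String column : in.columns) {
                    if (sorted.get(i).columns.contains(column)) {
                        return true; // same rows and same column: a real conflict
                    }
                }
            }
        }
        return false; // overlapping rows with different columns are allowed
    }

    public static void main(String[] args) {
        List<FileRange> committed = Arrays.asList(
                new FileRange(0, 99, "a"), new FileRange(100, 199, "b"));
        // Overlaps both committed files but writes a different column.
        System.out.println(hasConflict(committed, Arrays.asList(new FileRange(50, 149, "c")))); // false
        // Overlaps the second committed file and writes the same column.
        System.out.println(hasConflict(committed, Arrays.asList(new FileRange(150, 160, "b")))); // true
    }
}
```

With disjoint committed ranges, each incoming file costs one binary search plus a scan over the ranges it actually overlaps, rather than a comparison against every committed file.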

Benchmark

I tested checking 100,000 files with disjoint row ranges against 100,000 files with disjoint row ranges, where each file contains 3 columns. The results:

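| Iteration | Check Files | Expected Merged Ranges | Check Time |
|-----------|-------------|------------------------|------------|
| 0         | 100,000     | 100,000                | 11 ms      |
| 1         | 100,000     | 100,000                | 8 ms       |
| 2         | 100,000     | 100,000                | 8 ms       |
| Best      | 100,000     | 100,000                | 8 ms       |
| Average   | 100,000     | 100,000                | 9 ms       |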

With binary search, the cost of the check is negligible.

Tests

Unit test cases for Spark and unit tests for core.

@steFaiz steFaiz marked this pull request as draft May 15, 2026 08:49
@steFaiz steFaiz changed the title [core] DataEvolution table supports concurrent updates to different columns [wip][core] DataEvolution table supports concurrent updates to different columns May 15, 2026
@steFaiz steFaiz marked this pull request as ready for review May 15, 2026 13:20
@steFaiz steFaiz changed the title [wip][core] DataEvolution table supports concurrent updates to different columns [core] DataEvolution table supports concurrent updates to different columns May 15, 2026
@steFaiz steFaiz closed this May 18, 2026
@steFaiz steFaiz reopened this May 18, 2026
@steFaiz steFaiz closed this May 18, 2026
@steFaiz steFaiz reopened this May 18, 2026