[lake] Avoid to generate empty split in tiering source enumerator#1925
[lake] Avoid to generate empty split in tiering source enumerator#1925luoyuxia merged 2 commits intoapache:mainapache/fluss:mainfrom LiebingYu:tiering-not-empty-splitsLiebingYu/fluss:tiering-not-empty-splitsCopy head branch name to clipboard
Conversation
941da9d to
6b6abb0
Compare
6b6abb0 to
ac4079e
Compare
There was a problem hiding this comment.
Pull Request Overview
This PR updates the tiering split generation logic to skip empty buckets and only generate splits for buckets that contain data. Key changes include adding latestBucketOffset validation in the split generator and updating tests to reflect this new behavior.
- Added checks in
TieringSplitGeneratorto skip generating splits whenlatestBucketOffset <= 0 - Updated test cases to write data before expecting split assignments and adjusted assertions for empty buckets
- Fixed typo in variable name from
partionIdtopartitionId
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| TieringSplitGenerator.java | Added latestBucketOffset validation to skip empty buckets in split generation for both primary key and log tables |
| TieringSourceEnumeratorTest.java | Updated tests to write data before split assignment, adjusted expected split counts, fixed typo in variable name, and corrected test setup for empty bucket scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .forEach(a -> a.assignment().values().forEach(actualAssignment::addAll)); | ||
|
|
||
| assertThat(actualAssignment).isEqualTo(expectedAssignment); | ||
| // no spilt assignment for empty buckets |
There was a problem hiding this comment.
Corrected spelling of 'spilt' to 'split'.
| // no spilt assignment for empty buckets | |
| // no split assignment for empty buckets |
| 0L, | ||
| expectNumberOfSplits)); | ||
| bucketOffsetOfFirstWrite.get(partitionId).get(tableBucket), | ||
| bucketOffsetOfFirstWrite.size())); |
There was a problem hiding this comment.
The numberOfSplits parameter should be expectNumberOfSplits instead of bucketOffsetOfFirstWrite.size(). The numberOfSplits field represents 'the total number of splits in one round of tiering' (as documented in TieringSplit.java line 41), and should use the consistent value expectNumberOfSplits (which is 6) as seen in similar code at line 448, not the number of partitions.
| bucketOffsetOfFirstWrite.size())); | |
| expectNumberOfSplits)); |
luoyuxia
left a comment
There was a problem hiding this comment.
@LiebingYu Thanks for the pr. Left some comments.
| .forEach(a -> a.assignment().values().forEach(actualAssignment::addAll)); | ||
|
|
||
| assertThat(actualAssignment).isEqualTo(expectedAssignment); | ||
| // no spilt assignment for empty buckets |
Purpose
Linked issue: close #1900
Brief change log
Tests
API and Format
Documentation