Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[lake] Avoid to generate empty split in tiering source enumerator#1925

Merged
luoyuxia merged 2 commits intoapache:mainapache/fluss:mainfrom
LiebingYu:tiering-not-empty-splitsLiebingYu/fluss:tiering-not-empty-splitsCopy head branch name to clipboard
Nov 5, 2025
Merged

[lake] Avoid to generate empty split in tiering source enumerator#1925
luoyuxia merged 2 commits intoapache:mainapache/fluss:mainfrom
LiebingYu:tiering-not-empty-splitsLiebingYu/fluss:tiering-not-empty-splitsCopy head branch name to clipboard

Conversation

@LiebingYu
Copy link
Copy Markdown
Contributor

Purpose

Linked issue: close #1900

Brief change log

Tests

API and Format

Documentation

@LiebingYu LiebingYu force-pushed the tiering-not-empty-splits branch from 941da9d to 6b6abb0 Compare November 3, 2025 06:41
@LiebingYu LiebingYu force-pushed the tiering-not-empty-splits branch from 6b6abb0 to ac4079e Compare November 3, 2025 07:59
@luoyuxia luoyuxia requested review from Copilot and luoyuxia November 5, 2025 06:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the tiering split generation logic to skip empty buckets and only generate splits for buckets that contain data. Key changes include adding latestBucketOffset validation in the split generator and updating tests to reflect this new behavior.

  • Added checks in TieringSplitGenerator to skip generating splits when latestBucketOffset <= 0
  • Updated test cases to write data before expecting split assignments and adjusted assertions for empty buckets
  • Fixed typo in variable name from partionId to partitionId

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
TieringSplitGenerator.java Added latestBucketOffset validation to skip empty buckets in split generation for both primary key and log tables
TieringSourceEnumeratorTest.java Updated tests to write data before split assignment, adjusted expected split counts, fixed typo in variable name, and corrected test setup for empty bucket scenarios

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

.forEach(a -> a.assignment().values().forEach(actualAssignment::addAll));

assertThat(actualAssignment).isEqualTo(expectedAssignment);
// no spilt assignment for empty buckets
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'spilt' to 'split'.

Suggested change
// no spilt assignment for empty buckets
// no split assignment for empty buckets

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

0L,
expectNumberOfSplits));
bucketOffsetOfFirstWrite.get(partitionId).get(tableBucket),
bucketOffsetOfFirstWrite.size()));
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The numberOfSplits parameter should be expectNumberOfSplits instead of bucketOffsetOfFirstWrite.size(). The numberOfSplits field represents 'the total number of splits in one round of tiering' (as documented in TieringSplit.java line 41), and should use the consistent value expectNumberOfSplits (which is 6) as seen in similar code at line 448, not the number of partitions.

Suggested change
bucketOffsetOfFirstWrite.size()));
expectNumberOfSplits));

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiebingYu Thanks for the pr. Left some comments.

.forEach(a -> a.assignment().values().forEach(actualAssignment::addAll));

assertThat(actualAssignment).isEqualTo(expectedAssignment);
// no spilt assignment for empty buckets
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@LiebingYu LiebingYu requested a review from luoyuxia November 5, 2025 09:34
Copy link
Copy Markdown
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@luoyuxia luoyuxia merged commit fc60308 into apache:main Nov 5, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Avoid to generate empty split in source emuerator

3 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.