Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[Enhancement] Add commitlog offset validation in recoverAbnormally to handle old file data #10017

Copy link
Copy link
@guyinyou

Description

@guyinyou
Issue body actions

Before Creating the Enhancement Request

  • I have confirmed that this should be classified as an enhancement rather than a bug/feature.

Summary

Add commitlog offset validation check in the recoverAbnormally method to detect and truncate old file data that may pass CRC and magic code checks but contains invalid commitlog offsets. This enhancement improves the robustness of abnormal recovery by preventing the processing of stale data from reused commitlog files.

Motivation

When commitlog files are reused or overwritten, there's a small probability that old data from previous writes can pass CRC and magic code validation checks. However, these old messages contain commitlog offsets that were valid when originally written but are now outside the valid range of the current mapped file.

Without offset validation, the recovery process may incorrectly process these old messages, leading to:

  • Potential data inconsistency
  • Incorrect message dispatching
  • Unpredictable behavior during recovery

This enhancement adds an additional safety check by validating that the commitlog offset in each message falls within the valid range of the current mapped file ([fileFromOffset, fileFromOffset + fileSize]). If the offset is outside this range, the recovery process correctly identifies it as old file data and truncates at that point, enhancing the overall robustness of the abnormal recovery mechanism.

Benefits:

  • Prevents processing of stale data during recovery
  • Improves recovery reliability and correctness
  • Adds an extra layer of validation beyond CRC checks
  • Better handles edge cases when commitlog files are reused

Describe the Solution You'd Like

The enhancement involves adding a commitlog offset validation check in the recoverAbnormally method before processing messages that pass CRC and magic code checks.

Implementation Details:

  1. Add offset validation check: Before checking if dispatchRequest.isSuccess(), validate that dispatchRequest.getCommitLogOffset() falls within the valid range of the current mappedFile:

    • Check: commitLogOffset >= mappedFile.getFileFromOffset() && commitLogOffset <= mappedFile.getFileFromOffset() + mappedFile.getFileSize()
  2. Handle invalid offsets: If the offset is outside the valid range:

    • Log a warning with detailed information (offset, filename, current position)
    • Log recovery end information
    • Break the recovery loop to truncate at that point
  3. Improve logging: Use proper SLF4J placeholders in log messages for better formatting and include relevant context information.

Code Location:

  • File: store/src/main/java/org/apache/rocketmq/store/CommitLog.java
  • Method: recoverAbnormally(long maxPhyOffsetOfConsumeQueue)
  • Location: In the message recovery loop, before the dispatchRequest.isSuccess() check

Example Implementation:
if (dispatchRequest.getCommitLogOffset() < mappedFile.getFileFromOffset()
|| dispatchRequest.getCommitLogOffset() > mappedFile.getFileFromOffset() + mappedFile.getFileSize()) {
log.warn("found illegal commitlog offset {} in {}, current pos={}, it will be truncated.",
dispatchRequest.getCommitLogOffset(), mappedFile.getFileName(), processOffset + mappedFileOffset);
log.info("recover physics file end, {} pos={}", mappedFile.getFileName(), byteBuffer.position());
break;
} else if (dispatchRequest.isSuccess()) {
// Continue with normal processing...
}

Describe Alternatives You've Considered

none

Additional Context

No response

Reactions are currently unavailable

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      Morty Proxy This is a proxified and sanitized view of the page, visit original site.