Asked
Modified 9 months ago
Viewed 230 times
0

I found several places suggesting to use btrfs on top of mdraid (mdadm) or LVM RAID. But nobody expands on how that works when you have data checksum issues!

My main concern with not using btrfs native RAID is that btrfs has native data checksums. If I ever have a problem in any of those setups, the two diverging copies will be outside btrfs's domain (in dm-crypt, LVM, dm-raid, etc.)... How do I go about "fixing" it?

Despite the number of opinions against using btrfs RAID, I could not find good articles on actual alternatives. My goal is to have a mirror, data integrity/checksums (btrfs or otherwise), and snapshots (for making incremental remote backups over ssh). Encryption is not required, but not a problem either. btrfs is not required either; I might go with ext4 if the layers below the filesystem already cover all of that.

This is about data drives mounted late in boot, not the root partition.

(Also, note that LUKS+integrity can always be replaced with plain dm-integrity. Just covering 2 cases at once)

Option 1:

2 HDD -> mdadm raid1 -> LUKS+integrity -> btrfs

Option 2:

2 HDD -> LUKS+integrityx2 -> mdadm raid1 -> btrfs

Option 3:

2 HDD -> LVM PVx2 -> VGx1 -> LVx1 raid1+raidintegrity -> btrfs

Option 4:

2 HDD -> LUKS+integrityx2 -> LVM PVx2 -> VGx1 -> LVx1 raid1 + raidintegrity -> btrfs

Option 5:

2 HDD -> LVM PVx2 -> VGx2 -> LVx2 linear -> mdadm raid1 -> btrfs

Option 6:

2 HDD -> MBRx2 -> partitionx2 -> dm-integrityx2 -> mdadm raid1 -> btrfs

Option 7:

2 HDD -> mdadm raid1 -> dm-integrity -> btrfs

...Many, many more combinations!

Will any of those non-btrfs RAID/integrity layers help me deal with btrfs complaining about a data checksum mismatch? Or the other way around: if mdadm reports a non-zero /sys/block/mdX/md/mismatch_cnt and I need to pick one of the two HDDs in a RAID1, will btrfs's data checksums help me deal with it?
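For reference, a minimal sketch of reading the counter mentioned above. The function names and the `sysfs_root` parameter are made up for illustration (the parameter exists only so the code can be tried against a fake directory tree), and "md0" in the usage note is a placeholder array name. As far as I know the counter is only refreshed by a check or repair pass, so treat the value as "result of the last scrub", not a live reading.

```python
from pathlib import Path


def read_mismatch_cnt(md_name: str, sysfs_root: str = "/sys/block") -> int:
    """Return the mismatch counter that md exposes for one array.

    md_name is the kernel name of the array, e.g. "md0" (placeholder).
    sysfs_root is an illustrative parameter, not something md requires.
    """
    path = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())


def describe(count: int) -> str:
    """Translate the counter into what it can and cannot tell you."""
    if count == 0:
        return "mirrors agreed during the last check"
    # A non-zero value only says the copies differ somewhere. md keeps no
    # checksum of its own, so it cannot say which disk holds the good data.
    return f"{count} sectors flagged; md cannot tell which copy is right"
```

Calling `describe(read_mismatch_cnt("md0"))` on a real system would show why the counter alone does not answer "which HDD do I keep": it reports that the mirrors disagree, not which side is correct.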

(pls, don't suggest openzfs, already had too many questions closed for "Y for X questions" reasons :)

4
  • 1
    "Despite the amount of opinions to not use btrfs raid, I could not find good articles on actual alternatives" ah, some btrfs RAID levels (but not RAID1; and: at least up till recently, haven't checked lately) were simply marked as "will lose data, don't use" in the source code and did deliver what they promised; guess that's a bit stronger than "opinions" ;) BUT can't resist the urge to say you know the alternative: many btrfs concepts were directly taken from ZFS, and that's really time-proven, and solid, and fast, and does support zraid. So, I think that's your alternative right there.
    Marcus Müller
    –  Marcus Müller
    2024-12-28 17:39:07 +00:00
    Commented Dec 28, 2024 at 17:39
  • 1
    note that a dm-integrity layer below the mirror turns silent corruption into a read error that the RAID layer can correct from the other copy, whereas a checksum above the mirror can only tell you that something is wrong. So "mirroring with integrity on top" is always worse than "integrity with mirroring on top". Between options 1 and 2, go for 2.
    Marcus Müller
    –  Marcus Müller
    2024-12-28 17:41:28 +00:00
    Commented Dec 28, 2024 at 17:41
  • you always need your integrity to be below cryptography, never above, because one of the key points in this kind of cryptography is that if you change a bit in the encrypted bitstream, then the changes in the plain text are unpredictable and widespread, and vice versa. So, the cryptographical layer really needs a "reliable" bit source (or integrate the error correction itself, before doing the crypto – which is really just a double-layer with integrity below, crypto on top).
    Marcus Müller
    –  Marcus Müller
    2024-12-28 17:46:41 +00:00
    Commented Dec 28, 2024 at 17:46
  • What about option 3? I can even double the integrity layers with disk[1,2]->lvm+raid1+integrity->luks+integrity->btrfs+checksum ... dm-integrity is pretty cheap: about 8 MB per 1 GB
    gcb
    –  gcb
    2024-12-29 02:03:56 +00:00
    Commented Dec 29, 2024 at 2:03

1 Answer

1

Will any of those non-btrfs RAID/integrity layers help me deal with btrfs complaining about a data checksum mismatch?

Nope. If you have any kind of redundancy and a layer that checks whether the data it read is correct, that checking layer must be the one selecting the "effective" copy. Since the file system always sits "on top" in all these options, with the mirroring hidden below it, that's not possible.

So, you'll want a file system with the integrity check built into its understanding of redundancy.
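To make that failure mode concrete, here is a toy model (not how md or btrfs are implemented; the class names are invented and CRC32 stands in for whatever checksum the filesystem uses). The filesystem sees the mirror as a single device, so it can notice the bad read but has no call for "give me the other leg".

```python
import zlib


class PlainMirror:
    """Toy RAID 1 without checksums: it trusts whichever leg it reads."""

    def __init__(self, data: bytes):
        self.legs = [bytearray(data), bytearray(data)]
        self.read_leg = 0  # chosen by the mirror, invisible to the layer above

    def read(self) -> bytes:
        return bytes(self.legs[self.read_leg])


class ChecksummingFs:
    """Toy filesystem on top: stores a CRC at write time, verifies on read."""

    def __init__(self, device: PlainMirror, data: bytes):
        self.device = device
        self.crc = zlib.crc32(data)

    def read(self) -> bytes:
        data = self.device.read()
        if zlib.crc32(data) != self.crc:
            # Detection works, but the mirror offers no way to pick a leg.
            raise IOError("checksum mismatch")
        return data


payload = b"important data"
mirror = PlainMirror(payload)
fs = ChecksummingFs(mirror, payload)
mirror.legs[0][0] ^= 0xFF  # silent corruption on the first disk

try:
    fs.read()
    outcome = "read ok"
except IOError:
    outcome = "detected, not repaired"

print(outcome)                           # detected, not repaired
print(bytes(mirror.legs[1]) == payload)  # True: the good copy exists, unreachable
```

The good copy is sitting on the second leg the whole time; the stacking order is what keeps the filesystem from using it.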

The closest you get to that is some form of

disk 1 -> [error correcting layer ->] checksumming layer -\
                                                           >RAID 1 -> filesystem
disk 2 -> [error correcting layer ->] checksumming layer -/

because the checksumming layer (a role that LUKS+integrity can play) will signal a failure "upwards", forcing the RAID 1 layer to pick the other copy.
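The same idea as a toy model, assuming the per-disk checksumming layer reports a mismatch as a read error and the mirror reacts to read errors by falling back to the other leg and rewriting the bad one. Class names are invented and CRC32 is just a stand-in; this is a sketch of the principle, not of dm-integrity or md internals.

```python
import zlib


class IntegrityLeg:
    """Toy per-disk checksumming layer: a bad block becomes a read error."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.crc = zlib.crc32(data)

    def read(self) -> bytes:
        if zlib.crc32(bytes(self.data)) != self.crc:
            raise IOError("integrity checksum mismatch")
        return bytes(self.data)

    def write(self, data: bytes) -> None:
        self.data = bytearray(data)
        self.crc = zlib.crc32(data)


class Mirror:
    """Toy RAID 1: on a read error, use the other leg and rewrite the bad one."""

    def __init__(self, legs):
        self.legs = legs

    def read(self) -> bytes:
        failed = []
        for leg in self.legs:
            try:
                data = leg.read()
            except IOError:
                failed.append(leg)
                continue
            for bad in failed:
                bad.write(data)  # repair legs that reported an error
            return data  # legs after the first good one are not read here
        raise IOError("all mirror legs failed")


payload = b"important data"
legs = [IntegrityLeg(payload), IntegrityLeg(payload)]
legs[0].data[0] ^= 0xFF  # silent corruption below the integrity layer

mirror = Mirror(legs)
result = mirror.read()

print(result == payload)          # True: served from the healthy leg
print(legs[0].read() == payload)  # True: the corrupted leg was rewritten
```

Note that whatever sits above the mirror never sees the error at all, which is the point: the selection of the good copy happens in the layer that owns the redundancy.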

Is that a great solution? Hm. Doing the cryptography twice certainly does sound like a performance downside. (I don't actually know whether you can tell cryptsetup luksFormat that you'd like checksums, but not cryptography? Then you could do the cryptography on top of the RAID 1 layer, below the filesystem layer, should you want encryption. Yay! More Layers! More layers are always good for reliability.)
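On the "checksums but no cryptography" part: the question itself notes that LUKS+integrity can be replaced with plain dm-integrity, and cryptsetup ships a separate `integritysetup` tool for standalone integrity devices. A hedged sketch of the resulting stack follows. It only builds and prints the commands, it does not run anything. The device paths, the `int0`/`int1`/`cryptdata` mapping names and the `plan_stack` helper are placeholders invented for illustration; every one of these commands is destructive, so check the man pages before adapting them.

```python
from typing import List


def plan_stack(disks: List[str], md_device: str = "/dev/md0",
               encrypt: bool = False) -> List[List[str]]:
    """Build (not run) commands for: dm-integrity x N -> RAID 1 -> [LUKS] -> btrfs."""
    commands: List[List[str]] = []
    mapped: List[str] = []
    for index, disk in enumerate(disks):
        name = f"int{index}"  # hypothetical mapping name
        commands.append(["integritysetup", "format", disk])
        commands.append(["integritysetup", "open", disk, name])
        mapped.append(f"/dev/mapper/{name}")
    # The mirror is built from the integrity devices, not from the raw disks.
    commands.append(["mdadm", "--create", md_device, "--level=1",
                     f"--raid-devices={len(mapped)}", *mapped])
    top = md_device
    if encrypt:
        # Encryption happens once, above the mirror, instead of once per disk.
        commands.append(["cryptsetup", "luksFormat", md_device])
        commands.append(["cryptsetup", "open", md_device, "cryptdata"])
        top = "/dev/mapper/cryptdata"
    commands.append(["mkfs.btrfs", top])
    return commands


plan = plan_stack(["/dev/sdX1", "/dev/sdY1"], encrypt=True)
for argv in plan:
    print(" ".join(argv))
```

With `encrypt=False` the two cryptsetup steps drop out and btrfs goes directly on the md device, which is roughly option 6 from the question.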

Honestly, without btrfs or ZFS, stock Linux is simply lagging behind other distributed storage solutions (including those in the MS Windows world). But I don't think that in this day and age we need to count these two options out. btrfs RAID 1 doesn't seem to be in its infancy anymore (the days when the choice of which disk to read from was based purely on whether the reading PID was even or odd are over…), and it seems to be in productive use all over the place, in non-RAID5/6 configurations at least (it seems better btrfs RAID 6 only arrived with Linux 6.2 (Changelog)).

3
  • You suggest btrfs, but it only has checksums, not integrity. I see FB using it in production, but not for data (mostly to snapshot VM images and fail early). And Synology uses btrfs, but their doc for checksum errors is literally "wipe and start over". I am fine with RAID1 though, but with integrity+checksum. ZFS being out of tree is more troublesome, as we need rolling kernels to be relatively recent.
    gcb
    –  gcb
    2024-12-29 02:01:57 +00:00
    Commented Dec 29, 2024 at 2:01
  • @gcb You say btrfs has no integrity? Self-healing based on comparing data checksums is a major feature. This can happen with raid1, raid10, and DUP data and metadata profiles. "single" can flag but not repair, because it only has one copy and uses checksums rather than something like a Hamming code. But this is a trade-off between computation, detection strength, and number of correctable bits per chunk. Consider that drive errors often span whole 4 KiB blocks, which typical ECC coding cannot help with. btrfs also allows selection of several checksum algorithms (speed vs strength/collision probability).
    Max Power
    –  Max Power
    2025-02-13 15:55:25 +00:00
    Commented Feb 13 at 15:55
  • @MaxPower this is about btrfs on top of other RAID setups, so DUP, which is the default for single nowadays. But the manual shows so many caveats for DUP (drive deduplication, sneaky SSD trim, losing a full drive, etc.) that it is better to assume there's no integrity feature.
    gcb
    –  gcb
    2025-02-17 17:28:39 +00:00
    Commented Feb 17 at 17:28
