▲Corrupting a ZFS File on Purpose (oshogbo.com)

64 points by zdw 3 days ago | 11 comments

ralferoo 2 days ago [-]

Hmmm, it's been a long long time since I actually had a failed drive (and also I don't use zfs), but from what I remember of my last failing drive 20 years ago, the drive was able to detect that sectors had been corrupted, and then failed the read rather than just returning silently corrupted data. If my memory is correct, replacing random bytes on disk wouldn't actually reflect the typical way data corruption manifests itself.

I always thought that the reason zfs did its extensive CRC checks was primarily to detect data corruption while it was in RAM or over the network, with a side effect that in the rare cases where data on disk got corrupted without the drive detecting it because the CRC was still valid, it'd also be spotted.

But anyway, it might be worth testing by replacing some of the disk images with actually truncated ones so that there are holes when reading, so that it returns an actual read error rather than junk data.

adrian_b 2 days ago [-]

The error-correcting codes used by HDDs/SSDs correct or detect the most frequent errors, but sometimes, when there are too many erroneous bits in a sector, they can mis-correct the data and then the HDD/SSD returns a corrupted sector without signaling any error.

I have seen this a few times on HDDs that had been used for the cold storage of archival data for several years (around 5 years or even more). For each archive file, I had my own hash values that were used to detect corrupted files, which allowed me to detect all such cases. I had duplicates for all such HDDs. Sometimes both HDD copies had a few silently corrupted sectors, but they were not in the same locations, so in all cases I could recover the corrupted files from their duplicates. If I had stored the archival data without redundancy, I would have lost it.

If you do not use hashes or other error-detecting codes for all your files, like I do, you may have had some failures in your HDDs without recognizing them, but such errors are much more likely to happen in files that have been stored for many years.
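
For anyone who wants to try the per-file hash approach described above, here is a minimal sketch. It assumes SHA-256 as the hash and an in-memory dict as the manifest; the function names (`file_digest`, `build_manifest`, `find_corrupted`) are made up for illustration and are not the commenter's actual tooling.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_corrupted(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose digest no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once when the archive is written, then run `find_corrupted` against each copy later. Any file flagged on one disk but clean on the duplicate can be restored from the duplicate.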

ramses0 2 days ago [-]

And/Or: `*.par` files.

https://en.wikipedia.org/wiki/Parchive

adrian_b 9 minutes ago [-]

Yes, I have also used par2create/par2verify for many years to add redundancy to archive files and to repair any corrupted files.

However, I use both par2create and duplicate storage media, because duplicates, preferably stored in different geographic locations, are the only solution that guards against incidents so serious that they would partially or totally destroy the storage device.

By itself, when an adequate amount of added redundancy is chosen, par2create is sufficient to recover archive files that are only affected by a few sporadic corrupted sectors, like on a HDD that has been stored in good conditions for some years. It will not help if the entire HDD becomes unusable, due to some mechanical or electrical defect, which may happen in HDDs used for cold storage, instead of being used continuously.

wongarsu 2 hours ago [-]

Or rar files with recovery records. Same concept, but in one self-contained file instead of a number of sidecar files

throw0101c 3 hours ago [-]

> I always thought that the reason zfs did its extensive CRC checks was primarily to detect data corruption while it was in RAM or over the network, with a side effect that in the rare cases where data on disk got corrupted without the drive detecting it because the CRC was still valid, it'd also be spotted.

Nope, it's always been about on-disk bit rot.

First off: drive firmware has been known to return the wrong LBA data. The OS asks for 123, the drive reads 234 (and verifies its drive-level CRC, which passes) and sends it up. The application gets a bundle of bits that's not correct. ZFS expects a certain checksum for that part of the tree/file, so when the data from LBA 234 gets returned, it will not match the checksum recorded for 123.

Next, if you have RAID-1 and a drive has corrupted data but you don't have higher-level FS checksums, how do you know which mirror has the correct data? They're different, but which is correct? With ZFS you know which block has the correct checksum, return that data to the application, and then use the correct data to repair the wrong one.
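
A toy model of that mirror logic, for illustration only. It assumes each mirror is a plain dict from LBA to bytes, that the expected checksum is stored separately from the data (as it would be in a parent block), and that SHA-256 stands in for whatever checksum the filesystem really uses. `read_with_self_heal` is a hypothetical name, not a ZFS function.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Stand-in block checksum (SHA-256 hex digest)."""
    return hashlib.sha256(block).hexdigest()


def read_with_self_heal(
    mirrors: list[dict[int, bytes]], lba: int, expected: str
) -> bytes:
    """Return a copy of the block that matches `expected`, repairing bad copies."""
    good = None
    for mirror in mirrors:
        data = mirror.get(lba)
        # A copy is trusted only if it matches the independently stored checksum.
        if data is not None and checksum(data) == expected:
            good = data
            break
    if good is None:
        raise IOError(f"no mirror holds a valid copy of block {lba}")
    # Overwrite every copy that differs from the verified one.
    for mirror in mirrors:
        if mirror.get(lba) != good:
            mirror[lba] = good
    return good
```

The same check covers the misdirected-read case: if a "drive" hands back the contents of LBA 234 when asked for 123, the data fails the checksum recorded for 123 even though the drive reported no error.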

matja 2 days ago [-]

You're right that the ECC validation is very robust, but that only validates one small part: that the drive is reading what it has previously written. It does not validate that the data was correct when it came in to the drive, that it was correctly handled by the firmware, or even that it was written in the correct place (LBA) on the drive.

There have been times when some features of entire models of drives have been disabled in the Linux kernel because of buggy firmware that silently writes bad data (with correct ECC), so reading it back is successful from both the drive's and the OS's block driver views.

I was hit by this myself with the queued TRIM command firmware bug that affected all Samsung EVO 840 SSDs (Linux kernel commit 9a9324d3969678d44b330e1230ad2c8ae67acf81 if you want to look into the history) - the drive didn't report any errors, but ZFS kept reporting corruption, and kept on fixing it in the background.

anonymous_user9 2 days ago [-]

> The DVA was correct, the sector math was correct, the dd command was correct. The right place, the wrong mental model.

God the intensity is tiresome. Whether or not it's AI slop, it's also bad writing. Things can be fun or interesting or worthwhile without being a harrowing battle of discovery!

calcifer 47 minutes ago [-]

> Things can be fun or interesting or worthwhile without being a harrowing battle of discovery!

The quoted sentences used "correct", "right" and "wrong". Hardly the sensationalist words you're implying.

lanycrost 6 hours ago [-]

I miss ZFS. I only had a chance to work with it in production once, and I liked it very much. It has some performance overhead compared to journaling filesystems, but it is very well designed.

igtztorrero 2 hours ago [-]

I always run my servers on a ZFS pool mirrored (RAID-1) across 2 NVMe drives, because when NVMe drives fail, they fail completely. How can a file be corrupted during normal operation?
