File crash consistency and filesystems are hard
I haven't used a desktop email client in years. None of them could handle the volume of email I get without at least occasionally corrupting my mailbox. Pine, Eudora, and Outlook have all corrupted my inbox, forcing me to restore from backup. How is it that desktop mail clients are less reliable than Gmail, even though my Gmail account not only handles more email than I ever had on desktop clients, but also allows simultaneous access from multiple locations across the globe? Distributed systems have an unfair advantage, in that they can be robust against total disk failure in a way that desktop clients can't, but none of the file corruption issues I've had have been caused by total disk failure. Why has my experience with desktop applications been so bad?
Well, what sort of failures can occur? Crash consistency (maintaining consistent state even if there's a crash) is probably the easiest property to consider, since we can assume that everything, from the filesystem to the disk, works correctly; let's consider that first.
Crash Consistency
Pillai et al. had a paper and presentation at OSDI '14 on exactly how hard it is to save data without corruption or data loss.
Let's look at a simple example of what it takes to save data in a way that's robust against a crash. Say we have a file that contains the text "a foo" and we want to update the file to contain "a bar". The pwrite function looks like it's designed for this exact thing: it takes a file descriptor, what we want to write, a length, and an offset. So we might try
pwrite([file], "bar", 3, 2) // write 3 bytes at offset 2
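To make this concrete, here's a minimal sketch of that call as a complete C program; the filename orig and its initial contents "a foo" are assumptions for illustration:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Assumes a file named "orig" already contains "a foo". */
    int fd = open("orig", O_WRONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Overwrite 3 bytes at offset 2: "a foo" becomes "a bar". */
    if (pwrite(fd, "bar", 3, 2) != 3) {
        perror("pwrite");
        return 1;
    }
    close(fd);
    return 0;
}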
What happens? If nothing goes wrong, the file will contain "a bar", but if there's a crash during the write, we could get "a boo", "a far", or any other combination. Note that you may want to consider this an example over sectors or blocks rather than chars/bytes.
If we want atomicity (so we either end up with "a foo" or "a bar" but nothing in between), one standard technique is to make a copy of the data we're about to change in an undo log file, modify the "real" file, and then delete the log file. If a crash happens, we can recover from the log. We might write something like the sketch below.
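This is a minimal C rendering of that undo-log sequence; the file names (dir/orig, dir/log) and the log's "offset,length,old data" record format are assumptions for illustration, not taken from the paper:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    /* 1. Record the bytes we're about to overwrite (offset 2, length 3,
       old data "foo") in an undo log, and make the log durable before
       touching the real file. Error handling omitted for brevity. */
    int log = open("dir/log", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    const char *undo = "2,3,foo";
    write(log, undo, strlen(undo));
    fsync(log);
    close(log);

    /* 2. Modify the real file in place. */
    int fd = open("dir/orig", O_WRONLY);
    pwrite(fd, "bar", 3, 2);
    fsync(fd);
    close(fd);

    /* 3. Delete the log. If we crash before this point, recovery sees
       dir/log, writes "foo" back at offset 2, and then removes the log;
       if the log is gone, the update completed. */
    unlink("dir/log");
    return 0;
}

Even this sketch glosses over subtleties, such as whether the filesystem actually guarantees the ordering of these operations and whether the containing directory needs to be synced for the log's creation and deletion to be durable, which is exactly the kind of question the paper examines.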