Discussion of system crash behaviours
From: Yanyan Jiang
Subject: Discussion of system crash behaviours
Date: Mon, 12 Oct 2015 17:10:45 -0400
I am currently working on file system reliability issues. I have a disk
driver that simulates the disk states ("crash sites") left behind by injected
power failures. It was inspired by two OSDI'14 papers on crash sites, which
found interesting bugs in many production systems such as databases. The
simulated disk conforms to the Linux block driver semantics (refer to
https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt),
and can create many crash sites in which the pending blocks have been only
partially flushed to the disk.
Our tool finds that a typical compiler (e.g., gcc) can be affected by crash
inconsistency. Specifically, after a crash there is a chance that, for a
binary output file (e.g., a .o file):
1. its timestamp has been updated, so gmake considers the file up to date;
2. its actual data has not been persisted to the disk.
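For illustration only, the two conditions above can be ruled out by a tool that
writes its output with the usual "write to a temporary file, fsync, rename"
pattern. The sketch below is in Python and assumes a POSIX system; the function
name write_durably is hypothetical, and this is not what gcc or gmake actually
does.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to `path` so that, after a crash, the name refers either
    to the previous file or to the complete new contents.

    Hypothetical helper for illustration; assumes POSIX rename semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays within one filesystem. Note: mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the file data to stable storage
        os.replace(tmp_path, path)  # atomically switch the name to the new file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Persist the directory entry created by the rename.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With this ordering the data is flushed before the new name becomes visible, so
an updated timestamp on the final name implies that the contents were written.
The price is an extra fsync per output file, which is presumably why build
tools do not do this by default.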
On an ext4 filesystem (default settings) of a typical Linux distribution, we
observed that a crash can leave a 0-byte output file whose timestamp has been
updated. In more relaxed settings (e.g., older filesystems), a system crash
can leave a partially corrupted file with an updated timestamp (e.g., several
blocks are missing but the header is correct).
Note that this is NOT a defect in gcc or gmake, as neither is responsible for
crash semantics. However, if the user continues an incremental build after a
system crash, gmake will consider the generated .o file up to date and proceed
to the next stages, eventually producing incorrect outputs.
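As a rough illustration of what a cautious user could do before resuming an
incremental build after a crash, the sketch below (in Python) lists 0-byte
output files under a build tree so that they can be deleted and rebuilt. The
function name find_empty_outputs and the default ".o" suffix are assumptions
made for this example. It only catches the 0-byte case; a partially corrupted
file with a correct header would not be detected by a size check.

```python
import os


def find_empty_outputs(root, suffixes=(".o",)):
    """Return a sorted list of files under `root` that end with one of
    `suffixes` and have size 0.

    Hypothetical helper for illustration. An object file written by a
    compiler is normally not empty, so a 0-byte one is suspicious.
    """
    suffixes = tuple(suffixes)  # str.endswith accepts a tuple of suffixes
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            full = os.path.join(dirpath, name)
            try:
                if os.path.getsize(full) == 0:
                    suspects.append(full)
            except OSError:
                # Skip entries that cannot be inspected (e.g. broken symlinks).
                continue
    return sorted(suspects)
```

Removing the reported files makes gmake treat the corresponding targets as
missing, so they are regenerated on the next run. A full clean rebuild remains
the only option that also covers partially written files.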
Though it is not a software defect and is expected to occur very rarely in
practice, gmake is supposed to be general and to run on any platform. I am
wondering whether we should make users aware of this phenomenon (e.g., by
adding a section to the documentation).
Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University