Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I'm currently working on a "scriptable strace" in Haskell that you can use to inject arbitrary faults via syscall modification.

https://github.com/nh2/hatrace

I've used it so far to find and deterministically reproduce situations where the Haskell compiler and build tools wrote files in non-atomic/non-durable ways that would lead to failures when the machine was hard-rebooted at the wrong time.

An example test is https://github.com/nh2/hatrace/blob/7300dbf2c/test/HatraceSp..., and the compiler test is just below.



Nice! Yeah, syscall interception may be a better way to do this than fuse.

So, the thing I want is the ability to take a log (which probably also has checkpoints in it, I dunno) captured by normal operation of a database, and then run a super long slow test that puts the (fake) filesystem into a state as of various points along that log where some not-yet-fsync'd changes are lost, on a sector-by-sector basis (most systems rely on at least 512 sector pages being atomic, so you'd probably do it at that size). Then you'd test if the database can recover successfully and pass some sanity test at all those points along the log.

People do this kind of testing by pulling the power on busy databases. This would simulate pulling the power at a huge number of times, on a maximally non-forgiving filesystem/hardware (= 512 byte sector atomicity, flushed in random order).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: