Papers in Systems Discussion Series: Metastable Failures

"Metastable Failures in the Wild"

"Metastable failures are a class of system failures characterized by sustaining effects that keep systems in a degraded state and resist recovery."

"A key property of metastable failures is that their root cause is not a specific hardware failure or a software bug. It is an emergent behavior of a system, and it naturally arises from the optimizations for the common case that lead to sustained work amplification. As such, metastable failures are hard to predict, may potentially have catastrophic effects, and incur significant ongoing human engineering costs because automated recovery is difficult"


Our Papers in Systems discussion series will continue with a discussion of "Metastable Failures in the Wild," by Lexiang Huang et al., 2022.

The discussion will be led by Juno Suárez (@juno@hachyderm.io on Mastodon).

When: November 13th, 2023, 1PM - 2PM Eastern Time (US/Canada). The Zoom room will remain open until 2:30PM for informal discussion. Check time in your time zone.

The paper is available at: https://www.usenix.org/conference/osdi22/presentation/huang-lexiang

Schedule

November 13th, 2023

1:00pm – 2:30pm EST
Discussion: Metastable Failures

Additional Information