Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
torvalds
GitHub Repository: torvalds/linux
Path: blob/master/Documentation/ABI/removed/sysfs-mce
32114 views
What:		/sys/devices/system/machinecheck/machinecheckX/tolerant
Contact:	Borislav Petkov <[email protected]>
Date:		Dec, 2021
Description:
		Unused and obsolete after the advent of recoverable machine
		checks (see last sentence below) and those are present since
		2010 (Nehalem).

		Original description:

		The entries appear for each CPU, but they are truly shared
		between all CPUs.

		Tolerance level. When a machine check exception occurs for a
		non corrected machine check the kernel can take different
		actions.

		Since machine check exceptions can happen any time it is
		sometimes risky for the kernel to kill a process because it
		defies normal kernel locking rules. The tolerance level
		configures how hard the kernel tries to recover even at some
		risk of	deadlock. Higher tolerant values trade potentially
		better uptime with the risk of a crash or even corruption
		(for tolerant >= 3).

		==  ===========================================================
		 0  always panic on uncorrected errors, log corrected errors
		 1  panic or SIGBUS on uncorrected errors, log corrected errors
		 2  SIGBUS or log uncorrected errors, log corrected errors
		 3  never panic or SIGBUS, log all errors (for testing only)
		==  ===========================================================

		Default: 1

		Note this only makes a difference if the CPU allows recovery
		from a machine check exception. Current x86 CPUs generally
		do not.