For a Raspberry Pi used in remote, automated systems, watchdog timers are essential.
"A watchdog timer (sometimes called a computer operating properly or COP timer, or simply a watchdog) is an electronic timer that is used to detect and recover from computer malfunctions. During normal operation, the computer regularly resets the watchdog timer to prevent it from elapsing, or 'timing out'. If, due to a hardware fault or program error, the computer fails to reset the watchdog, the timer will elapse and generate a timeout signal. The timeout signal is used to initiate corrective action or actions. The corrective actions typically include placing the computer system in a safe state and restoring normal system operation." (Wikipedia)
The watchdog’s kernel module bcm2835_wdt is loaded automatically, so there is no need to add it to /etc/modules. Just check that the module is loaded and that the watchdog device is present with the following commands.
root@cats:~# lsmod | grep wd
bcm2835_wdt 3225 1
root@cats:~# ls -la /dev/watchdog*
crw------- 1 root root 10, 130 Sep 24 10:44 /dev/watchdog
crw------- 1 root root 253, 0 Sep 24 10:44 /dev/watchdog0
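To make the "feeding" concrete, here is a minimal sketch of what a watchdog daemon does with such a device. It assumes the standard Linux watchdog interface, where any write to the device resets the timer and, on drivers that support it, writing the character V just before closing disarms the timer again. The function name pet_watchdog and its arguments are made up for illustration. Only one process can hold the device open, so this will fail while a watchdog daemon is running.

```shell
#!/bin/sh
# Sketch only: pet_watchdog is a hypothetical helper, not part of any package.
# Usage: pet_watchdog DEVICE INTERVAL_SECONDS COUNT
pet_watchdog() {
    dev=$1
    interval=$2
    count=$3
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '.' >&3      # any write resets the countdown
            sleep "$interval"
            i=$((i + 1))
        done
        printf 'V' >&3          # "magic close": ask the driver to disarm the timer
    } 3>"$dev"                  # the device stays open for the whole loop
}

# Example (as root, with no watchdog daemon running):
#   pet_watchdog /dev/watchdog 2 5
```

If the loop is killed before the V is written, the timer keeps running and the board reboots once it elapses, which is exactly the behaviour a watchdog is meant to provide.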
Let’s wake up the dog.
root@cats:~# aptitude install watchdog
Because the watchdog unit file is missing the [Install] section, systemd will not start the watchdog at boot, so we need to link the unit file into multi-user.target.wants ourselves.
root@cats:~# ln /lib/systemd/system/watchdog.service /etc/systemd/system/multi-user.target.wants/watchdog.service
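Whether the manual link is needed at all depends on the unit file shipped with your package version. A small sketch for checking that first; has_install_section is a hypothetical helper name, and the unit path is the one used in this post.

```shell
#!/bin/sh
# Sketch only: has_install_section is a made-up helper name.
# Succeeds if the given unit file contains an [Install] section header.
has_install_section() {
    grep -q '^\[Install\]' "$1"
}

# Example:
#   if has_install_section /lib/systemd/system/watchdog.service; then
#       echo "unit has an [Install] section, a manual link should not be needed"
#   else
#       echo "no [Install] section, link the unit manually"
#   fi
```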
The Raspberry’s watchdog has to be fed at least every 15 seconds to prevent a reboot. The daemon’s default timeout of 60 seconds is more than the hardware accepts, so to avoid the warning "cannot set timeout 60 (errno = 22 = 'Invalid argument')", set watchdog-timeout to 10 seconds. And to avoid getting bitten by the watchdog, set interval to 2 seconds.
It’s also worthwhile to set max-load-1 to 24, so that the Raspberry is rebooted when the 1-minute load average exceeds that value instead of staying overloaded.
So, add the following to /etc/watchdog.conf.
watchdog-device = /dev/watchdog
watchdog-timeout = 10
interval = 2
max-load-1 = 24
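The two timing values only make sense together: the timeout must not exceed what the hardware accepts, and the feeding interval must be shorter than the timeout. The following sketch checks both for a config file. The helper names conf_get and check_watchdog_conf are invented for this example, and the default limit of 15 seconds is taken from the text above rather than read from the hardware.

```shell
#!/bin/sh
# Sketch only: conf_get and check_watchdog_conf are hypothetical helpers.

# conf_get KEY FILE: print the value of the last "KEY = value" line, skipping comments.
conf_get() {
    awk -F= -v key="$1" '
        /^[[:space:]]*#/ { next }
        {
            k = $1; v = $2
            gsub(/[[:space:]]/, "", k)
            gsub(/[[:space:]]/, "", v)
            if (k == key) val = v
        }
        END { if (val != "") print val }
    ' "$2"
}

# check_watchdog_conf FILE [MAX_TIMEOUT]: MAX_TIMEOUT defaults to 15 (assumed hardware limit).
check_watchdog_conf() {
    file=$1
    max=${2:-15}
    timeout=$(conf_get watchdog-timeout "$file")
    interval=$(conf_get interval "$file")
    if [ -z "$timeout" ] || [ -z "$interval" ]; then
        echo "watchdog-timeout or interval is not set" >&2
        return 1
    fi
    if [ "$timeout" -gt "$max" ]; then
        echo "watchdog-timeout $timeout exceeds the limit of $max seconds" >&2
        return 1
    fi
    if [ "$interval" -ge "$timeout" ]; then
        echo "interval $interval must be shorter than watchdog-timeout $timeout" >&2
        return 1
    fi
    echo "ok: timeout=${timeout}s interval=${interval}s"
}

# Example:
#   check_watchdog_conf /etc/watchdog.conf
```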
After a reboot, check whether the watchdog has woken up.
root@cats:~# service watchdog status
watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; static)
   Active: active (running) since Sat 2016-09-24 12:43:38 UTC; 23s ago
Sep 24 12:43:38 raspberrypi watchdog[634]: int=2s realtime=yes sync=no soft=no mla=0 mem=0
Sep 24 12:43:38 raspberrypi watchdog[634]: test=none(0) repair=none(0) alive=/dev/watchdog heartbeat=none to=root no_act=no force=no
Sep 24 12:43:38 raspberrypi watchdog[634]: watchdog now set to 10 seconds
Sep 24 12:43:38 raspberrypi watchdog[634]: hardware watchdog identity: Broadcom BCM2835 Watchdog timer
Sep 24 12:43:38 raspberrypi systemd[1]: Started watchdog daemon.
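The line to look for is "watchdog now set to 10 seconds", which confirms that the hardware accepted the configured timeout. If you want to check this from a script, a sketch like the following pulls the number out of the log. armed_timeout is a made-up helper name, and the example assumes the daemon’s messages are available through journalctl.

```shell
#!/bin/sh
# Sketch only: armed_timeout is a hypothetical helper.
# Reads log lines on stdin and prints the timeout from the last
# "watchdog now set to N seconds" message, or nothing if there is none.
armed_timeout() {
    sed -n 's/.*watchdog now set to \([0-9][0-9]*\) seconds.*/\1/p' | tail -n 1
}

# Example:
#   journalctl -u watchdog -b | armed_timeout
```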
To test it, set off a fork bomb.
root@cats:~# :(){ :|:& };:
Or trigger a system crash through a NULL pointer dereference.
root@cats:~# echo c > /proc/sysrq-trigger
Your Raspberry hasn’t become more stable, but it will no longer get stuck. 🙂
Thanks for the great and simple post. This is the only solution that worked for me.