# Alarm

EMQX Broker has built-in monitoring and alarm functionality. It currently monitors CPU usage, system and process memory usage, the number of processes, rule engine resource status, and cluster partition and healing, and it can raise alarms on these metrics. Both the activation and the deactivation of an alarm generate an alarm log entry, and the Broker publishes an MQTT message to the topic `$SYS/brokers/<Node>/alarms/activate` or `$SYS/brokers/<Node>/alarms/deactivate`. Users can subscribe to `$SYS/brokers/+/alarms/activate` and `$SYS/brokers/+/alarms/deactivate` to receive alarm notifications.
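A subscriber that listens on the wildcard topics usually needs to know which node raised the alarm and whether it was activated or deactivated. The following is a minimal sketch of that step using only the Python standard library. The function name `parse_alarm_topic` and the node name `emqx@127.0.0.1` are illustrative assumptions, not part of EMQX.

```python
from typing import Optional, Tuple

# The two alarm events named in the topic layout above.
ALARM_EVENTS = ("activate", "deactivate")


def parse_alarm_topic(topic: str) -> Optional[Tuple[str, str]]:
    """Return (node, event) for an alarm topic, or None if the topic is not one.

    Hypothetical helper: it only mirrors the layout
    $SYS/brokers/<Node>/alarms/<activate|deactivate>.
    """
    parts = topic.split("/")
    if len(parts) != 5:
        return None
    prefix, brokers, node, alarms, event = parts
    if (prefix, brokers, alarms) != ("$SYS", "brokers", "alarms"):
        return None
    if not node or event not in ALARM_EVENTS:
        return None
    return node, event


# Example node name is an assumption for illustration.
print(parse_alarm_topic("$SYS/brokers/emqx@127.0.0.1/alarms/activate"))
# prints ('emqx@127.0.0.1', 'activate')
```

In a real client this function would be called from the message callback of whichever MQTT library is in use, with the topic of each received message.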

The payload of an alarm notification message is in JSON format and contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Alarm name |
| details | object | Alarm details |
| message | string | Human-readable alarm description |
| activate_at | integer | A UNIX timestamp in microseconds representing when the alarm was activated |
| deactivate_at | integer / string | A UNIX timestamp in microseconds representing when the alarm was deactivated. For an alarm that is still activated, the value of this field is `infinity`. |
| activated | boolean | Whether the alarm is activated |

Taking the high system memory usage alarm as an example, the notification payload contains the fields listed above.
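As a rough illustration of such a payload, the Python sketch below decodes one into a typed record. It converts the microsecond timestamps and maps the `infinity` marker to `None`. The sample values (alarm name, details, message text, timestamp) are made up for illustration and were not captured from a broker. The `Alarm` class and `parse_alarm` function are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass
class Alarm:
    name: str
    details: Dict[str, Any]
    message: str
    activate_at: datetime
    deactivate_at: Optional[datetime]  # None while the alarm is still activated
    activated: bool


def _from_micros(value: int) -> datetime:
    # Alarm timestamps are UNIX time in microseconds.
    return _EPOCH + timedelta(microseconds=value)


def parse_alarm(payload: str) -> Alarm:
    raw = json.loads(payload)
    deactivate_at = raw["deactivate_at"]
    return Alarm(
        name=raw["name"],
        details=raw["details"],
        message=raw["message"],
        activate_at=_from_micros(raw["activate_at"]),
        # An activated alarm carries the string "infinity" instead of a timestamp.
        deactivate_at=None if deactivate_at == "infinity" else _from_micros(deactivate_at),
        activated=raw["activated"],
    )


# Illustrative payload only: every value below is an assumption.
sample = json.dumps({
    "name": "high_system_memory_usage",
    "details": {"high_watermark": 70},
    "message": "System memory usage is higher than 70%",
    "activate_at": 1600000000000000,
    "deactivate_at": "infinity",
    "activated": True,
})

alarm = parse_alarm(sample)
print(alarm.name, alarm.activated, alarm.deactivate_at)
```

Mapping `infinity` to `None` keeps `deactivate_at` a single optional type, which is easier to handle downstream than a field that is sometimes an integer and sometimes a string.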

An alarm is not generated repeatedly. For example, once the high CPU usage alarm has been activated, the same alarm does not appear again while CPU usage remains high. The alarm is deactivated automatically when the monitored metric returns to normal, and users can also deactivate it manually if they do not care about it. Users can view current (activated) alarms and historical (deactivated) alarms on the Dashboard, and they can also query and manage alarms through the HTTP API provided by EMQX.
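The "activate once, deactivate on recovery" behaviour can be pictured as a small state machine driven by a high and a low watermark, such as the CPU watermarks in the configuration table. The following Python sketch is a simplified model, not EMQX's implementation. The class name `WatermarkAlarm` and the exact comparisons (`>=` for activation, `<=` for deactivation) are assumptions.

```python
from typing import Optional


class WatermarkAlarm:
    """Simplified model of an alarm with high/low watermarks (hypothetical)."""

    def __init__(self, high: float, low: float) -> None:
        if low > high:
            raise ValueError("low watermark must not exceed high watermark")
        self.high = high
        self.low = low
        self.active = False

    def observe(self, value: float) -> Optional[str]:
        """Feed one sample; return "activate", "deactivate", or None."""
        if not self.active and value >= self.high:
            self.active = True
            return "activate"
        if self.active and value <= self.low:
            self.active = False
            return "deactivate"
        # Either nothing changed, or the alarm is already active and the
        # metric is still above the low watermark: no repeated alarm.
        return None


cpu = WatermarkAlarm(high=80, low=60)
events = [cpu.observe(v) for v in [50, 85, 90, 70, 55, 85]]
print(events)
# prints [None, 'activate', None, None, 'deactivate', 'activate']
```

The samples 90 and 70 produce no event because the alarm is already active and usage has not yet dropped to the low watermark. The gap between the two watermarks is what prevents the alarm from flapping around a single threshold.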

EMQX Broker allows users to adjust the alarm function to a certain extent to meet actual needs. The following configuration items are currently available:

| Configuration item | Type | Default value | Description |
| --- | --- | --- | --- |
| os_mon.cpu_check_interval | duration | 60s | Check interval for CPU usage |
| os_mon.cpu_high_watermark | percent | 80% | The high watermark of the CPU usage, which is the threshold to activate the alarm |
| os_mon.cpu_low_watermark | percent | 60% | The low watermark of the CPU usage, which is the threshold to deactivate the alarm |
| os_mon.mem_check_interval | duration | 60s | Check interval for memory usage |
| os_mon.sysmem_high_watermark | percent | 70% | The high watermark of the system memory usage. The alarm is activated when the total memory occupied by the application reaches this value |
| os_mon.procmem_high_watermark | percent | 5% | The high watermark of the process memory usage. The alarm is activated when the memory occupied by a single process reaches this value |
| vm_mon.check_interval | duration | 30s | Check interval for the number of processes |
| vm_mon.process_high_watermark | percent | 80% | The high watermark of the process occupancy rate. The alarm is activated when the ratio of the number of created processes to the maximum number limit reaches this value |
| vm_mon.process_low_watermark | percent | 60% | The low watermark of the process occupancy rate. The alarm is deactivated when the ratio of the number of created processes to the maximum number limit drops to this value |
| alarm.actions | string | log,publish | The actions triggered when the alarm is activated. Currently only `log` and `publish` are supported, which write a log entry and publish a system message respectively |
| alarm.size_limit | integer | 1000 | The maximum number of deactivated alarms that are saved. After the limit is reached, these alarms are cleared according to the FIFO principle |
| alarm.validity_period | duration | 24h | The maximum storage time of deactivated alarms. Expired alarms are cleared |
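To make the interaction between `alarm.size_limit` and `alarm.validity_period` more concrete, the Python sketch below models a store of deactivated alarms. It drops the oldest entries first when the size limit is exceeded and purges entries older than the validity period. This is an illustrative model only. The class `DeactivatedAlarmStore`, its methods, and the exact moment at which eviction happens are assumptions, and alarms are assumed to be added in chronological order.

```python
from collections import deque
from typing import Deque, List, Tuple


class DeactivatedAlarmStore:
    """Toy model of deactivated-alarm retention (hypothetical, not EMQX code)."""

    def __init__(self, size_limit: int = 1000, validity_period: float = 24 * 3600) -> None:
        self.size_limit = size_limit            # mirrors alarm.size_limit
        self.validity_period = validity_period  # mirrors alarm.validity_period, in seconds
        # Entries are (deactivated_at_seconds, alarm_name), oldest first.
        self._items: Deque[Tuple[float, str]] = deque()

    def add(self, name: str, deactivated_at: float) -> None:
        self._items.append((deactivated_at, name))
        # FIFO eviction: discard the oldest entries once the limit is exceeded.
        while len(self._items) > self.size_limit:
            self._items.popleft()

    def purge_expired(self, now: float) -> None:
        # Remove entries that have been stored longer than the validity period.
        while self._items and now - self._items[0][0] > self.validity_period:
            self._items.popleft()

    def names(self) -> List[str]:
        return [name for _, name in self._items]


store = DeactivatedAlarmStore(size_limit=2, validity_period=60)
store.add("a", deactivated_at=0)
store.add("b", deactivated_at=10)
store.add("c", deactivated_at=20)   # exceeds size_limit, so "a" is evicted
print(store.names())                # prints ['b', 'c']
store.purge_expired(now=75)         # "b" is 65s old and expires; "c" is 55s old
print(store.names())                # prints ['c']
```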