The Statusengine Broker Module is a small C binary that gets loaded into
your Naemon or Nagios Core.
It grabs all status and configuration information, encodes it as JSON,
and puts it into the Gearman Job Server.
Because of the queuing engine (Gearman), your monitoring core will not be blocked by a slow
database or disk I/O issues.
It is highly recommended to run the Gearman Job Server on the same node as the monitoring core.
I recommend splitting your monitoring node once you reach about 50,000 services.
This depends heavily on your hardware and check interval, so treat the number as a rough guide only.
Depending on your monitoring configuration, splitting can be a challenging task.
For large environments I recommend using a config generator or something similar.
Once you have finished splitting your configuration, you can deploy a new node with Naemon or Nagios,
Gearman as the queue, and the Statusengine Broker Module loaded.
The Statusengine UI comes with support for multiple nodes, so you will see all monitored devices in one interface.
Naemon:
Please select your operating system first. If it is not in the list, pick the entry that matches your system best. You can still install Statusengine even if your operating system is not listed.
yum install epel-release
yum check-update
yum group install "Development Tools"

yum install epel-release
yum check-update
yum group install "Development Tools"
dnf --enablerepo=PowerTools install json-c-devel
systemctl enable gearmand
systemctl start gearmand
cd /tmp
git clone https://github.com/statusengine/module.git module
cd module/
make all
mkdir -p /opt/statusengine/module
cp src/bin/naemon/statusengine-1-1-0.o /opt/statusengine/module/statusengine-naemon-1-1-0.o
cp src/bin/naemon/statusengine-1-0-5.o /opt/statusengine/module/statusengine-naemon-1-0-5.o
cp src/bin/naemon/statusengine.o /opt/statusengine/module/statusengine-naemon.o
cp src/bin/nagios/statusengine.o /opt/statusengine/module/statusengine-nagios.o
For Naemon > 1.0.1 add the following to your naemon.cfg
broker_module=/opt/statusengine/module/statusengine-naemon.o
For Naemon >= 1.0.5 add the following to your naemon.cfg
broker_module=/opt/statusengine/module/statusengine-naemon-1-0-5.o
For Naemon >= 1.1.0 add the following to your naemon.cfg
broker_module=/opt/statusengine/module/statusengine-naemon-1-1-0.o
For Nagios > 4.0.1 add the following to your nagios.cfg
broker_module=/opt/statusengine/module/statusengine-nagios.o
use_process_data=0 use_system_command_data=0 use_comment_data=0 use_external_command_data=0 use_flapping_data=0 use_notification_data=0 use_program_status_data=0 use_contact_status_data=0 use_contact_notification_data=0 use_event_handler_data=0 use_object_data=0 use_restart_data=1 use_service_perfdata=1
If you are using a Nagios version < 4.4.0 and systemd, you should disable processing of log entries via
use_log_data=0
to avoid issues when the Nagios process is restarted or stopped.
The Naemon developers increased the CURRENT_NEB_API_VERSION. Please make sure to install the latest version of the Statusengine Broker Module.
By default the Statusengine Event Broker Module writes all event data to the Gearman Job Server.
There is usually some data you will not need, for example 'statusngin_externalcommands' or 'statusngin_systemcommands'.
In large environments this unnecessary data slows down your database and increases your disk I/O, so you can disable it using the broker options.
To use one or more broker options, append them to the broker_module line in your monitoring config, as in this example:
broker_module=/opt/statusengine/module/statusengine-naemon-1-0-5.o use_external_command_data=0 use_system_command_data=1
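If you generate your monitoring configuration, the broker line above can be assembled from a set of on/off switches. The following is only a minimal sketch; the helper name `build_broker_line` is hypothetical and not part of Statusengine, and it assumes every option is a simple 0/1 flag appended after the module path.

```python
def build_broker_line(module_path, options):
    """Return a broker_module line with options appended as name=0/1 pairs."""
    parts = [f"broker_module={module_path}"]
    for name, enabled in options.items():
        # True becomes 1 (send this data), False becomes 0 (drop it)
        parts.append(f"{name}={1 if enabled else 0}")
    return " ".join(parts)


line = build_broker_line(
    "/opt/statusengine/module/statusengine-naemon-1-0-5.o",
    {"use_external_command_data": False, "use_system_command_data": True},
)
print(line)
```

The printed line can be pasted into naemon.cfg or nagios.cfg as is.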
List of available broker options:
gearman_server_list=127.0.0.1:4730,192.168.10.5:4730
gearman_dup_server_list=127.0.0.1:4730,192.168.10.5:4730
Due to a bug in the Gearman library you should only use IP addresses to connect to a Gearman Job Server. Do not use host names like localhost!
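Before deploying a server list, you may want to check that it really contains IP addresses only. This is a rough sketch using Python's standard `ipaddress` module; the function name `hostname_entries` is made up for illustration, and it assumes IPv4 entries in the `host:port` form shown above.

```python
import ipaddress


def hostname_entries(server_list):
    """Return the entries of a host:port list whose host part is not an IP address."""
    offenders = []
    for entry in server_list.split(","):
        entry = entry.strip()
        host = entry.split(":", 1)[0]  # assumes IPv4 host:port notation
        try:
            ipaddress.ip_address(host)
        except ValueError:
            # Not parseable as an IP address, e.g. "localhost"
            offenders.append(entry)
    return offenders


print(hostname_entries("127.0.0.1:4730,192.168.10.5:4730"))
print(hostname_entries("localhost:4730,127.0.0.1:4730"))
```

An empty result means every entry uses an IP address.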
The default OCSP and OCHP commands allow you to run a command or script after every service or host check.
This severely affects the performance of your monitoring core!
To avoid this issue, Statusengine's "OCSP"/"OCHP" creates a dedicated Gearman queue
and stores a copy of every host check and service check event in it.
You can consume the events in this queue with a small script to get the data you need.
Data Example:
This example shows all fields you can receive via the Statusengine OCHP and OCSP queues.
{
    "servicecheck": {
        "return_code": 0,
        "latency": 0,
        "execution_time": 0.016237,
        "check_type": 0,
        "perf_data": "load1=0.020;4.000;7.000;0; load5=0.030;6.000;8.000;0; load15=0.000;6.000;8.000;0;",
        "long_output": null,
        "output": "OK - load average: 0.02, 0.03, 0.00",
        "command_name": "check_by_nrpe!check_load",
        "command_line": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$",
        "service_description": "CPU Load",
        "host_name": "CrateDB1-Naemon",
        "current_attempt": 1,
        "max_attempts": 3,
        "state_type": 1,
        "state": 0,
        "timeout": 60,
        "start_time": 1500727171,
        "end_time": 1500727171,
        "early_timeout": 0
    },
    "timestamp": 1500727171,
    "attr": 0,
    "flags": 0,
    "type": 701
}
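To give an idea of what a consumer can do with such a record, here is a small sketch that decodes a payload, builds a one-line summary, and extracts the metric values from perf_data. The function names are hypothetical. The state names follow the usual Nagios/Naemon convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), which is an assumption not stated on this page, and the perf_data parser only handles the simple unitless `label=value;warn;crit;min;` form used in the example.

```python
import json

# Conventional Nagios/Naemon service states (assumption, see text above)
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def summarize(payload):
    """Return a one-line summary of a service check payload."""
    check = json.loads(payload)["servicecheck"]
    state = STATE_NAMES.get(check["state"], "?")
    return f'{check["host_name"]} / {check["service_description"]}: {state} ({check["output"]})'


def parse_perf_data(perf_data):
    """Map each label to its current value; thresholds are ignored."""
    metrics = {}
    for item in perf_data.split():
        label, _, rest = item.partition("=")
        metrics[label] = float(rest.split(";")[0])  # unitless values only
    return metrics


# Shortened copy of the data example above
sample = json.dumps({"servicecheck": {
    "host_name": "CrateDB1-Naemon",
    "service_description": "CPU Load",
    "state": 0,
    "output": "OK - load average: 0.02, 0.03, 0.00",
    "perf_data": "load1=0.020;4.000;7.000;0; load5=0.030;6.000;8.000;0; load15=0.000;6.000;8.000;0;",
}})

print(summarize(sample))
print(parse_perf_data(json.loads(sample)["servicecheck"]["perf_data"]))
```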
PHP Example Script:
This PHP example script fetches all jobs from the queue
and prints the data to the shell.
Press Ctrl+C (^C) to exit.
<?php
$StatusengineOcsp = new StatusengineOcspProcessor();
$StatusengineOcsp->loop();

//Example Class
class StatusengineOcspProcessor{
    /**
     * @var \GearmanWorker
     */
    private $GearmanWorker;

    public function __construct(){
        //Create the GearmanWorker PHP Object
        $this->GearmanWorker = new \GearmanWorker();

        //Connect to the Gearman-Job-Server
        $this->GearmanWorker->addServer('127.0.0.1', 4730);

        //Consume data from the queue statusngin_ocsp and pass it
        //to the method handleOcsp of this class
        $this->GearmanWorker->addFunction('statusngin_ocsp', [$this, 'handleOcsp']);
    }

    public function loop(){
        //Start infinite loop to consume incoming jobs
        while(true){
            $this->GearmanWorker->work();
        }
    }

    public function handleOcsp($job){
        //Print service check data to stdout
        print_r(json_decode($job->workload()));
    }
}
Bash Example Script:
apt-get install jq
This example will print one job to the shell and exit.
#!/bin/bash
gearman -w -c 1 -f statusngin_ocsp | jq .
More languages are available in the official Gearman documentation.
If you want to update to a new version of the Statusengine event broker, create a backup of your currently
installed version first.
Stop your monitoring engine, for example with: systemctl stop naemon
Then repeat the installation steps.
To check whether the Statusengine Broker Module saves all events to the Gearman Job Server,
you can use the gearadmin --status command.
This displays all existing queues with the number of waiting jobs, the number of jobs currently being processed,
and how many workers are connected to each queue.
                           # Jobs   Active   Worker
root@naemon:~# gearadmin --status
statusngin_contactstatus   2        0        0
statusngin_servicestatus   52       0        0
statusngin_hoststatus      10       0        0
statusngin_servicechecks   3        0        0
statusngin_ocsp            3        0        0
statusngin_statechanges    0        0        0
statusngin_hostchecks      1        0        0
statusngin_logentries      4        0        0
In this example the broker module puts data into the queues, but no process is consuming it (0 connected workers).
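If you want to automate this check, the status output can be scanned for queues that hold jobs but have no worker attached. The following is a sketch only; the function name `unconsumed_queues` is hypothetical, and it assumes each data line consists of a queue name followed by three whitespace-separated numbers, as in the listing above.

```python
def unconsumed_queues(status_output):
    """Return {queue: waiting_jobs} for queues with jobs but no connected worker."""
    stalled = {}
    for line in status_output.splitlines():
        fields = line.split()
        # Skip prompts, header lines and anything that is not "name n n n"
        if len(fields) != 4 or not all(f.isdigit() for f in fields[1:]):
            continue
        name, jobs, workers = fields[0], int(fields[1]), int(fields[3])
        if jobs > 0 and workers == 0:
            stalled[name] = jobs
    return stalled


sample = """\
root@naemon:~# gearadmin --status
statusngin_contactstatus 2 0 0
statusngin_servicestatus 52 0 0
statusngin_statechanges 0 0 0
"""
print(unconsumed_queues(sample))
```

On a live system the input could come from running gearadmin --status through `subprocess.run` and passing its captured output to the function.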
In addition, you can also run your monitoring core in foreground to debug issues or get more information about what's going on.
root@naemon:~# sudo -u naemon /bin/bash
naemon@naemon:~$ /opt/naemon/bin/naemon /opt/naemon/etc/naemon/naemon.cfg

Naemon Core 1.0.6-source
Copyright (c) 2013-present Naemon Core Development Team and Community Contributors
Copyright (c) 2009-2013 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
License: GPL

Website: http://www.naemon.org
Naemon 1.0.6-source starting... (PID=22687)
Local time is Tue May 16 19:57:07 CEST 2017
qh: Socket '/opt/naemon/var/naemon.qh' successfully initialized
nerd: Channel hostchecks registered successfully
nerd: Channel servicechecks registered successfully
nerd: Fully initialized and ready to rock!
wproc: Successfully registered manager as @wproc with query handler
wproc: Registry request: name=Core Worker 22689;pid=22689
wproc: Registry request: name=Core Worker 22690;pid=22690
Event broker module '/opt/statusengine/module/statusengine-naemon-1-0-5.o' initialized successfully.
Successfully launched command file worker with pid 22705