You are reading the documentation for Statusengine 3.x - Switch to Version 2.x for old stable

Broker Module

The Statusengine Broker Module is a small C binary that gets loaded into your Naemon or Nagios Core.

It grabs all status and configuration information, encodes it as JSON, and puts it into the Gearman Job Server.
Because Gearman acts as a queue, your monitoring core will not get blocked by a slow database or disk I/O issues.
It is highly recommended to run the Gearman Job Server on the same node as the monitoring core.

I recommend splitting your monitoring node once you reach 50,000 services. This depends heavily on your hardware and check interval, so treat the number as a rough guide.
Depending on your monitoring configuration, this can be a challenging task.
For large environments I recommend using a config generator or something similar.
Once you have split your configuration, you can deploy a new node with Naemon or Nagios, Gearman as the queue, and the Statusengine Broker Module loaded.
The Statusengine UI supports multiple nodes, so you will see all monitored devices in one interface.
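
As a rough illustration of what a config generator might do when splitting, the following Python sketch assigns hosts to nodes so that no node exceeds a service limit. The function name `split_hosts`, the greedy strategy, and the assumption that all services of a host stay on the same node are mine and not part of Statusengine. The 50,000 limit is the rough number mentioned above.

```python
from typing import Dict, List

# Rough per-node limit taken from the text above; tune it for your hardware.
MAX_SERVICES_PER_NODE = 50_000


def split_hosts(services_per_host: Dict[str, int],
                limit: int = MAX_SERVICES_PER_NODE) -> List[List[str]]:
    """Greedily assign hosts to nodes so no node exceeds `limit` services.

    Hypothetical helper: assumes every service of a host is monitored
    by the same node.
    """
    nodes: List[List[str]] = []  # host names per node
    loads: List[int] = []        # service count per node
    # Place the largest hosts first so they are less likely to leave gaps.
    for host, count in sorted(services_per_host.items(),
                              key=lambda item: (-item[1], item[0])):
        if count > limit:
            raise ValueError(f"{host} alone exceeds the per-node limit")
        for index, load in enumerate(loads):
            if load + count <= limit:
                nodes[index].append(host)
                loads[index] += count
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([host])
            loads.append(count)
    return nodes
```

Each inner list is the set of hosts whose configuration would go to one monitoring node. A real generator would also have to account for dependencies, check intervals, and network location.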

Naemon:

  • Naemon > 1.0.1

Nagios:

  • Nagios > 4.0.1

Don't have a running installation yet? Check the tutorials for Naemon or Nagios.

Please select your operating system first. If your operating system is not in the list, pick the closest match. You can still install Statusengine on your system even if it is not listed.

  1. Load EPEL Repository and install CentOS dependencies
    yum install epel-release
    yum check-update
    
    yum group install "Development Tools"
  2. Install dependencies
    {[{commands[selectedOs].dependencies}]}
  3. Start Gearman-Job-Server on system boot
    systemctl enable gearmand
    systemctl start gearmand
  4. Download and Install Statusengine Broker Module
    cd /tmp
    git clone https://github.com/statusengine/module.git module
    cd module/
    make all
    
    mkdir -p /opt/statusengine/module
    cp src/bin/naemon/statusengine-1-0-5.o /opt/statusengine/module/statusengine-naemon-1-0-5.o
    cp src/bin/naemon/statusengine.o /opt/statusengine/module/statusengine-naemon.o
    cp src/bin/nagios/statusengine.o /opt/statusengine/module/statusengine-nagios.o
    
  5. Load the Broker Module

    For Naemon > 1.0.1 add the following to your naemon.cfg

    broker_module=/opt/statusengine/module/statusengine-naemon.o

    For Naemon >= 1.0.5 add the following to your naemon.cfg

    broker_module=/opt/statusengine/module/statusengine-naemon-1-0-5.o

    For Nagios > 4.0.1 add the following to your nagios.cfg

    broker_module=/opt/statusengine/module/statusengine-nagios.o

    Recommended Broker Options to use with Statusengine UI

    use_process_data=0 use_system_command_data=0 use_comment_data=0 use_external_command_data=0 use_flapping_data=0 use_notification_data=0 use_program_status_data=0 use_contact_status_data=0 use_contact_notification_data=0 use_event_handler_data=0 use_object_data=0 use_restart_data=1 use_service_perfdata=1

    Attention Nagios users

    If you are using a Nagios version < 4.4.0 with systemd, you should disable processing of log entries via

    use_log_data=0

    to avoid issues when the Nagios process is restarted or stopped.

  6. Restart your Monitoring Tool
    {[{commands[selectedOs].restartMonitoring}]}
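
Before restarting, it can help to double-check which broker_module lines the config file actually contains. The following Python sketch extracts them from the config text. The function name `parse_broker_modules` is hypothetical, and the parser only understands the simple `key=value` lines and `#` comments shown on this page, not every detail of the Naemon or Nagios config syntax.

```python
from typing import Dict, List, Tuple


def parse_broker_modules(config_text: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return (module_path, options) for every broker_module line.

    Hypothetical helper, not part of Statusengine, Naemon or Nagios.
    """
    modules = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "broker_module":
            continue
        parts = value.split()
        if not parts:
            continue
        # The first token is the module path, the rest are broker options.
        path, option_tokens = parts[0], parts[1:]
        options = {}
        for token in option_tokens:
            name, _, setting = token.partition("=")
            options[name] = setting
        modules.append((path, options))
    return modules
```

Reading the file, for example with `open("/path/to/naemon.cfg").read()`, and passing the text to the function shows at a glance whether the module path and options are the ones you intended.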

By default, the Statusengine Event Broker Module writes all event data to the Gearman Job Server.
Usually some of this data is not needed, for example 'statusngin_externalcommands' or 'statusngin_systemcommands'.
In large environments this unnecessary data slows down your database and increases disk I/O, so you can disable it using the broker options.

To use one or more broker options, add them to the broker_module line in your monitoring config, as in this example:

broker_module=/opt/statusengine/module/statusengine-naemon-1-0-5.o use_external_command_data=0 use_system_command_data=1
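
If you generate your configuration, the line above can be assembled from a dictionary of options. This is a minimal Python sketch. `build_broker_line` is a hypothetical helper name, and it does no validation of option names.

```python
from typing import Mapping, Union


def build_broker_line(module_path: str,
                      options: Mapping[str, Union[int, str]]) -> str:
    """Build a broker_module config line from a path and broker options.

    Hypothetical helper: options are appended as space-separated
    name=value pairs, in the order given.
    """
    parts = [f"broker_module={module_path}"]
    for name, value in options.items():
        parts.append(f"{name}={value}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_broker_line(
        "/opt/statusengine/module/statusengine-naemon-1-0-5.o",
        {"use_external_command_data": 0, "use_system_command_data": 1},
    ))
```

Run as a script, this prints the same line as the example above.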

List of available broker options:

| Option name | Affected queue | Description | Recommended to disable | Deprecated |
|---|---|---|---|---|
| use_host_status_data | statusngin_hoststatus | Hoststatus table | No | |
| use_service_status_data | statusngin_servicestatus | Servicestatus table | No | |
| use_service_check_data | statusngin_servicechecks | Will update the servicechecks table | Depends | |
| use_host_check_data | statusngin_hostchecks | Will update the hostchecks table | Depends | |
| use_state_change_data | statusngin_statechanges | If disabled, the table 'statehistory' for hosts and services will not get updated anymore | No | |
| use_log_data | statusngin_logentries | Will update the table logentries | Depends | |
| use_comment_data | statusngin_comments | Not implemented yet | Yes | |
| use_acknowledgement_data | statusngin_acknowledgements | If disabled, the message of acknowledgements will not be saved in the database anymore. Will not affect the 'problem_has_been_acknowledged' field in host-/servicestatus tables. | No | |
| use_downtime_data | statusngin_downtimes | If disabled, downtime information will not be saved in the database anymore. Will not affect the 'scheduled_downtime_depth' field in host-/servicestatus tables. | No | |
| use_contact_notification_method_data | statusngin_contactnotificationmethod | Event that contains all notification data | No | |
| enable_ochp | statusngin_ochp | Basically the same as statusngin_hostchecks. You can use this as an alternative to the classic OCHP command | Depends | |
| enable_ocsp | statusngin_ocsp | Basically the same as statusngin_servicechecks. You can use this as an alternative to the classic OCSP command | Depends | |
| use_restart_data | statusngin_core_restart | Tells the Statusengine Worker that a restart of the Monitoring Core occurred | No! | |
| use_service_perfdata | statusngin_service_perfdata | Used by Statusengine Worker to process performance data | Depends | |
| gearman_server_addr | n/a | Address of the Gearman-Job-Server (default: 127.0.0.1) | | |
| gearman_server_port | n/a | Port of the Gearman-Job-Server (default: 4730) | | |
| gearman_server_list | n/a | A comma-separated list of Gearman-Job-Servers used as failover servers. gearman_server_list=127.0.0.1:4730,192.168.10.5:4730 | | |
| gearman_dup_server_list | n/a | A comma-separated list of Gearman-Job-Servers. All records will be pushed to all servers. gearman_dup_server_list=127.0.0.1:4730,192.168.10.5:4730 | | |
| use_process_data | statusngin_processdata | Not used anymore by Statusengine 3. | Yes | Yes |
| use_system_command_data | statusngin_systemcommands | Not used anymore by Statusengine 3. | Yes | Yes |
| use_external_command_data | statusngin_externalcommands | Not used anymore by Statusengine 3. | Yes | Yes |
| use_flapping_data | statusngin_flappings | Not used anymore by Statusengine 3. | Yes | Yes |
| use_program_status_data | statusngin_programmstatus | Not used anymore by Statusengine 3. | Yes | Yes |
| use_notification_data | statusngin_notifications | Not used anymore by Statusengine 3. | Yes | Yes |
| use_contact_status_data | statusngin_contactstatus | Not used anymore by Statusengine 3. | Yes | Yes |
| use_contact_notification_data | statusngin_contactnotificationdata | Not used anymore by Statusengine 3. | Yes | Yes |
| use_event_handler_data | statusngin_eventhandler | Not used anymore by Statusengine 3. | Yes | Yes |
| use_object_data | statusngin_objects | Statusengine 3 doesn't save a dump of the monitoring configuration to the database anymore. | Yes | Yes |

Only use IP addresses!

Due to a bug in the Gearman library, you should only use IP addresses to connect to a Gearman-Job-Server. Do not use host names like localhost!
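
To catch host names before they end up in the config, a value for gearman_server_list or gearman_dup_server_list can be checked up front. The following Python sketch uses the standard `ipaddress` module. The function name `parse_server_list` is hypothetical, and it assumes every entry has the `ip:port` form used in the examples above.

```python
import ipaddress
from typing import List, Tuple


def parse_server_list(value: str) -> List[Tuple[str, int]]:
    """Split a comma-separated server list into (ip, port) pairs.

    Hypothetical helper: assumes each entry looks like "ip:port" and
    raises ValueError for anything that is not an IP address.
    """
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        host, sep, port = entry.rpartition(":")
        if not sep:
            raise ValueError(f"missing port in {entry!r}")
        # Raises ValueError for host names such as "localhost".
        ipaddress.ip_address(host)
        servers.append((host, int(port)))
    return servers
```

Calling it with `"localhost:4730"` raises a `ValueError`, while the example value from the table returns two `(ip, port)` pairs.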

The default OCSP and OCHP commands allow you to run a command or script after every service or host check.
This severely affects the performance of your monitoring core!
To avoid this, Statusengine's "OCSP"/"OCHP" creates a dedicated Gearman queue and stores a copy of every host check and service check event in it.
You can consume the events in the queue with a small script to get the data you need.

Data Example:
This example shows all fields you can receive via the Statusengine OCHP and OCSP queues.


{
"servicecheck": {
  "return_code": 0,
  "latency": 0,
  "execution_time": 0.016237,
  "check_type": 0,
  "perf_data": "load1=0.020;4.000;7.000;0; load5=0.030;6.000;8.000;0; load15=0.000;6.000;8.000;0;",
  "long_output": null,
  "output": "OK - load average: 0.02, 0.03, 0.00",
  "command_name": "check_by_nrpe!check_load",
  "command_line": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$",
  "service_description": "CPU Load",
  "host_name": "CrateDB1-Naemon",
  "current_attempt": 1,
  "max_attempts": 3,
  "state_type": 1,
  "state": 0,
  "timeout": 60,
  "start_time": 1500727171,
  "end_time": 1500727171,
  "early_timeout": 0
},
"timestamp": 1500727171,
"attr": 0,
"flags": 0,
"type": 701
}
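
To show how the numeric fields in such a record can be read, here is a small Python sketch that turns a payload into a one-line summary. The function name `summarize_servicecheck` is hypothetical. The state mapping (0 to 3) and the state type mapping (0 = SOFT, 1 = HARD) are the usual Naemon/Nagios conventions, not something defined by this page.

```python
import json

# Usual Naemon/Nagios service state codes (an assumption, see lead-in).
SERVICE_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
STATE_TYPES = {0: "SOFT", 1: "HARD"}


def summarize_servicecheck(payload: str) -> str:
    """Return a human-readable one-line summary of a servicecheck record."""
    check = json.loads(payload)["servicecheck"]
    state = SERVICE_STATES.get(check["state"], "UNKNOWN")
    state_type = STATE_TYPES.get(check["state_type"], "UNKNOWN")
    return (f"{check['host_name']}/{check['service_description']}: "
            f"{state} ({state_type}, attempt "
            f"{check['current_attempt']}/{check['max_attempts']}) - "
            f"{check['output']}")
```

For the record above, this yields a line that starts with the host name and service description, followed by the decoded state and the plugin output.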

PHP Example Script:
The PHP example script below fetches all jobs from the queue and prints the data to the shell.
Press Ctrl+C (^C) to exit.


<?php
$StatusengineOcsp = new StatusengineOcspProcessor();
$StatusengineOcsp->loop();

//Example Class
class StatusengineOcspProcessor{

    /**
     * @var \GearmanWorker
     */
    private $GearmanWorker;

    public function __construct(){
        //Create the GearmanWorker PHP Object
        $this->GearmanWorker = new \GearmanWorker();

        //Connect to the Gearman-Job-Server
        $this->GearmanWorker->addServer('127.0.0.1', 4730);

        //Consume data from the queue statusngin_ocsp and pass it
        //to the method handleOcsp of this class
        $this->GearmanWorker->addFunction('statusngin_ocsp', [$this, 'handleOcsp']);
    }

    public function loop(){
        //Start infinite loop to consume incoming jobs
        while(true){
            $this->GearmanWorker->work();
        }
    }

    public function handleOcsp($job){
        //Print service check data to stdout
        print_r(json_decode($job->workload()));
    }

}

Bash Example Script:

This example prints one job to the shell and exits. It requires jq:

apt-get install jq

#!/bin/bash
gearman -w -c 1 -f statusngin_ocsp | jq .


More languages are available in the official Gearman documentation.
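
Whatever language you pick, the handler mostly has to decode the JSON workload. As a library-independent illustration, this Python sketch shows the decoding half of a worker: it extracts the metrics from the perf_data field. The Gearman client itself is deliberately left out. The names `parse_perf_data` and `handle_ocsp` are hypothetical, and the parser only covers the simple `label=value;warn;crit;...` form shown in the data example (no quoted labels containing spaces, no threshold ranges).

```python
import json
import re
from typing import Dict, Optional

# Numeric value followed by an optional unit, e.g. "0.020" or "12ms".
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def _to_float(text: str) -> Optional[float]:
    """Convert a threshold field to float, or None if empty or not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_perf_data(perf_data: str) -> Dict[str, dict]:
    """Parse simple perfdata like 'load1=0.020;4.000;7.000;0;'."""
    metrics = {}
    for token in perf_data.split():
        label, sep, rest = token.partition("=")
        if not sep:
            continue
        fields = rest.split(";")
        match = _VALUE_RE.match(fields[0])
        if match is None:
            continue
        # Pad so that missing warn/crit fields become empty strings.
        fields += [""] * (3 - len(fields))
        metrics[label.strip("'")] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn": _to_float(fields[1]),
            "crit": _to_float(fields[2]),
        }
    return metrics


def handle_ocsp(workload: str) -> Dict[str, dict]:
    """Decode one statusngin_ocsp job workload and return its metrics."""
    check = json.loads(workload)["servicecheck"]
    return parse_perf_data(check.get("perf_data") or "")
```

A worker written with any Gearman client would call `handle_ocsp` with the job's workload string, in the same way the PHP example calls `handleOcsp`.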

If you want to update to a new version of the Statusengine event broker, create a backup of your currently installed version first.

Stop your monitoring engine, for example: systemctl stop naemon

Then repeat the installation steps.
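
The backup step can be as simple as copying the module directory. Here is a minimal Python sketch. The function name `backup_module_dir` and the timestamped target name are my own choices, and the backup location is up to you.

```python
import shutil
import time
from pathlib import Path


def backup_module_dir(module_dir: str, backup_root: str) -> Path:
    """Copy the module directory to <backup_root>/<name>-<timestamp>.

    Hypothetical helper: returns the path of the new backup directory.
    """
    source = Path(module_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree creates the target (and missing parents) and copies all files.
    shutil.copytree(source, target)
    return target
```

For the installation described above, `module_dir` would be `/opt/statusengine/module`.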

To check whether the Statusengine Broker Module saves all events to the Gearman Job Server, you can use the gearadmin --status command.
It lists all existing queues with their waiting jobs, active jobs, and the number of workers connected to each queue.

root@naemon:~# gearadmin --status
#                              Jobs    Active   Worker
statusngin_contactstatus         2       0       0
statusngin_servicestatus        52       0       0
statusngin_hoststatus           10       0       0
statusngin_servicechecks         3       0       0
statusngin_ocsp                  3       0       0
statusngin_statechanges          0       0       0
statusngin_hostchecks            1       0       0
statusngin_logentries            4       0       0
In this example the broker module puts data into the queues, but no process is consuming it (0 connected workers).
So the broker module works fine.
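
The same check can be automated by parsing the command output. The following Python sketch reads text in the four-column layout shown above and reports queues that hold jobs but have no connected worker. The names `QueueStatus`, `parse_gearadmin_status` and `unconsumed_queues` are hypothetical, and the sketch assumes whitespace-separated columns.

```python
from typing import Dict, List, NamedTuple


class QueueStatus(NamedTuple):
    jobs: int     # jobs waiting in the queue
    active: int   # jobs currently being processed
    workers: int  # workers connected to the queue


def parse_gearadmin_status(output: str) -> Dict[str, QueueStatus]:
    """Parse 'gearadmin --status' output into a dict keyed by queue name."""
    queues = {}
    for line in output.splitlines():
        parts = line.split()
        # Expect: <queue> <jobs> <active> <workers>; skip everything else.
        if len(parts) != 4 or not all(p.isdigit() for p in parts[1:]):
            continue
        queues[parts[0]] = QueueStatus(*(int(p) for p in parts[1:]))
    return queues


def unconsumed_queues(queues: Dict[str, QueueStatus]) -> List[str]:
    """Return the names of queues that have jobs but no connected worker."""
    return sorted(name for name, q in queues.items()
                  if q.jobs > 0 and q.workers == 0)
```

The output text could come from `subprocess.run(["gearadmin", "--status"], capture_output=True, text=True).stdout`. For the example above, every queue except statusngin_statechanges would be reported, which matches the observation that the broker module is writing data and no worker is consuming it.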

In addition, you can run your monitoring core in the foreground to debug issues or get more information about what's going on.

root@naemon:~# sudo -u naemon /bin/bash
naemon@naemon:~$ /opt/naemon/bin/naemon /opt/naemon/etc/naemon/naemon.cfg

Naemon Core 1.0.6-source
Copyright (c) 2013-present Naemon Core Development Team and Community Contributors
Copyright (c) 2009-2013 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
License: GPL

Website: http://www.naemon.org
Naemon 1.0.6-source starting... (PID=22687)
Local time is Tue May 16 19:57:07 CEST 2017
qh: Socket '/opt/naemon/var/naemon.qh' successfully initialized
nerd: Channel hostchecks registered successfully
nerd: Channel servicechecks registered successfully
nerd: Fully initialized and ready to rock!
wproc: Successfully registered manager as @wproc with query handler
wproc: Registry request: name=Core Worker 22689;pid=22689
wproc: Registry request: name=Core Worker 22690;pid=22690

Event broker module '/opt/statusengine/module/statusengine-naemon-1-0-5.o' initialized successfully.

Successfully launched command file worker with pid 22705