Gearman Job Server - Too many open files

If you monitor large environments, you may be using Mod_Gearman to spread the execution of checks across multiple worker nodes.

Unfortunately, I ran into a problem: as soon as more than 450 workers tried to connect, the Gearman Job Server consumed 100% of the CPU time, and gearadmin --status hung without returning any information.

In the log file /var/log/gearman-job-server/gearman.log I found the following message:

ERROR 2015-04-14 22:02:54.000000 [ main ] accept(Too many open files) -> libgearman-server/gearmand.cc:788

By default, Linux limits each process to 1024 open files. Every worker connection is a socket and counts against this limit, so the default is too low for services such as MySQL servers or the Gearman Job Server.
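To confirm that a process really is running into this limit, you can compare the number of descriptors it holds with its soft limit. The following is a minimal sketch, not part of Statusengine or Gearman; the function name fd_usage is hypothetical, and it assumes a Linux /proc filesystem and enough privileges (usually root) to read another user's process.

```shell
#!/bin/sh
# Print how many file descriptors a process has open, next to its soft limit.
# Usage: fd_usage <PID>
# Assumption: Linux /proc is mounted; reading another user's process needs root.
fd_usage() {
    pid=$1
    # Each entry in /proc/<PID>/fd is one open descriptor (files, sockets, pipes).
    open=$(ls "/proc/$pid/fd" | wc -l | tr -d ' ')
    # The fourth field of the "Max open files" line is the soft limit.
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    echo "$open of $limit file descriptors in use"
}
```

For example, fd_usage "$(pgrep -o -x gearmand)" would report the usage of the job server, assuming the daemon process is named gearmand as in the ExecStart line of the systemd unit.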

To fix this issue, you need to increase this limit.

Set the limit

SysVinit (/etc/init.d/gearman-job-server start)

Edit the file /etc/init.d/gearman-job-server like this:

# Description:       Enable gearman job server
### END INIT INFO

ulimit -n 16384    # <--- Add this line

prefix=/usr
exec_prefix=${prefix}

And restart: /etc/init.d/gearman-job-server restart

Upstart (service gearman-job-server start)

Edit the file /etc/init/gearman-job-server.conf like this:

respawn

limit nofile 16384 16384 # <--- Add this line
exec start-stop-daemon --start --chuid gearman --exec ...

And restart: service gearman-job-server restart

systemd (systemctl start gearman-job-server)

Edit the file /etc/systemd/system/multi-user.target.wants/gearman-job-server.service like this:

PIDFile=/run/gearman/server.pid

LimitNOFILE=16384

ExecStart=/usr/sbin/gearmand --listen=127.0.0.1 ...

And restart:

systemctl daemon-reload
systemctl restart gearman-job-server
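As an alternative to editing the unit file in place, systemd also reads drop-in files from a <unit>.service.d directory, which package updates do not overwrite. The sketch below writes such a drop-in; the function name write_nofile_dropin and the file name limits.conf are hypothetical, and the directory /etc/systemd/system/gearman-job-server.service.d is an assumption derived from the unit name used above.

```shell
#!/bin/sh
# Write a systemd drop-in that sets LimitNOFILE for one unit.
# Usage: write_nofile_dropin <dropin-dir> <limit>
# Assumption: <dropin-dir> is /etc/systemd/system/gearman-job-server.service.d
write_nofile_dropin() {
    dropin_dir=$1
    limit=$2
    mkdir -p "$dropin_dir" || return 1
    # LimitNOFILE must be placed in the [Service] section.
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dropin_dir/limits.conf"
}
```

Run as root, write_nofile_dropin /etc/systemd/system/gearman-job-server.service.d 16384 would create the drop-in; the daemon-reload and restart commands above are still required afterwards.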

Check your settings

To make sure that the new limit is active, check the file /proc/<PID>/limits, where <PID> is the process ID of the running Gearman Job Server:

root@ubuntu-dev:~# cat /proc/2945/limits | grep -i 'max open files'
Max open files            16384                16384                files
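If you want to run this check from a script, for example after a restart, it can be wrapped in a small function. This is only a sketch under the same assumptions as the manual check above (Linux /proc); the function name check_nofile is hypothetical.

```shell
#!/bin/sh
# Succeed if the soft "Max open files" limit of a process is at least <expected>.
# Usage: check_nofile <PID> <expected>
check_nofile() {
    # The fourth field of the "Max open files" line is the soft limit.
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$1/limits") || return 2
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$2" ]
}
```

A possible call is check_nofile "$(pgrep -o -x gearmand)" 16384 && echo "limit OK", assuming the daemon process is named gearmand as in the ExecStart line of the systemd unit.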


If you are interested, this screenshot shows the system:

gearadmin --status with > 815 workers