Cacti Master Poller Redundancy setup

Hello everyone,

I get asked a lot about how to make a Cacti setup redundant, specifically how to make the master poller redundant.

I thought I would do this writeup on how I created a redundant master poller using MariaDB replication and keepalived.

This is a simple setup, but it can become complicated if you want more out of it. For example, this setup uses MariaDB master-to-slave replication to copy the Cacti database and its changes from the primary server to the secondary. However, changes made on the secondary won't replicate back to the master. If you need that, look into MariaDB master-to-master replication.

For most setups, though, this approach should work well.

 

Here is the general layout. We use keepalived to make sure the application stays accessible in case of a failure. keepalived uses VRRP (Virtual Router Redundancy Protocol) to advertise a virtual IP from either the primary server or the secondary. As you can see from the diagram, both servers have their own IP, and on top of that they share a virtual IP that is active on whichever server is currently in the master state. Users connect to the virtual IP instead of a server IP, so in the case of a failure they would not know they have switched servers. We use MariaDB replication to ensure that the databases are the same on both servers.

 

A layout of a redundant cacti central poller

Now for the setup.

I will assume you have already installed Cacti on each server. If you are using virtual machines, you can even use a clone of a single master poller instance. I would recommend Cacti 1.2.16, since there is a new system service; you will see why soon.

If you don't already have Cacti installed, consider using my Cacti setup wizard script: https://github.com/bmfmancini/cacti-install-wizard

Keepalived setup on the master server

First, we will install and configure keepalived.

sudo apt-get install keepalived

 

After keepalived has been installed, we need to create an initial configuration file for it.

touch /etc/keepalived/keepalived.conf

I use the following config to make server A the master server.

vrrp_instance VRRP1 {
    notify /opt/cacti_service.sh
    state MASTER
    # Specify the network interface to which the virtual address is assigned
    interface enp0s3
    # The virtual router ID must be unique to each VRRP instance that you define
    virtual_router_id 41
    # Set the value of priority higher on the master server than on a backup server
    priority 200
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cacti
    }
    virtual_ipaddress {
        192.168.1.22/24
    }
}

In the above configuration, we are telling keepalived that this is the master server, and we make sure the priority on the master is higher than on the secondary. The server with the highest priority advertises the virtual IP address, so all traffic is pointed at this server. We also put a password on the VRRP instance so that not just any server can join our VRRP cluster.

Next we need to enable and start the keepalived service.

systemctl enable keepalived
systemctl start keepalived
systemctl status keepalived
● keepalived.service - Keepalive Daemon (LVS and VRRP)
Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2020-12-22 11:21:24 EST; 4min 10s ago
Main PID: 15706 (keepalived)
Tasks: 2 (limit: 1147)
Memory: 1.9M
CGroup: /system.slice/keepalived.service
├─15706 /usr/sbin/keepalived --dont-fork
└─15707 /usr/sbin/keepalived --dont-fork
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: Registering Kernel netlink command channel
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: WARNING - script '/opt/hello.sh' is not executable for uid:gid 1001:100 - disabling.
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: Registering gratuitous ARP shared channel
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: (VRRP1) Entering BACKUP STATE (init)
Dec 22 11:21:24 cacti-1 Keepalived_vrrp[15707]: (VRRP1) received lower priority (100) advert from 192.168.1.9 - discarding
Dec 22 11:21:25 cacti-1 Keepalived_vrrp[15707]: (VRRP1) received lower priority (100) advert from 192.168.1.9 - discarding
Dec 22 11:21:26 cacti-1 Keepalived_vrrp[15707]: (VRRP1) received lower priority (100) advert from 192.168.1.9 - discarding
Dec 22 11:21:27 cacti-1 Keepalived_vrrp[15707]: (VRRP1) Entering MASTER STATE

In the status output, make sure this server shows that it has entered the MASTER state. If you see something similar to the above, we are good to move on to the MariaDB config.
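Besides reading the log, you can confirm that the virtual IP is actually bound on this server. The following is a minimal sketch, not part of my original setup: has_vip is a hypothetical helper name, and it assumes the iproute2 `ip` tool and the interface and VIP used in this tutorial.

```shell
#!/bin/bash
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# succeeds if the given address is bound on this host.
has_vip() {
    # -F matches the address literally; the trailing "/" stops
    # 192.168.1.22 from also matching 192.168.1.221
    grep -qF "inet $1/"
}

# Usage on the server (interface and VIP taken from this tutorial):
#   ip -o -4 addr show dev enp0s3 | has_vip 192.168.1.22 && echo "VIP is here"
```

Run it on both servers: only the one in MASTER state should report the VIP.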

MariaDB config on the Master

OK, in this part we need to configure MariaDB to be the master server. We will modify the 50-server.cnf file to add a few entries, and create a replication username and password so the secondary server is allowed to replicate. Since you already have Cacti installed, I assume you already have either MariaDB or MySQL installed; this will work on both.

First, edit 50-server.cnf like so:

nano /etc/mysql/mariadb.conf.d/50-server.cnf

Add the following lines

[mysqld]
log-bin=/var/log/mysql/binary/mysql_binary_log
binlog-do-db=cacti
server-id=1

Save and restart the mariadb service

systemctl restart mysql

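NOTE: you may need to create the directory that holds the binary log and make sure the MariaDB user can write to it. The last part of the log-bin path, mysql_binary_log, is the base name of the log files, not a directory, so the directory to create is:

/var/log/mysql/binary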
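Creating that directory could look like the sketch below. prepare_binlog_dir is a hypothetical helper name, and the default owner mysql:mysql is an assumption based on how Debian/Ubuntu packages usually run MariaDB; check which user your server runs as.

```shell
#!/bin/bash
# Hypothetical helper: create the binary log directory and hand it to
# the database user so MariaDB can write its log files there.
prepare_binlog_dir() {
    local dir=$1
    local owner=${2:-mysql:mysql}   # assumption: MariaDB runs as mysql
    mkdir -p "$dir" || return 1
    chown "$owner" "$dir"
}

# Usage (as root), with the directory from the log-bin setting:
#   prepare_binlog_dir /var/log/mysql/binary
```

Do this before restarting the service, otherwise MariaDB may refuse to start because it cannot open the binary log.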

Next we will enter the mysql shell and create the replication username and password.

CREATE USER 'cactirep'@'192.168.1.9' IDENTIFIED BY 'rep';
GRANT REPLICATION SLAVE ON *.* TO 'cactirep'@'192.168.1.9';

Next, let's check the master status.

show master status\G
*************************** 1. row ***************************
File: mysql_binary_log.000034
Position: 83158330
Binlog_Do_DB: cacti
Binlog_Ignore_DB:
1 row in set (0.000 sec)

Keep a note of the Position number and the binary log file name; we will need them later.
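If you would rather not copy the two values by hand, they can be pulled out of the command output. This is a small sketch under the assumption that the output has the same layout as shown above; parse_master_status is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads `SHOW MASTER STATUS\G` output on stdin and
# prints "<binary log file> <position>" on one line.
parse_master_status() {
    awk -F': *' '
        /^ *File:/     { file = $2 }
        /^ *Position:/ { pos  = $2 }
        END            { print file, pos }
    '
}

# Usage on the master:
#   mysql -e 'SHOW MASTER STATUS\G' | parse_master_status
```

Keep in mind that the position moves as soon as the master writes new changes, so take the reading right before you configure the secondary.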

Now we can move onto the secondary server

Keepalived configuration on secondary server

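As we did with the primary server, we will install and configure keepalived. This is the config I used in my setup. It is essentially the same as the master's, but we set the state to BACKUP and the priority to 100 instead of 200. The global_defs block sets script_user and enable_script_security so that keepalived runs the notify script as root with script security enabled.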

global_defs {
    script_user root
    enable_script_security
}
vrrp_instance VRRP1 {
    state BACKUP
    interface enp0s3
    virtual_router_id 41
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cacti
    }
    virtual_ipaddress {
        192.168.1.22
    }
    notify /opt/cacti_service.sh
}

Keepalived notification script

You saw that the keepalived configuration references a script called cacti_service.sh under notify. This script shuts down the Cacti poller on the server that is not the master. If you are using Cacti older than 1.2.16, it shuts down cron; if you are using 1.2.16, you can have it shut down the Cacti service instead. The issue with shutting down cron, of course, is that other scheduled tasks stop running too. I suppose you could instead delete the Cacti cron file from cron.d and have the script re-create it when the server enters master status.

In this case, when the server enters backup mode cron is stopped. This prevents double polling and also prevents both servers from writing duplicate data to the RRD files. If you copy and paste this code, make sure to chmod 755 it.

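#!/bin/bash
# keepalived calls the notify script with three arguments:
#   $1 = type, $2 = instance name, $3 = new state
TYPE=$1
NAME=$2
STATE=$3
case "$STATE" in
    "MASTER") systemctl start cron    # this server holds the VIP, so it polls
              exit 0
              ;;
    "BACKUP") systemctl stop cron     # standby: stop cron to avoid double polling
              exit 0
              ;;
    "FAULT")  systemctl stop cron
              exit 0
              ;;
    *)        echo "unknown state: $STATE" >> /opt/hello.log
              exit 1
              ;;
esac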
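If you want to try the state handling without touching any services, or to point it at the Cacti service instead of cron, something like the following sketch could help. handle_state, POLLER_UNIT, and DRY_RUN are hypothetical names I am introducing here, and "cactid" as the unit name on 1.2.16 is an assumption; check what the service is called on your install.

```shell
#!/bin/bash
# Sketch: same state handling as the notify script, but the unit to
# control is configurable and DRY_RUN=1 only prints the command.
POLLER_UNIT=${POLLER_UNIT:-cron}   # assumption: set to "cactid" on 1.2.16

handle_state() {
    local state=$1 action
    case "$state" in
        MASTER)       action=start ;;
        BACKUP|FAULT) action=stop ;;
        *)            echo "unknown state: $state" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "systemctl $action $POLLER_UNIT"
    else
        systemctl "$action" "$POLLER_UNIT"
    fi
}

# Usage: DRY_RUN=1 handle_state BACKUP
```

You can also call the real script by hand the same way keepalived does, for example /opt/cacti_service.sh INSTANCE VRRP1 BACKUP, and then check whether cron stopped.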

Mariadb setup on the secondary server

Next we need to make a couple of changes to the 50-server.cnf file on the secondary server, as well as run a few commands in the MySQL shell.

Place the following changes in the file, save, and restart the MariaDB service as you did on the master:
[mysqld]
server-id=2
replicate-do-db=cacti
replicate-ignore-table=cacti.poller_reindex
replicate-ignore-table=cacti.poller_output
replicate-ignore-table=cacti.poller_time

In this case, we are excluding a few transient tables from replication. These tables hold temporary data, so there is no need to replicate them, and doing so could give you some grief.

Next enter the MySQL shell and use the following commands.

CHANGE MASTER TO MASTER_HOST='192.168.1.8', MASTER_USER='cactirep',
MASTER_PASSWORD='rep', MASTER_LOG_FILE='mysql_binary_log.000034',
MASTER_LOG_POS=278790;
START SLAVE;

Remember to match the binary log file and position to the File and Position values from the show master status output we got from the master server. The values shown here are only examples.
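To avoid typos and stray "smart" quotes when typing the statement, you could generate it from the values you noted on the master. This is only a sketch: change_master_sql is a hypothetical helper name, and it does no escaping, so it assumes the password contains no single quotes.

```shell
#!/bin/bash
# Hypothetical helper: print a CHANGE MASTER TO statement for the
# given host, user, password, binary log file, and position.
# %d makes printf reject a position that is not a number.
change_master_sql() {
    local host=$1 user=$2 pass=$3 file=$4 pos=$5
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;\n" \
        "$host" "$user" "$pass" "$file" "$pos"
}

# Usage on the secondary (file and position from your own master):
#   change_master_sql 192.168.1.8 cactirep rep mysql_binary_log.000034 83158330 | mysql
```

Printing the statement first and reading it over before piping it into mysql is the safer habit.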

Now, let's check the slave status. We are looking for a status showing that the slave is waiting for the master to send events.

MariaDB [(none)]> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.8
Master_User: cactirep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql_binary_log.000034
Read_Master_Log_Pos: 83598169
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 83319941
Relay_Master_Log_File: mysql_binary_log.000034
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: cacti
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: cacti.poller_time,cacti.poller_reindex,cacti.poller_output
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 83598169
Relay_Log_Space: 83320251
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 24347
Slave_Transactional_Groups: 193714
1 row in set (0.001 sec)

Success! The slave is now talking to the master and waiting for an event to happen. This is usually something like a new device or a settings change in the Cacti GUI.
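The two fields that matter most in that long output are Slave_IO_Running and Slave_SQL_Running; both should say Yes. Here is a minimal sketch of a check you could run from a monitoring script, assuming the output layout shown above. replication_ok is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads `SHOW SLAVE STATUS\G` output on stdin and
# succeeds only if both replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on the secondary:
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_ok || echo "replication is broken"
```

The patterns end with a colon on purpose, so the Slave_SQL_Running_State line is not mistaken for Slave_SQL_Running.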

Next, do some verification and testing. Go to the primary server and create a test device; you should see that change reflected on the secondary server. Check out my video below to see this in action.

Next you will need to set up some sort of shared storage, e.g. an NFS share, and mount it at /var/www/html/cacti/rra on both servers. This allows both servers to write data to the same RRD files.

As I said, the shared storage could be NFS, but it could also be any other storage technology.
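If you go with NFS, it is worth checking on both servers that the rra directory really is the share and not a local directory, because a failed mount would quietly leave each server writing its own RRD files. A small sketch, assuming NFS and the usual /proc/mounts layout; is_nfs_mounted is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads mount table lines (same format as
# /proc/mounts) on stdin and succeeds if the given path is mounted
# with an NFS filesystem type (nfs or nfs4).
is_nfs_mounted() {
    awk -v path="$1" '
        $2 == path && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    '
}

# Usage on each server:
#   is_nfs_mounted /var/www/html/cacti/rra < /proc/mounts || echo "rra is not on NFS"
```

For other storage technologies the filesystem type in the third column will differ, so adjust the match accordingly.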

 

Let me know if you run into any issues with this tutorial!

 
