Maintenance procedures

Database maintenance

Rudder uses two backends to store information as of now: LDAP and SQL

To achieve this, OpenLDAP and PostgreSQL are installed with Rudder.

However, like every database, they require a small amount of maintenance to keep operating well. Thus, this chapter will introduce you to the basic maintenance procedure you might want to know about these particular database implementations.

Automatic PostgreSQL table maintenance

Rudder uses an automatic mechanism to automate the archival and pruning of the reports database.

By default, this system will:

Remove reports older than 4 days
Delete all historized compliance levels after 15 days
Delete logs older than twice the maximum agent run interval

Logs are extra information on agent runs, that are used for debugging purpose, and are not used in compliance status. It thus reduces the work overhead by only making Rudder handle relevant reports (fresh enough) and putting aside old ones.

This is configurable in /opt/rudder/etc/rudder-web.properties, by changing the following configuration elements:

rudder.batch.reportscleaner.archive.TTL: Set the maximum report age before archival (in days)
rudder.batch.reportscleaner.delete.TTL: Set the maximum report age before deletion (in days)
rudder.batch.reportscleaner.compliancelevels.delete.TTL : Set the maximum compliance age before removal (in days)
rudder.batch.reportsCleaner.deleteLogReport.TTL : Set the maximum retention of logs reports, (in runs number, using Nx notation, e.g. 2x for two runs, or in minutes)

The default values are OK for systems under moderate load, and should be adjusted in case of excessive database bloating (see next section).

The estimated disk space consumption, with a 5 minutes agent run frequency, is 500 to 900 kB per Directive, per day and per node, plus 150 kB per Directive per node per day for archived reports, plus 150 kB per Directive per node per day for compliance level, which equate to is roughly 5 to 7 MB per Directive per two weeks and per node.

Thus, 25 directives on 100 nodes, with the default reports retention policy, would take 4 to 6 GB, and 25 directives on 1000 nodes with a 1 hour agent execution period with the default reports retention policy would take 5 to 8 GB.

PostgreSQL database bloating

PostgreSQL database can grow over time, even if the number of nodes and directives remain the same. This is because even if the database is regularly cleaned by Rudder as requested, the physical storage backend does not reclaim space on the hard drive, resulting in a "fragmented" database.

This is often not an issue, as recent versions of PostgreSQL handle this correctly, and new reports sent by the nodes to Rudder should fill the blanks in the database. This task is handled by the auto-vacuum process, which periodically cleans the storage regularly to prevent database bloating.

However, in very rare occasions, the database may grow significantly, resulting in large disk usage, and slower performance, due to massive bloating (with database 3 or 4 times larger than necessary).

To cure (or prevent) this behavior, you can trigger vacuum full operations, which put an exclusive lock on tables, and will lock both the Rudder interface and the reporting system for quite a long time.

Reclaiming space with locking, using VACUUM FULL

Manual vacuuming using the psql binary

# You can either use sudo to change owner to the postgres user, or use the rudder connection credentials.

# With sudo:
sudo -u postgres psql -d rudder

# With rudder credentials, it will ask the password in this case:
psql -u rudder -d rudder

# And then, when you are connected to the rudder database in the psql shell, trigger a vacuum:
rudder# VACUUM FULL ruddersysevents;
rudder# VACUUM FULL archivedruddersysevents;
rudder# VACUUM FULL nodecompliancelevels;

LDAP database reindexing

In very rare case, you will encounter some LDAP database entries that are not indexed and used during searches. In that case, OpenLDAP will output warnings to notify you that they should be.

LDAP database reindexing

# Stop OpenLDAP
systemctl stop rudder-slapd

# Reindex the databases
su -s /bin/sh rudder-slapd -c "/opt/rudder/sbin/slapindex -v -f /opt/rudder/etc/openldap/slapd.conf"

# Restart OpenLDAP
systemctl start rudder-slapd

Server backup and migration

We are only supporting backup and restore on the same exact version of Rudder. Performing a backup and restore on different version of Rudder can cause major breaks that cannot be undone easily, we strongly advise against doing so.

It is advised to backup frequently your Rudder installation in case of a major outage.

These procedures will explain how to backup your Rudder installation.

Backup

This backup procedure will operate on principal Rudder data sources:

The LDAP database
The PostgreSQL database
The configuration-repository folder
Rudder configuration
Rudder certificates

It will also backup the application logs.

How to backup a Rudder installation

# Where you want to put the backups
cd /tmp/backup

# Stop the rudder services
rudder agent disable
systemctl stop rudder-agent rudder-server rudder-relayd

# First, backup the LDAP database:
/opt/rudder/sbin/slapcat -l rudder-backup-$(date +%Y%m%d).ldif

# Second, the PostgreSQL database:
sudo -u postgres pg_dump -Fc rudder > rudder-backup-$(date +%Y%m%d).sql

# Third, backup the configuration repository:
tar -C /var/rudder -zcf rudder-backup-$(date +%Y%m%d).tar.gz configuration-repository/ cfengine-community/ppkeys/

# These may not exist
[ -d /var/rudder/packages ] && tar -C /var/rudder -zcf rudder-backup-packages-$(date +%Y%m%d).tar.gz packages/
[ -d /var/rudder/plugin-resources ] && tar -C /var/rudder -zcf rudder-backup-plugin-resources-$(date +%Y%m%d).tar.gz plugin-resources/
[ -d /opt/rudder/share/plugins ] && tar -C /opt/rudder -zcf rudder-backup-plugin-share-$(date +%Y%m%d).tar.gz share/plugins/

# Then backup Rudder configuration
tar -C /opt/rudder -zcf rudder-etc-backup-$(date +%Y%m%d).tar.gz etc/

# Backup the apache2 Rudder configuration file
tar -C /var/rudder -zcf rudder-apache2-backup-$(date +%Y%m%d).tar.gz /etc/apache2/sites-enabled/

# You will need to read the file /etc/apache2/sites-enabled/rudder.conf to get the location of the certificates and copy them
tar -C /var/rudder -zcf /tmp/rudder-certificates-backup-$(date +%Y%m%d%s).tar.gz /directory/file1.conf /directory/file2.conf

# Finally, backup the logs (if you need them)
tar -C /var/log -zcf rudder-log-backup-$(date +%Y%m%d).tar.gz rudder/

# Restart the services and agent
rudder agent enable
systemctl start rudder-agent rudder-server rudder-relayd

Restore

Of course, after a total machine crash, you will have your backups at hand, but what should you do with it?

Here is the restoration procedure:

How to restore a Rudder backup

# First, follow the standard installation procedure, this one assumes you have a working "blank"
# Rudder on the machine

# Disable Rudder agent
rudder agent disable

# Stop Rudder services
systemctl stop rudder-agent rudder-server rudder-relayd

# Replace apache2 Rudder configuration
tar -C /etc/apache2/sites-enabled/ rudder-apacahe2-backup-XXXXXXXX.tar.gz

# Place the certificates files where they originated according to /etc/apache2/sites-enabled/rudder.conf
tar -xvf rudder-certificates-backup-XXXXXX.tar.gz -C /tmp/

# Drop the OpenLDAP database
rm -rf /var/rudder/ldap/openldap-data/*.mdb

# Import your backups

# Go into the backup folder
cd /tmp/backup

# Configuration repository
tar -C /var/rudder -zxf rudder-backup-XXXXXXXX.tar.gz

# If they exist
tar -C /var/rudder -zxf rudder-backup-packages-XXXXXXXX.tar.gz
tar -C /var/rudder -zxf rudder-backup-plugin-resources-XXXXXXXX.tar.gz
tar -C /opt/rudder -zxf rudder-backup-plugin-share-XXXXXXXX.tar.gz

# LDAP backup
/opt/rudder/sbin/slapadd -l rudder-backup-XXXXXXXX.ldif

# Change ownership of files to rudder-slapd
chown -R rudder-slapd:rudder-slapd /var/rudder/ldap/openldap-data

# Restart PostgreSQL to ensure that no connection remains
systemctl restart postgresql

# PostgreSQL restore
sudo -u postgres dropdb -U postgres rudder
sudo -u postgres pg_restore -d postgres --create < rudder-backup-XXXXXXXX.sql

# Configuration backup
tar -C /opt/rudder -zxf rudder-etc-backup-XXXXXXXX.tar.gz

# Logs backups
tar -C /var/log -zxf rudder-log-backup-XXXXXXXX.tar.gz

# Enable Rudder agent
rudder agent enable

# And restart the machine or just Rudder:
systemctl start rudder-agent rudder-server

Then you need to trigger a full policy regeration in the status menu with the Regenerate all policies button.

Migration

To migrate a Rudder installation, just backup and restore your Rudder installation from one machine to another.

If your server address changed, you will also have to do the following on every node that is directly connected to it (managed nodes or relays):

Remove the server public key rm /var/rudder/cfengine-community/ppkeys/root-MD5=*.pub
Modify the policy server (rudder agent policy-server <new-policy-server>) with the new address, then you can force your nodes to send their inventory by running rudder agent inventory

Relay backup and migration

Backup

This backup procedure will operate on principal Rudder relay data.

It will also backup the application logs.

How to backup a Rudder installation

# Where you want to put the backups
cd /tmp/backup

# Data directory
tar -C /var/rudder -zcf rudder-backup-$(date +%Y%m%d).tar.gz cfengine-community/ppkeys/

# Then backup Rudder configuration
tar -C /opt/rudder -zcf rudder-etc-backup-$(date +%Y%m%d).tar.gz etc/

# Finally, backup the logs (if you need them)
tar -C /var/log -zcf rudder-log-backup-$(date +%Y%m%d).tar.gz rudder/

Restore

Of course, after a total machine crash, you will have your backups at hand, but what should you do with it?

Here is the restoration procedure:

How to restore a Rudder backup

# First, follow the standard installation procedure, this one assumes you have a working "blank"
# Rudder on the machine

# Disable Rudder agent
rudder agent disable

# Stop Rudder services
systemctl stop rudder-agent
# or depending on the system
service rudder-agent stop

# Import your backups

# Go into the backup folder
cd /tmp/backup

# Data repository
tar -C /var/rudder -zxf rudder-backup-XXXXXXXX.tar.gz

# Configuration backup
tar -C /opt/rudder -zxf rudder-etc-backup-XXXXXXXX.tar.gz

# Logs backups
tar -C /var/log -zxf rudder-log-backup-XXXXXXXX.tar.gz

# Enable Rudder agent
rudder agent enable

# Run the agent to configure authorizations on Postgres
rudder agent run

# And restart Rudder:
systemctl start rudder-agent
# or depending on the system
service rudder-agent restart

# Reenable the plugins
rudder package plugin enable-all

Migration

To migrate a Rudder relay installation, just backup and restore your Rudder relay from one machine to another.

If your relay address changed, you will also have to do the following on every node that is directly connected to it (managed nodes or relays):

Remove the relay public key rm /var/rudder/cfengine-community/ppkeys/{RELAY_UUID}-MD5=*.pub
Modify the policy server (rudder agent policy-server <new-policy-server>) with the new address, then you can force your nodes to send their inventory by running rudder agent inventory

Agent backup and migration

Backup

This backup procedure will operate on principal Rudder agent data.

How to backup a Rudder installation

# Where you want to put the backups
cd /tmp/backup

# Data directory
tar -C /var/rudder -zcf rudder-backup-$(date +%Y%m%d).tar.gz cfengine-community/ppkeys/

# Then backup Rudder configuration
tar -C /opt/rudder -zcf rudder-etc-backup-$(date +%Y%m%d).tar.gz etc/

Restore

Of course, after a total machine crash, you will have your backups at hand, but what should you do with it?

Here is the restoration procedure:

How to restore a Rudder backup

# First, follow the standard installation procedure, this one assumes you have a working "blank"
# Rudder on the machine

# Disable Rudder agent
rudder agent disable

# Stop Rudder services
systemctl stop rudder-agent
# or depending on the system
service rudder-agent stop

# Import your backups

# Go into the backup folder
cd /tmp/backup

# Data repository
tar -C /var/rudder -zxf rudder-backup-XXXXXXXX.tar.gz

# Configuration backup
tar -C /opt/rudder -zxf rudder-etc-backup-XXXXXXXX.tar.gz

# Enable Rudder agent
rudder agent enable

# Run the agent to configure authorizations on Postgres
rudder agent run

# And restart Rudder:
systemctl start rudder-agent
# or depending on the system
service rudder-agent restart

Migration

To migrate a Rudder agent installation, just backup and restore your agent from one machine to another.

Password management

Rudder uses a central file to manage the passwords that will be used by the application: /opt/rudder/etc/rudder-passwords.conf.

In the package, this file is initialized with default values, and during postinstall it will be updated with randomly generated passwords.

On the majority of cases, this is fine, however you might want to adjust the passwords manually. This is possible, just be cautious when editing the file, as if you corrupt it Rudder will not be able to operate correctly anymore and will spit numerous errors in the program logs.

As of now, this file follows a simple syntax: ELEMENT:password

You are able to configure three passwords in it: The OpenLDAP one, the PostgreSQL one and the authenticated WebDAV one.

If you edit this file, Rudder will take care of applying the new passwords everywhere it is needed, however it will restart the application automatically when finished, so take care of notifying users of potential downtime before editing passwords.

Here is a sample command to regenerate the WebDAV password with a random password, that is portable on all supported systems. Just change the RUDDER_WEBDAV_PASSWORD to any password file statement corresponding to the password you want to change.

sed -i s/RUDDER_WEBDAV_PASSWORD.*/RUDDER_WEBDAV_PASSWORD:$(dd if=/dev/urandom count=128 bs=1 2>&1 | md5sum | cut -b-20)/ /opt/rudder/etc/rudder-passwords.conf

← User management Performance tuning →