Unimus Core HA deploy — a how-to guide

Components of the cluster

  • Linux: the base operating system our cluster nodes run.
  • Corosync: provides cluster node membership and status information. It reports nodes joining or leaving the cluster and provides quorum.
  • Pacemaker: the cluster resource manager (CRM). It uses the information from Corosync to manage cluster resources and their availability.
  • pcs: a helper utility that interfaces with Corosync (corosync.conf) and Pacemaker (cib.xml) to manage a cluster.
  • Unimus Core: the service we want to make highly available.

Preparations

# run everything as root
sudo su

# update
apt-get update && apt-get upgrade -y

# install dependencies
apt-get install -y \
wget \
curl \
corosync \
pacemaker \
pcs

# install Unimus Core in unattended mode
wget https://unimus.net/install-unimus-core.sh && \
chmod +x install-unimus-core.sh && \
./install-unimus-core.sh -u

# setup Unimus Core config file
cat <<- "EOF" > /etc/unimus-core/unimus-core.properties
unimus.address = your_server_address_here
unimus.port = 5509
unimus.access.key = your_access_key
logging.file.count = 9
logging.file.size = 50
EOF
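
Before moving on, it can help to confirm that the placeholder values above were really replaced. The following is a minimal sketch and not part of Unimus: check_core_props is a hypothetical helper that reports any of the three connection keys that is missing, empty, or still starts with "your_". It assumes the simple "key = value" layout shown above.

```shell
# Hypothetical helper (not shipped with Unimus): print every required key that
# is missing, empty, or still set to a "your_..." placeholder.
check_core_props() {
    file="$1"
    status=0
    for key in unimus.address unimus.port unimus.access.key; do
        # Find the line whose name (text before "=") equals the key exactly,
        # then print its value with surrounding whitespace removed.
        value=$(awk -F= -v k="$key" '
            {
                name = $1
                gsub(/^[ \t]+|[ \t]+$/, "", name)
            }
            name == k {
                sub(/^[^=]*=/, "")
                gsub(/^[ \t]+|[ \t]+$/, "")
                print
                exit
            }
        ' "$file")
        case "$value" in
            ""|your_*)
                echo "needs attention: $key"
                status=1
                ;;
        esac
    done
    return "$status"
}

# Usage on a node:
#   check_core_props /etc/unimus-core/unimus-core.properties
```

The function returns non-zero when at least one key needs attention, so it can gate the rest of a provisioning script.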
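
# set the password for the hacluster user (do this on every cluster node)
CLUSTER_PWD="please_insert_strong_password_here"
echo "hacluster:$CLUSTER_PWD" | chpasswd

# on the node you run the cluster setup from, keep the same password in a variable
CLUSTER_PWD="please_insert_strong_password_here"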

# setup cluster (pcs 0.9 syntax; pcs 0.10 and later use "pcs host auth" and take the cluster name without --name)
pcs cluster auth test-core1.net.internal test-core2.net.internal -u hacluster -p "$CLUSTER_PWD" --force
pcs cluster setup --name unimus_core_cluster test-core1.net.internal test-core2.net.internal --force

# start cluster
pcs cluster enable --all
pcs cluster start --all
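
# two-node cluster: keep resources running when quorum is lost
pcs property set no-quorum-policy=ignore
# no fencing devices are configured in this guide
pcs property set stonith-enabled=false

# verify
pcs property list
pcs status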
root@test-core1:~# pcs status
Cluster name: unimus_core_cluster
Stack: corosync
Current DC: test-core1 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Tue Mar 4 01:08:51 2022
Last change: Tue Mar 4 01:04:49 2022 by hacluster via crmd on test-core1

2 nodes configured
0 resources configured

Online: [ test-core1 test-core2 ]

No resources


Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
root@test-core1:~#
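
The "Daemon Status" lines at the end of the output above are easy to overlook. As a minimal sketch, the hypothetical helper below reads pcs status text on stdin and prints any daemon that is not reported as active/enabled. It assumes the output layout shown in this guide; other pcs versions may format the section differently.

```shell
# Hypothetical helper: read `pcs status` text on stdin and print the name of
# every daemon in the "Daemon Status:" section that is not active/enabled.
daemons_not_ready() {
    awk '
        /^Daemon Status:/ { in_section = 1; next }
        in_section && NF == 2 && $2 != "active/enabled" {
            sub(/:$/, "", $1)   # "pcsd:" -> "pcsd"
            print $1
        }
    '
}

# Usage on a node:
#   pcs status | daemons_not_ready
```

Empty output means all three daemons are both running and enabled at boot.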

Troubleshooting

  • Cluster nodes should NOT be behind NAT. Running behind NAT is possible, but requires extra configuration not covered in this guide.
  • Use hostnames / FQDNs for cluster nodes, not IP addresses. If needed, add entries for the cluster nodes to /etc/hosts.
  • The hostnames / FQDNs must not resolve to 127.0.0.1 or any other loopback address. Corosync and Pacemaker require that the names used for clustering resolve to the actual IPs of the cluster members.
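
The loopback problem in particular is quick to check from a shell. The sketch below is only an illustration: is_loopback and check_node_name are hypothetical helpers, and the lookup relies on getent, which we assume is available (it is on glibc-based Linux systems). Debian-based installers often map a machine's own hostname to 127.0.1.1, which is exactly the case this catches.

```shell
# Hypothetical helper: succeed if the given address is a loopback address.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: resolve a node name the way the system resolver does
# and complain if it does not resolve or resolves to loopback.
check_node_name() {
    name="$1"
    # getent prints "ADDRESS NAME [ALIASES...]"; keep the first address only.
    addr=$(getent hosts "$name" | awk '{ print $1; exit }')
    if [ -z "$addr" ]; then
        echo "$name: does not resolve"
        return 1
    fi
    if is_loopback "$addr"; then
        echo "$name: resolves to loopback address $addr"
        return 1
    fi
    echo "$name: $addr"
}

# Usage, run on each cluster node:
#   check_node_name test-core1.net.internal
#   check_node_name test-core2.net.internal
```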

Creating a cluster resource

# disable Core autostart, Pacemaker will control this
systemctl stop unimus-core
systemctl disable unimus-core

# after a single failure on a node, treat that node as ineligible to run the service
pcs resource defaults migration-threshold=1

# setup our cluster resource
pcs resource create unimus_core systemd:unimus-core op start timeout="30s" op monitor interval="10s"

Monitoring cluster resources

pcs status resources
root@test-core1:~# pcs status resources
unimus_core (systemd:unimus-core): Started test-core1
root@test-core1:~#
# on core1
root@test-core1:~# systemctl status unimus-core
● unimus-core.service - Cluster Controlled unimus-core
Loaded: loaded (/etc/systemd/system/unimus-core.service; disabled; vendor preset: enabled)
Drop-In: /run/systemd/system/unimus-core.service.d
└─50-pacemaker.conf
Active: active (running)
...
root@test-core1:~#

# on core2
root@test-core2:~# systemctl status unimus-core
● unimus-core.service - Unimus Remote Core
Loaded: loaded (/etc/systemd/system/unimus-core.service; disabled; vendor preset: enabled)
Active: inactive (dead)
root@test-core2:~#
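
To use this check in a script, the node name can be pulled out of the pcs status resources output. This is a minimal sketch under the assumption that the line looks like the one shown above (resource id, type, "Started", node name); resource_node is a hypothetical helper and newer pcs versions may print the line differently.

```shell
# Hypothetical helper: read `pcs status resources` text on stdin and print the
# node on which the given resource id is reported as Started (nothing if it
# is not started anywhere).
resource_node() {
    awk -v res="$1" '$1 == res && $3 == "Started" { print $4; exit }'
}

# Usage on a node:
#   pcs status resources | resource_node unimus_core
```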

Live monitoring of cluster status
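
One option here is Pacemaker's own crm_mon, which keeps refreshing a cluster summary until interrupted. Where a plain loop around a pcs command is preferred, the following is a minimal sketch; repeat_cmd is a hypothetical helper, not a Pacemaker or pcs tool.

```shell
# Hypothetical helper: run a command COUNT times, waiting INTERVAL seconds
# between runs.
# Usage: repeat_cmd INTERVAL COUNT COMMAND [ARGS...]
repeat_cmd() {
    interval="$1"
    count="$2"
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"
        i=$((i + 1))
        # Sleep only between runs, not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Usage on a node: print the resource status every 2 seconds, 30 times.
#   repeat_cmd 2 30 pcs status resources
```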

Simulating a failure

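# force-stop the resource locally, bypassing the cluster (run on the node where it is currently running)
crm_resource --resource unimus_core --force-stop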
root@test-core1:~# pcs status resources
unimus_core (systemd:unimus-core): Started test-core2
root@test-core1:~#

# clear the recorded failure so the node can run the resource again
pcs resource cleanup unimus_core

# force a move to another cluster member
crm_resource --resource unimus_core --move

# clear any resource constraints we created
crm_resource --resource unimus_core --clear

# show the constraints that still apply to the resource
crm_resource --resource unimus_core --constraints
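
A forced move works by adding a location constraint to the cluster configuration, and Pacemaker typically gives these constraints ids starting with cli-ban- or cli-prefer-. If one is left behind, the resource may refuse to return to a node later. The sketch below is a hypothetical helper that scans constraint listings on stdin for such ids. It assumes that pcs constraint --full prints the ids, which is the case for the pcs versions we are aware of.

```shell
# Hypothetical helper: read constraint listings on stdin and print the ids of
# constraints that look like leftovers from a manual move or ban.
cli_constraint_ids() {
    grep -oE 'cli-(ban|prefer)-[A-Za-z0-9_.-]+' | sort -u
}

# Usage on a node (empty output means nothing was left behind):
#   pcs constraint --full | cli_constraint_ids
```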

Final words
