Efficient Proxmox Cluster Deployment through Automation with Ansible
Manually setting up and managing servers is usually time-consuming, error-prone, and difficult to scale. This becomes especially evident during large-scale rollouts, when building complex infrastructures, or during the migration from other virtualization environments. In such cases, traditional manual processes quickly reach their limits. Consistent automation offers an effective and sustainable solution to these challenges.
To enable fully automated deployment of Proxmox clusters, our team member, known in the open-source community under the alias gyptazy, has developed a dedicated Ansible module called proxmox_cluster. This module handles all the necessary steps to initialize a Proxmox cluster and add additional nodes. It has been officially included in the upstream Ansible Community Proxmox collection and is available for installation via Ansible Galaxy starting with version 1.1.0. As a result, the manual effort required for cluster deployment is significantly reduced. Further insights can be found in his blog post titled “How My BoxyBSD Project Boosted the Proxmox Ecosystem”.
By adopting this solution, not only can valuable time be saved, but a solid foundation for scalable and low-maintenance infrastructure is also established. Unlike fragile task-based approaches that often rely on Ansible’s shell or command modules, this solution leverages the full potential of the Proxmox API through a dedicated module. As a result, it can be executed in various scopes and does not require SSH access to the target systems.
This automated approach makes it possible to deploy complex setups efficiently while laying the groundwork for stable and future-proof IT environments. Such environments can be extended at a later stage and are built according to a consistent and repeatable structure.
Benefits
Using the proxmox_cluster module for Proxmox cluster deployment brings several key advantages to modern IT environments. The focus lies on secure, flexible, and scalable interaction with the Proxmox API, improved error handling, and simplified integration across various use cases:
- Use of the native Proxmox API
- Full support for the Proxmox authentication system
- API Token Authentication support (see the sketch after this list)
- No SSH access required
- Usable in multiple scopes:
  - From a dedicated deployment host
  - From a local system
  - Within the context of the target system itself
- Improved error handling through API abstraction
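To illustrate the token-based authentication mentioned above, the following is a minimal sketch of creating a cluster with an API token instead of a password (the token ID and the secret variable shown here are placeholders, not values from this article):
- name: Create a Proxmox VE Cluster using an API token
  community.proxmox.proxmox_cluster:
    state: present
    api_host: proxmoxhost
    api_user: root@pam
    api_token_id: automation
    api_token_secret: "{{ vault_proxmox_token_secret }}"
    cluster_name: "devcluster"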
Ansible Proxmox Module: proxmox_cluster
The newly added proxmox_cluster module in Ansible significantly simplifies the automated provisioning of Proxmox VE clusters. With just a single task, it enables the seamless creation of a complete cluster, reducing complexity and manual effort to a minimum.
Creating a Cluster
Creating a cluster now requires only a single task in Ansible, using the proxmox_cluster module:
- name: Create a Proxmox VE Cluster
  community.proxmox.proxmox_cluster:
    state: present
    api_host: proxmoxhost
    api_user: root@pam
    api_password: password123
    api_ssl_verify: false
    link0: 10.10.1.1
    link1: 10.10.2.1
    cluster_name: "devcluster"
Afterwards, the cluster is created and additional Proxmox VE nodes can join it.
Joining a Cluster
Additional nodes can now also join the cluster using a single task. When combined with the use of a dynamic inventory, it becomes easy to iterate over a list of nodes from a defined group and add them to the cluster within a loop. This approach enables the rapid deployment of larger Proxmox clusters in an efficient and scalable manner.
- name: Join a Proxmox VE Cluster
  community.proxmox.proxmox_cluster:
    state: present
    api_host: proxmoxhost
    api_user: root@pam
    api_password: password123
    master_ip: "{{ primary_node }}"
    fingerprint: "{{ cluster_fingerprint }}"
    cluster_name: "devcluster"
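As a sketch of the loop-based approach described above, assuming an inventory group named proxmox_nodes that contains all cluster nodes (the group name is an assumption for illustration), joining the remaining nodes could look like this:
- name: Join all remaining nodes to the cluster
  community.proxmox.proxmox_cluster:
    state: present
    api_host: "{{ item }}"
    api_user: root@pam
    api_password: password123
    master_ip: "{{ primary_node }}"
    fingerprint: "{{ cluster_fingerprint }}"
    cluster_name: "devcluster"
  loop: "{{ groups['proxmox_nodes'] | difference([primary_node]) }}"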
Cluster Join Information
In order for a node to join a Proxmox cluster, the cluster’s join information is generally required. To avoid defining this information manually for each individual cluster, this step can also be automated. As part of this feature, a new module called proxmox_cluster_join_info has been introduced. It allows the necessary data to be retrieved automatically via the Proxmox API and made available for further use in the automation process.
- name: List existing Proxmox VE cluster join information
  community.proxmox.proxmox_cluster_join_info:
    api_host: proxmox1
    api_user: root@pam
    api_password: "{{ password | default(omit) }}"
    api_token_id: "{{ token_id | default(omit) }}"
    api_token_secret: "{{ token_secret | default(omit) }}"
  register: proxmox_cluster_join
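The registered result can then be inspected or reused by subsequent tasks; as a simple illustration (the exact structure of the returned data is described in the collection documentation):
- name: Show the retrieved cluster join information
  ansible.builtin.debug:
    var: proxmox_cluster_join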
Conclusion
While automation in the context of virtualization technologies is often focused on the provisioning of guest systems or virtual machines (VMs), this approach demonstrates that automation can be applied at a much deeper level within the underlying infrastructure. It is also possible to fully automate scenarios in which nodes are initially deployed using a customer-specific image with Proxmox VE preinstalled, and then proceed to automatically create the cluster.
As an official Proxmox partner, we are happy to support you in implementing a comprehensive automation strategy tailored to your environment and based on Proxmox products. You can contact us at any time!
Patroni is a PostgreSQL high availability solution with a focus on containers and Kubernetes. Until recently, the available Debian packages had to be configured manually and did not integrate well with the rest of the distribution. For the upcoming Debian 10 “Buster” release, the Patroni packages have been integrated into Debian’s standard PostgreSQL framework by credativ. They now allow for an easy setup of Patroni clusters on Debian or Ubuntu.
Patroni employs a “Distributed Consensus Store” (DCS) like Etcd, Consul or Zookeeper in order to reliably run a leader election and orchestrate automatic failover. It further allows for scheduled switchovers and easy cluster-wide changes to the configuration. Finally, it provides a REST interface that can be used together with HAProxy in order to build a load balancing solution. Due to these advantages, Patroni has gradually replaced Pacemaker as the go-to open-source project for PostgreSQL high availability.
However, many of our customers run PostgreSQL on Debian or Ubuntu systems, and so far Patroni did not integrate well with them. For example, it does not use the postgresql-common framework, and its instances were not displayed in the pg_lsclusters output as usual.
Integration into Debian
In a collaboration with Patroni lead developer Alexander Kukushkin from Zalando, the Debian Patroni package has been integrated into the postgresql-common framework to a large extent over the last months. This required changes both in Patroni itself and in additional programs shipped with the Debian package. The current version 1.5.5 of Patroni contains all these changes and is now available in Debian “Buster” (testing), making it easy to set up Patroni clusters.
The packages are also available on apt.postgresql.org and thus installable on Debian 9 “Stretch” and Ubuntu 18.04 “Bionic Beaver” LTS for any PostgreSQL version from 9.4 to 11.
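For reference, a minimal sketch of enabling the apt.postgresql.org repository on Debian 9 “Stretch” could look like the following (the release name in the repository line has to be adapted to the distribution in use):
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt stretch-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update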
The most important part of the integration is the automatic generation of a suitable Patroni configuration with the pg_createconfig_patroni command. It is invoked similarly to pg_createcluster, with the desired PostgreSQL major version and the instance name as parameters:
pg_createconfig_patroni 11 test
This invocation creates a file /etc/patroni/11-test.yml, using the DCS configuration from /etc/patroni/dcs.yml, which has to be adjusted according to the local setup. The rest of the configuration is taken from the template /etc/patroni/config.yml.in, which is usable as-is but can be customized by the user according to their needs. Afterwards, the Patroni instance is started via systemd, similar to regular PostgreSQL instances:
systemctl start patroni@11-test
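The DCS configuration in /etc/patroni/dcs.yml depends entirely on the local environment; as a minimal sketch, assuming a single etcd endpoint reachable at the hypothetical address 10.0.3.10:2379, it could look like this:
etcd:
  host: 10.0.3.10:2379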
A simple 3-node Patroni cluster can be created and started with the following few commands, where the nodes pg1, pg2 and pg3 are considered to be hostnames and the local file dcs.yml contains the DCS configuration:
for i in pg1 pg2 pg3; do ssh $i 'apt -y install postgresql-common'; done
for i in pg1 pg2 pg3; do ssh $i 'sed -i "s/^#create_main_cluster = true/create_main_cluster = false/" /etc/postgresql-common/createcluster.conf'; done
for i in pg1 pg2 pg3; do ssh $i 'apt -y install patroni postgresql'; done
for i in pg1 pg2 pg3; do scp ./dcs.yml $i:/etc/patroni; done
for i in pg1 pg2 pg3; do ssh $i 'pg_createconfig_patroni 11 test && systemctl start patroni@11-test'; done
Afterwards, you can get the state of the Patroni cluster via
ssh pg1 'patronictl -c /etc/patroni/11-test.yml list'
+---------+--------+------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+------------+--------+---------+----+-----------+
| 11-test | pg1 | 10.0.3.111 | Leader | running | 1 | |
| 11-test | pg2 | 10.0.3.41 | | stopped | | unknown |
| 11-test | pg3 | 10.0.3.46 | | stopped | | unknown |
+---------+--------+------------+--------+---------+----+-----------+
Leader election has happened and pg1 has become the primary. It created its instance with the Debian-specific pg_createcluster_patroni program that runs pg_createcluster in the background. Then the two other nodes clone from the leader using the pg_clonecluster_patroni program, which sets up an instance using pg_createcluster and then runs pg_basebackup from the primary. After that, all nodes are up and running:
+---------+--------+------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+------------+--------+---------+----+-----------+
| 11-test | pg1 | 10.0.3.111 | Leader | running | 1 | 0 |
| 11-test | pg2 | 10.0.3.41 | | running | 1 | 0 |
| 11-test | pg3 | 10.0.3.46 | | running | 1 | 0 |
+---------+--------+------------+--------+---------+----+-----------+
The well-known Debian postgresql-common commands work as well:
ssh pg1 'pg_lsclusters'
Ver Cluster Port Status Owner    Data directory              Log file
11  test    5432 online postgres /var/lib/postgresql/11/test /var/log/postgresql/postgresql-11-test.log
Failover Behaviour
If the primary is abruptly shut down, its leader token will expire after a while, and Patroni will eventually initiate a failover and a new leader election:
+---------+--------+-----------+------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+-----------+------+---------+----+-----------+
| 11-test | pg2 | 10.0.3.41 | | running | 1 | 0 |
| 11-test | pg3 | 10.0.3.46 | | running | 1 | 0 |
+---------+--------+-----------+------+---------+----+-----------+
[...]
+---------+--------+-----------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+-----------+--------+---------+----+-----------+
| 11-test | pg2 | 10.0.3.41 | Leader | running | 2 | 0 |
| 11-test | pg3 | 10.0.3.46 | | running | 1 | 0 |
+---------+--------+-----------+--------+---------+----+-----------+
[...]
+---------+--------+-----------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+-----------+--------+---------+----+-----------+
| 11-test | pg2 | 10.0.3.41 | Leader | running | 2 | 0 |
| 11-test | pg3 | 10.0.3.46 | | running | 2 | 0 |
+---------+--------+-----------+--------+---------+----+-----------+
The old primary will rejoin the cluster as a standby once it is restarted:
+---------+--------+------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+------------+--------+---------+----+-----------+
| 11-test | pg1 | 10.0.3.111 | | running | | unknown |
| 11-test | pg2 | 10.0.3.41 | Leader | running | 2 | 0 |
| 11-test | pg3 | 10.0.3.46 | | running | 2 | 0 |
+---------+--------+------------+--------+---------+----+-----------+
[...]
+---------+--------+------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+---------+--------+------------+--------+---------+----+-----------+
| 11-test | pg1 | 10.0.3.111 | | running | 2 | 0 |
| 11-test | pg2 | 10.0.3.41 | Leader | running | 2 | 0 |
| 11-test | pg3 | 10.0.3.46 | | running | 2 | 0 |
+---------+--------+------------+--------+---------+----+-----------+
If a clean rejoin is not possible due to additional transactions on the old timeline, the old primary gets re-cloned from the current leader. In case the data is too large for a quick re-clone, pg_rewind can be used. In this case, a password needs to be set for the postgres user, and regular database connections (as opposed to replication connections) need to be allowed between the cluster nodes.
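As a minimal sketch, assuming the superuser password has already been set on the instances, the relevant options in the Patroni configuration could look like the following (the password is a placeholder; in this Debian setup such options would typically go into the template or the generated /etc/patroni/11-test.yml):
postgresql:
  use_pg_rewind: true
  authentication:
    superuser:
      username: postgres
      password: secretpassword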
Creation of Additional Instances
It is also possible to create further clusters with pg_createconfig_patroni. One can either assign a PostgreSQL port explicitly via the --port option, or let pg_createconfig_patroni assign the next free port, as is known from pg_createcluster:
for i in pg1 pg2 pg3; do ssh $i 'pg_createconfig_patroni 11 test2 && systemctl start patroni@11-test2'; done
ssh pg1 'patronictl -c /etc/patroni/11-test2.yml list'
+----------+--------+-----------------+--------+---------+----+-----------+
| Cluster | Member | Host | Role | State | TL | Lag in MB |
+----------+--------+-----------------+--------+---------+----+-----------+
| 11-test2 | pg1 | 10.0.3.111:5433 | Leader | running | 1 | 0 |
| 11-test2 | pg2 | 10.0.3.41:5433 | | running | 1 | 0 |
| 11-test2 | pg3 | 10.0.3.46:5433 | | running | 1 | 0 |
+----------+--------+-----------------+--------+---------+----+-----------+
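If a specific port is preferred, it can also be set explicitly via the --port option mentioned above; as a sketch (the instance name test3 and port 5434 are placeholders, and the exact argument order may differ):
pg_createconfig_patroni 11 test3 --port 5434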
Ansible Playbook
In order to easily deploy a 3-node Patroni cluster, we have created an Ansible playbook on GitHub. It automates the installation and configuration of PostgreSQL and Patroni on the three nodes, as well as the DCS server on a fourth node.
Questions and Help
Do you have any questions or need help? Feel free to write to info@credativ.com.