Administering a large number of servers can be quite tiresome without central configuration management. This article gives a first introduction to the configuration management tool Puppet.
Introduction
In our daily work at the Open Source Support Center we maintain a large number of servers. Managing larger clusters or setups means maintaining dozens of machines with an almost identical configuration and only slight variations, if any. Without central configuration management, making small changes to the configuration would mean repeating the same step on all machines. This is where Puppet comes into play.
Like many configuration management tools, Puppet uses a central server which manages the configuration. The clients query the server on a regular basis for new configuration via an encrypted connection. If a new configuration is found, it is applied as the server instructs: the client fetches new files, adjusts permissions, starts services and executes commands, whatever the server says. The advantages are obvious:
- Each configuration change is done only once, regardless of the actual number of maintained servers. Unnecessary – and pretty boring – repetition is avoided, lucky us!
- The configuration is streamlined for all machines, which makes it much easier to maintain.
- A central infrastructure makes it easier to quickly get an overview of the setup – “running around” is not necessary anymore.
- Last but not least, a central configuration tree enables you to put your configuration under simple version control: for example, restoring the configuration tagged “PRE-UPDATE” on all machines of an entire setup only takes a couple of commands!
Technical workflow
Puppet consists of a central server, called “Puppet Master”, and the clients, called “Nodes”. The nodes query the master for the current configuration. The master responds with a list of configuration and management items: files, services which have to be running, commands which need to be executed, and so on – the possibilities are practically endless:
- The master can hand over files which the node copies to a defined place, if they do not already exist there or differ from the master's version.
- The node is asked to check certain file and directory permissions and to correct them if necessary.
- Depending upon the operating system, the node checks the state of services and starts or stops them. It can also check whether packages are installed and up to date.
- The master can force the node to execute arbitrary commands.
- etc.
Of course, in principle all tasks could be fulfilled by handing over files from the master to the client. However, in more complex setups this approach is hard to arrange and does not simplify the setup. Puppet’s strength is that it lets you describe abstract system tasks (restart services, ensure packages are installed, add users, etc.), regardless of which files actually change in the background. You can even use the same statement in Puppet to configure different versions of Linux or Unix.
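To make the idea of abstract tasks more concrete, here is a minimal sketch of how such tasks might be declared. The class name `base_example` and the user `deploy` are made up for illustration, and the service names in the selector are assumptions about typical distributions, so check them against your own systems.

```puppet
# Illustrative sketch only: class name and user name are hypothetical.
class base_example {
  # Pick the service name by operating system, using the
  # "operatingsystem" fact reported by the node (assumed available).
  $ssh_service = $operatingsystem ? {
    "Debian" => "ssh",
    "Ubuntu" => "ssh",
    default  => "sshd",
  }

  # Install the package with whatever package manager the node uses.
  package { "openssh-server":
    ensure => installed,
  }

  # Make sure a user account exists, including a home directory.
  user { "deploy":
    ensure     => present,
    managehome => true,
  }

  # Keep the service running once the package is in place.
  service { $ssh_service:
    ensure  => running,
    require => Package["openssh-server"],
  }
}
```

None of these declarations says how to install a package or which file defines a user. The node works that out for its own platform.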
Installation
First, you need the master, the center of all the configuration you want to manage:

    apt-get install puppetmaster

Puppet expects that all machines in the network have fully qualified domain names (FQDNs) – but that should be the case anyway in a well-maintained network.
Other machines become a node by installing the Puppet client:

    apt-get install puppet
Puppet, main configuration
The Puppet nodes do not need to be configured: they look for a host named puppet on the local network. As long as that name points to the master you do not have to do anything else.
Since the master provides files to the nodes, the internal file server must be configured accordingly. There are different solutions for the internal file server, depending on the needs of your setup. For example, it might suit your setup better to store all files you provide to the nodes in one place, and the actual configuration you provide to the nodes somewhere else. However, in our example we keep the files and the configuration for the nodes close together, as outlined in Puppet’s Best Practice Guide and in the Module Configuration part of the Puppet documentation. Thus, it is enough to change the file /etc/puppet/fileserver.conf to:
    [modules]
        allow 192.168.0.1/24
        allow *.credativ.de
Configuration of the configuration – Modules
Puppet’s way of managing configuration is to use sets of tasks grouped by topic. For example, all tasks related to SSH should go into the module “ssh”, while all tasks related to Apache should be placed in the module “apache” and so on. These sets of tasks are called “Modules” and are the core of Puppet – in a perfect Puppet setup everything is defined in modules! We will explain the structure of an SSH module to highlight the basics and ideas behind Puppet’s modules. We will also try to stay close to the Best Practice Guide to make it easier to check back against the Puppet documentation.
Please note, however, that this is only an example: in a real-world setup the SSH configuration would be a bit more dynamic, but we have focused on simple and easy-to-understand methods.
The SSH module
We have the following requirements:
- The package openssh-server must be installed in its newest version.
- Each node’s sshd_config file has to be the same as the one saved on the master.
- In the event that the sshd_config is changed on any node, the sshd service should be restarted.
- The user credativ needs to have certain files in his/her directory $HOME/.ssh.
To comply with these requirements we start by creating some necessary paths:
    mkdir -p /etc/puppet/modules/ssh/manifests
    mkdir -p /etc/puppet/modules/ssh/files
The directory “manifests” contains the actual configuration instructions of the module and the directory “files” provides the files we hand over to the clients.
The instructions themselves are written down in init.pp in the “manifests” directory. The instructions that fulfil requirements 1 to 4 are grouped in a so-called “class”. For each task the class contains one subsection: a resource declaration of a certain type. In our case we have four declarations, one for each requirement, using the three types package, file and service:
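    class ssh {
        # Requirement 1: keep the SSH server package installed and current.
        package { "openssh-server":
            ensure => latest,
        }

        # Requirement 2: sshd_config must match the copy on the master.
        file { "/etc/ssh/sshd_config":
            owner  => root,
            group  => root,
            mode   => "0644",
            source => "puppet:///ssh/sshd_config",
        }

        # Requirement 3: keep sshd running, restart it when the file changes.
        service { "ssh":
            ensure     => running,
            hasrestart => true,
            subscribe  => File["/etc/ssh/sshd_config"],
        }

        # Requirement 4: copy a whole directory into the user's home.
        file { "/home/credativ/.ssh":
            path    => "/home/credativ/.ssh",
            owner   => "credativ",
            group   => "credativ",
            mode    => "0600",
            recurse => true,
            source  => "puppet:///ssh/ssh",
            ensure  => directory,
        }
    }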
Each declaration is a separate task and triggers a different action on the node:
package
Here we make sure that the package openssh-server is installed in the newest version.
file
A file on the node is compared with the version on the server and overwritten if necessary. Ownership and permissions are adjusted as well.
service
Well, as the name says, this deals with services: in our case the service must be running on the node. Also, in case the file /etc/ssh/sshd_config is modified, the service is restarted automatically.
file
Here we have the file type again, but this time we compare an entire directory rather than a single file.
As mentioned above, the files and directories that the master provides to the nodes must be available in the directory /etc/puppet/modules/ssh/files/.
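The restart relationship between the configuration file and the service can also be expressed from the file's side. The following sketch uses the `notify` metaparameter instead of `subscribe`. It is meant as an alternative to the two corresponding declarations in the class, not as an addition, since declaring the same resource twice would be an error.

```puppet
# Sketch only: replaces the file and service declarations of the ssh class.
file { "/etc/ssh/sshd_config":
  owner  => root,
  group  => root,
  mode   => "0644",
  source => "puppet:///ssh/sshd_config",
  # Tell the service to restart whenever this file is changed.
  notify => Service["ssh"],
}

service { "ssh":
  ensure     => running,
  hasrestart => true,
}
```

Both forms describe the same dependency. Which one reads better is largely a matter of taste: `subscribe` keeps the trigger next to the service, `notify` keeps it next to the file.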
Nodes and modules
We now have three parts: the master, the nodes and the modules. The next step is to tell the master which nodes are related to which modules. First, announce the module to the master by adding it to /etc/puppet/manifests/modules.pp:
import "ssh"
Next, you need to modify /etc/puppet/manifests/nodes.pp. This file specifies which modules are loaded for which node, and which modules are loaded by default if a node does not have an entry of its own. The entries for the nodes support inheritance.
So, for example, to have the module “rsyslog” ready for all nodes but the module “ssh” only ready for the node “external” you need the following entry:
    node default {
        include rsyslog
    }

    node 'external' inherits default {
        include ssh
    }
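For the master to read these two files at all, they have to be reachable from its main manifest. The sketch below assumes that your installation uses the default main manifest /etc/puppet/manifests/site.pp and that it does not yet contain these lines. If your setup is organised differently, adapt the paths accordingly.

```puppet
# /etc/puppet/manifests/site.pp (assumed default location of the main manifest)

# Load the list of modules and the node definitions, which live in the
# same directory as this file.
import "modules.pp"
import "nodes.pp"
```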
Puppet is now configured!
Certificates – secured communication between nodes and master
As mentioned above, the communication between master and node is encrypted. But that implies you have to verify the partners at least once. This can be done after a node queries the master for the first time. Whenever the master is queried by an unknown node it does not provide the default configuration but instead puts the node on a waiting list. You can check the waiting list with the command:

    # puppetca --list
To verify a node and incorporate it into the Puppet system you need to sign its certificate request:

    # puppetca --sign external.example.com

The entire process is explained in more detail in the Puppet documentation.
Closing words
The example introduced in this article is very simple – as noted above, a real-world example would be more complex and dynamic. However, it is a good way to start with Puppet, and the documentation linked throughout this article will help the willing reader to dive deeper into the components of Puppet.
We here at credativ’s Open Source Support Center have gained considerable experience with Puppet in recent years and really like the framework. Also, in our day-to-day support and consulting work we see the market growing as more and more customers are interested in the framework. Right now, Puppet is in the fast lane and it will be interesting to see how more established solutions like cfengine will react to this competition.
This post was originally written by Roland Wolters.