In today's IT landscape, automation is seen by many as the route forward. From consistent builds of systems, servers and desktops to the reduction of IT costs, doing things in an automated fashion is the way to go.
Because of this push for automation, there is a myriad of choices when it comes to tools for automating things in IT.
Puppet, Packer, Chef, Ansible, Jenkins, Foreman, Rundeck, Bash scripts: there are simply so many ways to automate entire systems or small tasks.
This article adds one more to the mix, and readers may ask: what's the point? You can do X, Y or Z with <enter application here>, and it is hard to mount a convincing defence against that and tell you why you are wrong. All of the tools I've listed above do one core thing very well, but used right they can do many other things as well.
As an example, Ansible is amazing at deploying applications, but managed right it can be used for configuration management checking too. Likewise, Puppet is amazing at ensuring your configurations are kept consistent, but it can also install applications.
So with all these tools out there, where does rudder.io fit in?
Rudder is all about a specific type of scenario...
You have automation pipelines in place and unit testing on those pipelines. You're happy that the pipelines have been tested in dev, integration and preproduction, so you roll out the pipeline to your production environment and all looks good.
A few weeks later there is a problem in production, and after lots of troubleshooting it turns out that the last change caused a service not associated with the pipeline to fail.
This is where Rudder comes in: it's a tool for compliance and continuous audit. Rudder is a solution for checking that your infrastructure is running as expected.
Managing your infrastructure using automation, no matter which tool you use, is the right way to go. The ongoing issue, despite how much testing you do, is this: how do you know your infrastructure is working as expected after the automation is applied?
The answer is you need some form of audit.
Again, there will be people reading this who will comment that the tool you use and your testing should pick this up, or that you should have good monitoring in place to catch these things. I have no pushback at all, other than that the reality of life for many systems is that the dev environments don't match production 100%. While a preprod that does match may be in place, the automation will usually be focused on the item it's automating, and not all monitoring solutions are built equal.
Also, this is just another tool, and the more things you have telling you there is a problem, the better, depending on how they are telling you.
So back to Rudder.
Rudder is based on an agent-server model. It supports most OSes out there, presents itself through a web interface and has API endpoints available.
The website states the following:
A simple framework allows you to extend the built-in rules to implement specific low-level configuration patterns, however complex they may be, using simple building blocks (ensure package installed in version X, ensure file content, ensure line in file, etc.). A graphical builder lowers the technical level required to use this.
Each policy can be independently set to be automatically checked or enforced on a policy or host level. In Enforce mode, each remediation action is recorded, showing the value of these invisible fixes.
Rudder works on almost every kind of device, so you’ll be managing physical and virtual servers in the data center, cloud instances, and embedded IoT devices in the same way.
Rudder is designed for critical environments where a security breach can mean more than a blip in the sales stats. Built-in features include change requests, audit logs, and strong authentication.
Rudder relies on an agent that needs to be installed on all hosts to audit. The agent is very lightweight (10 to 20 MB of RAM at peak) and blazingly fast (it’s written in C and takes less than 10 seconds to verify 100 rules). Installation is self-contained, via a single package, and can auto-update to limit agent management burden.
Rudder is a true and professional open source solution—the team behind Rudder doesn’t believe in the dual-speed licensing approach that makes you reinstall everything and promotes open source as little more than a “demo version.”
So how do we get this all set up?
My server runs Ubuntu 20.04 and monitors 11 endpoints on a single LAN, with the following specification:
- 4 GB RAM
- 50 GB HDD
- 1 CPU, 2 cores
Install a root server
I set up Rudder on a small test Ubuntu 20.04 VM running on Proxmox, which would be used to check 10 machines.
The Rudder website has very clear details on installing the root server, where the install is run from, and I'd strongly suggest using the links here for the most up-to-date instructions.
Each official package is signed with our GPG signature. To ensure the packages you will install are official builds and have not been altered, import our key into apt using the following command:
wget --quiet -O- "https://repository.rudder.io/apt/rudder_apt_key.pub" | sudo apt-key add -
The Rudder key fingerprint is:
pub 4096R/474A19E8 2011-12-15 Rudder Project (release key) <email@example.com>
Key fingerprint = 7C16 9817 7904 212D D58C B4D1 9322 C330 474A 19E8
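If you want to compare the downloaded key against the fingerprint above before trusting it, something along these lines could work. It is only a sketch: the function names and the /tmp path in the usage note are made up for illustration, and it assumes a GnuPG 2.2 release recent enough to support --show-keys.

```shell
# Illustrative helpers, not part of Rudder or apt.
EXPECTED_FPR="7C16 9817 7904 212D D58C B4D1 9322 C330 474A 19E8"

# Strip whitespace and upper-case, so fingerprints compare regardless of spacing.
normalize_fpr() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Succeeds when two fingerprints are the same once normalized.
fingerprints_match() {
  [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

# Print the first fingerprint found in a key file, without importing the key.
key_file_fpr() {
  gpg --show-keys --with-colons "$1" | awk -F: '$1 == "fpr" { print $10; exit }'
}

# verify_key_file FILE EXPECTED: succeeds when FILE carries the expected fingerprint.
verify_key_file() {
  fingerprints_match "$(key_file_fpr "$1")" "$2"
}
```

For example, after saving the key with wget -O /tmp/rudder_apt_key.pub "https://repository.rudder.io/apt/rudder_apt_key.pub", running verify_key_file /tmp/rudder_apt_key.pub "$EXPECTED_FPR" should succeed before the file is handed to apt-key add.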
Add Rudder’s package repository:
# If lsb_release is not installed on your machine, replace $(lsb_release -cs) with your distribution codename.
# Ex:
# stretch for Debian 9
# bionic for Ubuntu 18.04 LTS
echo "deb http://repository.rudder.io/apt/6.1/ $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/rudder.list
Update your local package database to retrieve the list of packages available on our repository:
sudo apt-get update
To begin the installation, you should simply install the rudder-server-root metapackage, which will install the required components:
sudo apt-get install rudder-server-root
I did have issues with the metapackage. At the time of writing the highest supported version was Ubuntu 18.04, and there was one specific package it was having problems installing. I installed that package separately from the command line, then ran the above command again and it all installed OK.
Now the installation is complete you need to create a first user account. The easiest way is to use the dedicated command to create a local admin user:
# Replace USERNAME by the user you want to create
sudo rudder server create-user -u USERNAME
It will ask you for a password twice and will create the user.
Once all these steps have been completed, use your web browser to go to the server URL. Use your first user credentials to connect.
Now you should go to Settings → General → Allowed Networks and check that the networks listed there properly include all your nodes' network addresses. By default this will contain your server’s attached networks.
In my case this had the server's IP address in it.
I changed the field so that it covered the network my nodes are on.
Save this setting and your server is now set up.
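To sanity-check an Allowed Networks entry, a small shell sketch like the one below can confirm that a node's address falls inside a given network. The function names and the 192.168.100.0/24 range are made up for illustration (only the server address 192.168.100.12 appears in this article), and it handles IPv4 only.

```shell
# Illustrative helpers, not part of Rudder. IPv4 only.

# Convert a dotted-quad address such as 192.168.100.12 into a single integer.
ip_to_int() {
  local IFS=.
  # Unquoted on purpose: split the address into its four octets.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr IP NETWORK/PREFIX: succeeds if IP lies inside the network.
ip_in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}" mask
  # Build a 32-bit mask with the top PREFIX bits set.
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

With the hypothetical range above, ip_in_cidr 192.168.100.45 192.168.100.0/24 succeeds, while an address on 192.168.101.x does not and would need its own entry.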
However, your dashboard will be empty, so we need to add agents to the clients which will be scanned by Rudder.
Installing the agents is very scriptable from Bash (very meta: automation for the tool checking your automation). The basic install for an Ubuntu 20.04 server would be:
sudo wget --quiet -O- "https://repository.rudder.io/apt/rudder_apt_key.pub" | sudo apt-key add -
sudo nano /etc/apt/sources.list.d/rudder.list
Add the following text
deb http://repository.rudder.io/apt/6.1/ focal main
sudo apt update
sudo apt-get install rudder-agent
Point the agent at the server which was just installed.
sudo rudder agent policy-server 192.168.100.12
Run an inventory check
sudo rudder agent inventory
sudo rudder agent run
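The agent steps above could be wrapped into one function along these lines. This is a sketch rather than an official installer: the function names and the RUDDER_VERSION variable are made up, it writes the repository file with tee instead of an editor, and it adds -y to apt-get so it can run unattended.

```shell
# Illustrative wrapper around the commands shown above; names are not part of Rudder.
RUDDER_VERSION="${RUDDER_VERSION:-6.1}"

# Build the apt source line for a given Rudder version and distribution codename.
rudder_repo_line() {
  printf 'deb http://repository.rudder.io/apt/%s/ %s main\n' "$1" "$2"
}

# install_rudder_agent POLICY_SERVER: install the agent and point it at the server.
install_rudder_agent() {
  local policy_server="$1" codename
  codename="$(lsb_release -cs)" || return 1
  # Trust the Rudder signing key, then register the repository.
  wget --quiet -O- "https://repository.rudder.io/apt/rudder_apt_key.pub" | sudo apt-key add - || return 1
  rudder_repo_line "$RUDDER_VERSION" "$codename" | sudo tee /etc/apt/sources.list.d/rudder.list >/dev/null || return 1
  sudo apt-get update || return 1
  sudo apt-get install -y rudder-agent || return 1
  # Register with the policy server, send an inventory and do a first run.
  sudo rudder agent policy-server "$policy_server" || return 1
  sudo rudder agent inventory || return 1
  sudo rudder agent run
}
```

On each node this would be called as install_rudder_agent 192.168.100.12, using the server address from the example above.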
The site instructions can be found here
Once you have run the agent setup on the servers, the nodes need to be accepted, which fits in well with the whole audited solution.
Log in to your Rudder server (as an example)
Use the username and password set up earlier and, in the menu on the left, click on Accept new nodes.
This will open the following screen
If (like this screen shot) you see no new nodes, you may have a firewall issue.
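One way to test for a firewall issue is to check, from a node, whether the server's ports can be reached. The sketch below assumes the agent talks to the server on TCP 443 and 5309, which is my understanding of Rudder 6.x and worth confirming against the documentation for your version. The function names are made up, and it relies on bash and the coreutils timeout command being installed.

```shell
# Assumption: agents reach the server on TCP 443 and 5309 (verify for your Rudder version).
RUDDER_AGENT_PORTS="${RUDDER_AGENT_PORTS:-443 5309}"

# port_open HOST PORT: succeeds if a TCP connection opens within 3 seconds.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# check_rudder_server HOST: print one line per port saying whether it is reachable.
check_rudder_server() {
  for port in $RUDDER_AGENT_PORTS; do
    if port_open "$1" "$port"; then
      echo "$1:$port reachable"
    else
      echo "$1:$port blocked or closed"
    fi
  done
}
```

Running check_rudder_server 192.168.100.12 on a node that does not show up under Accept new nodes gives a quick hint as to whether the network path is the problem.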
You should be able to select the new nodes and click on the green accept button. Now click on the List Nodes menu option
You will see a list of available nodes
Node Management is covered at
At this point, however, the dashboard may not show any useful data, and you may have to give Rudder about 10 minutes to collate the data, inventory and base compliance check for each node.
Before setting up checks on systems, it's useful to create groups of servers to run the checks on, because you probably won't want to run all the checks on all the servers.
Groups can be either static (containing the devices found in the initial search) or dynamic (devices added later which match the search criteria will be added to the group).
Groups are associated with the rules which we will set up next, and as such they should be set up before the rules are created.
Clicking on groups in the left hand menu will open the groups screen
You will notice there are lots of pre-existing groups. These are system-level groups, set up, as the name suggests, by the system.
User groups are ones which we can set up by clicking on the Create button.
The Create a new item box will pop up, and here we can choose the type of group. You'll need to provide a name for the group and a description, and nest it if needed.
Every time a change is made to Rudder it does so with an audit message which can be viewed in the event log.
Once the new group is created, select it from the tree on the left, and select Criteria on the top menu
Part of the reason for not being fussed about selecting Dynamic or Static when you create the group is that you can change it here.
Then a set of search options can be created using the drop-downs to find the machines you want in the group. As I've chosen to create a group for Ubuntu 20.04 machines, I run 2 checks, click on Search, and the machines are listed under Node name.
Click on Save in the top right.
It's possible to do much more with groups; however, this is a quick example.
Now I've set up my groups, I'll set up a check to make sure a service is running on the Ubuntu 20.04 servers.
Setup Techniques/Directives and Rules
This was the bit which took a little reading up on. While there is a set of rules which come out of the box for checking on a server, it's setting up your own checks which brings out the power of Rudder.
There are 3 layers for doing this, and in the menu system I found it easiest to work from the bottom up. So I'll give an example of checking that a service is running on a set of devices.
A technique is a place where you define a set of actions. You're not so much worried about variables, groups or machines here, just what you want the Technique to do, step by step. That could be installing software, checking for updates, checking CVE status: a myriad of things.
To create a new technique, in the left hand menu click Technique
This will take you to the Technique editor, a screen where you can create your own workflows of things to do
Click on the green create button which will take you to a screen where a Name and a description or any related links to the Technique are stored.
Note: you'll notice that there is a 1.0 next to the Header, this is the version control number.
Click on Save on the bottom right
This will open the Generic Methods screen to the right. This is a very long, searchable list of prebuilt things which Rudder can do (see: filter generic methods), and you can create your own methods should you so wish.
For this example I want to check that the rundeckd service is running on the Ubuntu boxes, so I double-click on "service started" to add it to the Technique.
The generic method needs a parameter, in this case the service name: rundeckd.
Click on Save and this technique is complete
Another example of a Technique is one with multiple generic methods, where a check is run to make sure that specific tools and software are installed on an Ubuntu box, and that specific services are started and running.
Scroll down the Generic Method list and you'll be hard pressed not to see something of use that you could check is installed, running or set up on a server.
I'd describe a Directive as a Technique with its settings filled in, which will be associated with a rule to run on a group of machines.
Open Directives in the left hand menu
This will open a list of built-in Technique/Directive matchups, and once you add your first technique you'll see it listed under User Techniques.
Select the Technique we just created
Click on Create Directive
We can set a short description, tags and the priority, and choose whether the directive runs in Audit mode, where it just checks whether the technique's conditions are met, or Enforce mode, where the technique is applied to the endpoint.
You'll note the Technique Version drop-down, meaning you can test or revert as needed.
Click on Save to create the Directive.
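To make the difference between Audit and Enforce mode concrete, here is a rough shell analogy for the rundeckd check. It is not how Rudder implements its "service started" method, and the function name, the SYSTEMCTL override and the result labels are all made up for the illustration.

```shell
# Analogy only: mimics the idea of Audit vs Enforce, not Rudder's implementation.
# SYSTEMCTL can be overridden (for example with a stub) to try the logic safely.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# check_service MODE SERVICE
#   MODE "audit" only reports; MODE "enforce" also tries to start a stopped service.
check_service() {
  mode="$1"
  service="$2"

  # Already running: nothing to do in either mode.
  if "$SYSTEMCTL" is-active --quiet "$service"; then
    echo "compliant"
    return 0
  fi

  # Not running: only Enforce mode is allowed to change the system.
  if [ "$mode" = "enforce" ] && "$SYSTEMCTL" start "$service"; then
    echo "repaired"
    return 0
  fi

  echo "non-compliant"
  return 1
}
```

Calling check_service audit rundeckd never changes the machine, whereas check_service enforce rundeckd will try to start the service if it finds it stopped.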
Finally there are rules. Rules are where we join Techniques, Directives and groups into something that runs on the remote servers.
Click on Rules in the left side menu
This will open the main Rules page
There is a categories section where new subcategories of rules can be created to collate rules and make them easier to manage. I've broken mine down into server type categories.
Once you have your categories set up, select the one you want to create a rule in and click Create in the top right.
Give your rule a name and description, and confirm the right category is selected.
A long screen will appear, where if you're using them you can add tags and a short description
Scroll down a little and you will see the Directives and Groups areas where we link a rule to each of these.
Click on Select Directives
Select the Is Rundeck Running Directive by clicking on the green + sign. You can select more than one directive here.
Your rule is now linked to a directive and knows what to run
Next the rule needs to be told which machines to run on
Click on Select Groups
Instead of running my check on all the Ubuntu servers, I created a static group just for the Rundeck server, as I will run more audits on this because it's my central automation server.
Click on the green + next to the Rundeck Servers group, again, you can select multiple groups to run rules against.
The Group has now been added to the rule, and the rule has all it needs to run.
Click on save
This will bring up an information screen for the rule. As the rule has not yet been run, it takes about 2 to 5 minutes, depending on the number of servers, before results show up.
First Check Done
This has now set up your first automated check in Rudder. I've made it purposely simple to go through the workflow as I understand it. All of this can be programmed via the API and automated itself, so it could be added as part of an Ansible run, for example.
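As a rough idea of what talking to the API from a script could look like, the helper below builds API URLs and makes authenticated GET requests with curl. RUDDER_URL and RUDDER_TOKEN are placeholders invented for this example; the token would come from an API account created in the Rudder web interface. The /rudder/api/latest path and the X-API-Token header reflect my reading of the Rudder API documentation, so check them against your version.

```shell
# Placeholders for illustration: point these at your own server and API token.
RUDDER_URL="${RUDDER_URL:-https://rudder.example.com}"
RUDDER_TOKEN="${RUDDER_TOKEN:-replace-with-your-token}"

# rudder_api_url BASE PATH: join the server URL and an API path without doubled slashes.
rudder_api_url() {
  printf '%s/rudder/api/latest/%s\n' "${1%/}" "${2#/}"
}

# rudder_get PATH: authenticated GET against the API, printing the JSON response.
rudder_get() {
  curl --silent --fail \
    --header "X-API-Token: ${RUDDER_TOKEN}" \
    "$(rudder_api_url "$RUDDER_URL" "$1")"
}
```

For example, rudder_get rules should list the rules created above, and rudder_get compliance should return compliance data that another tool could consume. Creating or changing objects uses other HTTP methods, which this read-only sketch leaves out.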
Once the rules have been added the Dashboards start to become much more useful
The information can be drilled down further, so if I click on Nodes in the top left
I get an overview of all the nodes Rudder is monitoring, their last check, etc.
If I then drill down further and select the Docker servers
Immediately I can get a real-time report on where my compliance is on this device and, if it is failing, a log of what is failing and why.
The Inventory tab gives a complete breakdown of software and hardware, and all of this inventory information can be accessed by the Technique Editor so rules can be created against the information
Technical logs are also visible for the servers.
Again, from what I'm reading, you could pull this data into Grafana using the API if needed.
Everything I've gone through with Rudder in this post is free out of the box. If you were to venture into paid land, then you start to open up the plugins for the product.
As well as the usual suspects like logo customization, you start to get into areas like API rights management, workflow validation, scalability and HA setups.
However for me the most useful plugin is the Security Benchmarking one
The CIS Benchmarks of the Center for Internet Security gives all the best practices in cybersecurity. Its technical orientation allows it to be quickly applied to systems.
This plugin provides a CIS rules packages that you can apply to all or part of your infrastructure. These rules can be customised to meet the particularities of your information system.
Supported on Debian 9, Ubuntu 18.04 LTS, RHEL and CentOS 7
This takes the product into the realm of CVE testing of all the nodes, which tests the packages installed on the servers against the CVE database and immediately flags possible security issues, which can then be reported on.
The documentation page is one of the better ones I've come across for a project such as this; the instructions are clear and well written.
Tutorial videos are also on YouTube.
Again, yes, there are many ways to skin this cat, with existing automation tools, monitoring tools and probably many others. That's not the point. The point is that it is good to have different systems auditing what you are deploying with automation tools, just as you'd get a professional auditor to do your finance audit rather than leave it to your CFO.
There is much more to this product than I've written here, potentially more than enough for a follow-up post.