Test Driven Infrastructure with Goss

I was looking into tools to help write Ansible playbooks and ended up on the Molecule website. The project looks really interesting, but what caught my attention first was the option to test your playbooks with one of three frameworks: Goss, Serverspec or Testinfra.

Goss is a very young project but at the same time it has 3 features that got me interested:

  • It’s tiny and compact: a single static Go binary without external dependencies.
  • It’s fast.
  • It seems very easy to use without sacrificing too much power.

I had a few basic checks created years ago for my servers, so I set out to see how much work it would take to port them from Bash (ewww, I know) to Goss.

I ended up rewriting most of them in less than 3 hours, and now:

  • they are way easier to read
  • they cover a lot more functionality
  • and it only took 3 hours, even though I had to deal with a few quirks and some trial and error where the documentation was too sparse
    (it may be better by the time you read this: I sent a pull request to the project with some improvements to the documentation)

Excited yet? Let’s see how to get started. First thing, you’ll want goss.

$ curl -L -o goss https://github.com/aelsabbahy/goss/releases/download/v0.2.4/goss-linux-amd64
$ chmod +x goss
$ ./goss --help
NAME:
   goss - Quick and Easy server validation
[...]

The README on the official website has a very nice “45 seconds introduction” that I recommend checking out if you want a quick idea of what Goss can do.

I’ll start a bit slower and talk you through some considerations I made after a few hours working with it.

Goss files are YAML or JSON files describing the tests you want to run to validate your system. Goss has a cool autoadd feature that automatically creates a few predefined tests for a given resource. Let’s start from that:

# ./goss -g httpd.yaml autoadd httpd
Adding Package to 'httpd.yaml':
httpd:
  installed: true
  versions:
  - 2.2.15

Adding Process to 'httpd.yaml':
httpd:
  running: true

Adding Port to 'httpd.yaml':
tcp6:80:
  listening: true
  ip:
  - '::'

Adding Service to 'httpd.yaml':
httpd:
  enabled: true
  running: true


# cat httpd.yaml
package:
  httpd:
    installed: true
    versions:
    - 2.2.15
port:
  tcp6:80:
    listening: true
    ip:
    - '::'
service:
  httpd:
    enabled: true
    running: true
process:
  httpd:
    running: true

So, we already have a bare-bones test suite to make sure our webserver is up and running. It checks that the package is installed and at the expected version, that something is listening on port 80 on all addresses (tcp6 ::), that the httpd service is enabled at boot time and currently running, and that the httpd process currently shows up in the process list.

Please note that “service running” will get the data from upstart/systemd while “process running” will actually check the process list: if the httpd process is running but the service is not, then something went wrong!

Let’s try to run our basic test suite:

# ./goss -g httpd.yaml validate --format documentation
Process: httpd: running: matches expectation: [true]
Port: tcp6:80: listening: matches expectation: [true]
Port: tcp6:80: ip: matches expectation: [["::"]]
Package: httpd: installed: matches expectation: [true]
Package: httpd: version: matches expectation: [["2.2.15"]]
Service: httpd: enabled: matches expectation: [true]
Service: httpd: running: matches expectation: [true]

Total Duration: 0.015s
Count: 14, Failed: 0, Skipped: 0

All green! Our tests passed.

Now let’s say we want to make sure that httpd runs with a ServerLimit of 200 clients. Goss allows us to check a file’s content using powerful regular expressions. For example, we can add to our httpd.yaml file:

file:
  /etc/httpd/conf/httpd.conf:
    exists: true
    contains:
    - "/^ServerLimit\\s+200$/"

We’re saying that we want a line starting with ServerLimit, followed by one or more spaces or tabs, and then 200 at the end of the line. Let’s run our suite again and see if it works:
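To see what that pattern accepts and rejects, here is a small sketch in Python. It is only an approximation: Goss uses Go's regexp engine, and I'm assuming the pattern is tested against one line at a time. The sample lines and the `matches_server_limit` helper are made up for illustration.

```python
import re

# Same pattern as in httpd.yaml, without the surrounding slashes.
# In the YAML double-quoted string the backslash is doubled ("\\s");
# the regex engine itself sees a single \s.
SERVER_LIMIT = re.compile(r"^ServerLimit\s+200$")


def matches_server_limit(line: str) -> bool:
    """Return True if a single config line satisfies the pattern."""
    return SERVER_LIMIT.search(line) is not None


# Hypothetical sample lines, not taken from a real httpd.conf.
samples = [
    "ServerLimit 200",      # single space: matches
    "ServerLimit\t\t200",   # tabs count as whitespace too: matches
    "ServerLimit 2000",     # different value: no match
    "#ServerLimit 200",     # commented out: no match
    "ServerLimit 200 ",     # trailing space: no match
]

for line in samples:
    print(repr(line), matches_server_limit(line))
```

Note the last case: because of the `$` anchor, a stray trailing space is enough to make the check fail.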

# ./goss -g httpd.yaml validate --format documentation
File: /etc/httpd/conf/httpd.conf: exists: matches expectation: [true]
File: /etc/httpd/conf/httpd.conf: contains: matches expectation: [/^ServerLimit\s+200$/]
[...]
Count: 18, Failed: 0, Skipped: 0

All green again! Our server looks to be in good shape. Let’s add another check: this time we want to make sure the DocumentRoot directory exists. We add it to the list:

file:
  /etc/httpd/conf/httpd.conf:
    exists: true

service:
  httpd:
    enabled: true
    running: true

file:
  /var/www/html:
    filetype: directory
    exists: true

But if we run this suite we’ll notice that our previous check on httpd.conf doesn’t run anymore. This happens because the goss file describes a nested data structure: the second file entry overwrites the first, and you’ll end up scratching your head, wondering why your first test hasn’t been run.
In JSON it would have been more obvious:

{
  "file": {
    "/etc/httpd/conf/httpd.conf": {
      "exists": true
    }
  },

  "service": {
    "httpd": {
      "enabled": true,
      "running": true
    }
  },

  "file": {
    "/var/www/html": {
      "filetype": "directory",
      "exists": true
    }
  }
}

See how the second file entry overwrites the first one? Keep that in mind!
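Most parsers won't warn you about this. As a quick illustration (using Python's standard json module rather than Goss itself, so take it as an analogy), the snippet below shows the last duplicate key winning silently. It also defines a small hypothetical helper, `reject_duplicates`, that turns the problem into an error:

```python
import json

DOC = """
{
  "file": {"/etc/httpd/conf/httpd.conf": {"exists": true}},
  "service": {"httpd": {"enabled": true, "running": true}},
  "file": {"/var/www/html": {"filetype": "directory", "exists": true}}
}
"""

# Default behaviour: the second "file" object silently replaces the first.
parsed = json.loads(DOC)
print(list(parsed["file"]))  # ['/var/www/html']


def reject_duplicates(pairs):
    """object_pairs_hook that raises on repeated keys within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


try:
    json.loads(DOC, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)  # duplicate key: 'file'
```

A check like this could run as a lint step before the test suite, so a copy-and-paste mistake is caught before it hides a test.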

Since you’ll probably want to keep your tests in different files, let’s talk quickly about how to manage that. For example, let’s create a new file to monitor a fileserver mount:

# ./goss -g fileserver.yaml add mount /mnt/nfs
Adding Mount to 'fileserver.yaml'
[...]
# cat fileserver.yaml
mount:
  /mnt/nfs:
    exists: true
    opts:
    - rw
    - nodev
    - noexec
    - relatime
    source: vip-nfs.stardata.lan:/data/nfs
    filesystem: nfs

If we want to check both fileserver.yaml and httpd.yaml at the same time, we’ll need to use the gossfile directive, creating a new file that includes the other two:

# ./goss -g all.yaml add goss httpd.yaml
# ./goss -g all.yaml add goss fileserver.yaml
# cat all.yaml
gossfile:
  fileserver.yaml: {}
  httpd.yaml: {}

# ./goss -g all.yaml validate
.............

Total Duration: 0.016s
Count: 13, Failed: 0, Skipped: 0

If we want to get a single file containing all the tests, we can use the render command:

# ./goss -g all.yaml render
file:
  /etc/httpd/conf/httpd.conf:
    exists: true
    contains:
    - /^ServerLimit\s+200$/
package:
  httpd:
    installed: true
    versions:
    - 2.2.15
port:
  tcp6:80:
    listening: true
    ip:
    - '::'
service:
  httpd:
    enabled: true
    running: true
process:
  httpd:
    running: true
mount:
  /mnt/nfs:
    exists: true
    opts:
    - rw
    - nodev
    - noexec
    - relatime
    source: vip-nfs.stardata.lan:/data/nfs
    filesystem: nfs

This way we can easily distribute the test suite since it’s a single file.
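Conceptually, rendering amounts to merging the included files by resource type. The sketch below is not Goss's actual implementation, just a plain-Python illustration of the idea, with a hypothetical `merge_suites` helper and trimmed-down dictionaries standing in for the two files:

```python
def merge_suites(*suites):
    """Merge test suites keyed by resource type (file, package, ...)."""
    merged = {}
    for suite in suites:
        for resource_type, checks in suite.items():
            # setdefault keeps resources added earlier and adds the new
            # ones, instead of replacing the whole resource type.
            merged.setdefault(resource_type, {}).update(checks)
    return merged


# Trimmed-down stand-ins for httpd.yaml and fileserver.yaml.
httpd = {
    "package": {"httpd": {"installed": True}},
    "service": {"httpd": {"enabled": True, "running": True}},
}
fileserver = {
    "mount": {"/mnt/nfs": {"exists": True, "filesystem": "nfs"}},
}

combined = merge_suites(httpd, fileserver)
print(sorted(combined))  # ['mount', 'package', 'service']
```

Merging per resource type is the opposite of the duplicate-key pitfall: two files can both contribute `file` checks without one wiping out the other.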

I hope to have sparked some interest in this tool. It’s still very basic (for example, it doesn’t support variables or loops yet), but it’s great for quickly writing some tests to make sure your servers are configured and working as intended!

Stuff I’m gonna look into: Graphite and collectd

In the eternal struggle against crappy monitoring and alerting systems, my next step will be to check out collectd and Graphite.

The post that sparked my interest is: Graphite alerts with Monit. I will probably avoid Monit since it’s not flexible enough for what I have in mind, but it’s a good read.

Why collectd? Because I read that it works with Graphite and already provides lots of stats (CPU and memory usage, MySQL server stats, etc.).

Why Graphite? Because it provides a very easy way to store metrics. Basically, all you need to do is send a plain text message to Carbon (a component of Graphite) with the metric name, a numerical value and a UNIX epoch timestamp:

PORT=2003
SERVER=graphite.your.org
echo "local.random.diceroll 4 $(date +%s)" | nc "${SERVER}" "${PORT}"
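
The same thing from a script, in case nc isn't handy: a minimal Python sketch, assuming Carbon's plaintext listener is on TCP port 2003 as above. `format_metric` and `send_metric` are names I made up, and the server name is the same placeholder as in the shell example.

```python
import socket
import time


def format_metric(name, value, timestamp=None):
    """Build one newline-terminated line: metric name, value, epoch timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{name} {value} {int(timestamp)}\n"


def send_metric(name, value, server="graphite.your.org", port=2003, timeout=5.0):
    """Open a TCP connection to Carbon and send a single metric line."""
    line = format_metric(name, value)
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Print the line instead of sending it, so this runs without a server.
print(format_metric("local.random.diceroll", 4, timestamp=1311836008), end="")
```

To actually push the value, call `send_metric("local.random.diceroll", 4)` with `server` pointing at your own Graphite host.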

Collect your jaw from the floor and keep reading: not only does Graphite provide a web interface that allows you to easily chart your metrics, you can even retrieve the data for arbitrary time ranges in structured formats! Wanna see the data for a specific metric in the last few minutes? Send a request like:

http://graphite-server/render?target=your.metric.name&from=-5m&format=json

And you’ll get the data in JSON format:

[{
  "target": "your.metric.name",
  "datapoints": [
    [1.0, 1311836008],
    [2.0, 1311836009],
    [3.0, 1311836010],
    [5.0, 1311836011],
    [6.0, 1311836012]
  ]
}]
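
Once you have that response, working with it takes a few lines. Here is a rough Python sketch that only relies on the `datapoints` list shown above, where each entry is a value followed by its timestamp. The `summarize` helper is my own invention, and skipping null values is an assumption about how gaps in the data are reported.

```python
import json

# The sample response from above, trimmed to the part this sketch uses.
RESPONSE = """
[{
  "datapoints": [
    [1.0, 1311836008],
    [2.0, 1311836009],
    [3.0, 1311836010],
    [5.0, 1311836011],
    [6.0, 1311836012]
  ]
}]
"""


def summarize(series):
    """Return (count, minimum, maximum, average) of a series' values."""
    # Each datapoint is a [value, timestamp] pair; null values are
    # skipped on the assumption that they mark intervals with no data.
    values = [value for value, _ts in series["datapoints"] if value is not None]
    return len(values), min(values), max(values), sum(values) / len(values)


for series in json.loads(RESPONSE):
    print(summarize(series))
```

This is the kind of building block an alerting script could use: fetch the last few minutes, summarize, and compare against a threshold.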

It even works with wildcards, so if all your MySQL server metrics are stored with the word mysql in their name, you can chart or retrieve the data by asking for:

http://graphite-server/render?target=your.metric.*.mysql.*&from=-5m&format=json
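
If you build these URLs from a script, it's safer to let a library handle the query string. A small Python sketch follows. `render_url` is a hypothetical helper, and graphite-server is the same placeholder host as above.

```python
from urllib.parse import urlencode


def render_url(base, target, since="-5m", fmt="json"):
    """Build a render URL with the target, from and format parameters."""
    # safe="*" keeps wildcards readable instead of encoding them as %2A.
    query = urlencode({"target": target, "from": since, "format": fmt}, safe="*")
    return f"{base}/render?{query}"


print(render_url("http://graphite-server", "your.metric.*.mysql.*"))
```

Changing `since` to something like `"-1h"` widens the time window without touching the rest of the URL.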

Graphite also provides powerful aggregation functions that you can use on your data.

If you, like me, are now interested in Graphite and collectd, check these links: