How to configure Elasticsearch snapshots

These are a few notes on how to back up and restore Elasticsearch (ES) indices. This setup worked for me, but I’m not an ES expert by any means, so if there’s a better way or something is horribly wrong, let me know!

Mount the Shared Storage

The most basic form of snapshots, without using plugins for S3 or other distributed filesystems, uses a shared filesystem mounted on all nodes of the cluster. In my case this was an NFS share:

# yum -y install nfs-utils rpcbind
# systemctl enable rpcbind
# vim /etc/fstab
... add your mountpoint ...
# mkdir /nfs
# mount /nfs
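
Before going further, it can help to confirm on every node that the share is really there and writable. The following is only a minimal sketch: check_repo_dir is a hypothetical helper name, and the write probe is an assumption about what is worth testing, not a step Elasticsearch requires.

```shell
#!/bin/bash

# Hypothetical helper: succeeds only if $1 is an existing directory
# in which the current user can create (and remove) a file.
check_repo_dir() {
    local dir="$1" probe
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # mktemp fails if the directory is not writable
    if ! probe=$(mktemp "${dir}/.es-write-test.XXXXXX" 2>/dev/null); then
        echo "not writable: $dir"
        return 1
    fi
    rm -f "$probe"
    echo "ok: $dir"
}
```

Running check_repo_dir /nfs as the user Elasticsearch runs as (typically elasticsearch on package installs) should print ok: /nfs on each node.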

Set the Shared Storage as repository for ES

After the mount point is configured, you need to declare it as a repository path in the ES configuration. Edit /etc/elasticsearch/elasticsearch.yml on every node and add the key:

path.repo: /nfs

You’ll need to restart each node after the change and wait for it to rejoin the cluster.
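
Since a single node with a missing key is enough to make the repository unusable, a quick check of the config file before each restart may save some time. This is a rough sketch: configured_repo_path is a made-up helper, and it only understands the single-line form of path.repo used above, not YAML lists.

```shell
#!/bin/bash

# Hypothetical helper: print the value of a single-line "path.repo: ..." entry.
# Commented-out lines and YAML list syntax are deliberately not handled.
configured_repo_path() {
    local config="${1:-/etc/elasticsearch/elasticsearch.yml}"
    sed -n 's/^path\.repo:[[:space:]]*//p' "$config"
}
```

With the configuration shown above, configured_repo_path should print /nfs; empty output means the key is missing or commented out on that node.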

Create the snapshot repository in ES

Once the configuration is in place, you should be able to create your snapshot repository. I made this script based on the official documentation:

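#!/bin/bash

repo_name="backup"
# a relative location is resolved against path.repo (/nfs here)
repo_location="backup-weekly"

# register a shared-filesystem ("fs") repository; the JSON body is read from stdin
/usr/bin/curl -XPUT "http://localhost:9200/_snapshot/${repo_name}?pretty" -H 'Content-Type: application/json' -d @- <<EOF
{
  "type": "fs",
  "settings": {
    "location": "${repo_location}"
  }
}
EOF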

This stores the snapshots in /nfs/backup-weekly/ (the location is relative to path.repo) and names the repository backup.
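
After registering the repository, the cluster can be asked to check that every node is able to access it, through the _verify endpoint of the snapshot API. A minimal sketch, assuming the same localhost:9200 endpoint as above; ES_URL, CURL and verify_repo are names made up for illustration.

```shell
#!/bin/bash

# Assumptions: both variables are illustrative and can be overridden.
ES_URL="${ES_URL:-http://localhost:9200}"
CURL="${CURL:-/usr/bin/curl}"

# Ask the cluster to verify that all nodes can access the repository $1.
verify_repo() {
    local repo="$1"
    "$CURL" -sS -XPOST "${ES_URL}/_snapshot/${repo}/_verify?pretty"
}
```

Calling verify_repo backup should return the list of nodes that passed the check; an error here usually points at path.repo or at the permissions of the share on one of the nodes.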

Create your first snapshot

Now you should be able to create your first snapshot. I created another script that takes one snapshot per day, named after the day of the week, so each snapshot is replaced a week later. Please note: make sure the path to date is correct. If that command fails, snapshot_name will be empty and the DELETE request will delete the repository instead of the snapshot!

#!/bin/bash

repo_name="backup"
# lowercase English weekday name, e.g. "friday"
snapshot_name=$(LC_ALL=C /usr/bin/date +%A | tr '[:upper:]' '[:lower:]')

target="vip-es"
log="/var/log/es-backup.log"

# abort if date failed: with an empty name the DELETE below would hit the repository itself
if [ -z "$snapshot_name" ]; then
    echo "$(date) ERROR: empty snapshot name, aborting" >> "$log"
    exit 1
fi

# delete the old snapshot (if any)
echo "$(date) DELETE the old snapshot: $snapshot_name" >> "$log"
/usr/bin/curl -XDELETE "http://${target}:9200/_snapshot/${repo_name}/${snapshot_name}" >> "$log"

# create the new snapshot and wait until it has finished
echo "$(date) CREATE the new snapshot: $snapshot_name" >> "$log"
/usr/bin/curl -XPUT "http://${target}:9200/_snapshot/${repo_name}/${snapshot_name}?wait_for_completion=true&pretty" >> "$log"

The log file should then contain something like:

ven 9 feb 2018, 01.31.01, CET DELETE the old snapshot: friday
{"error":{"root_cause":[{"type":"snapshot_missing_exception","reason":"[backup:friday] is missing"}],"type":"snapshot_missing_exception","reason":"[backup:friday] is missing"},"status":404}
ven 9 feb 2018, 01.31.01, CET CREATE the new snapshot: friday
{
  "snapshot" : {
    "snapshot" : "friday",
    "uuid" : "12345679-20212223",
    "version_id" : 6010199,
    "version" : "6.1.1",
    "indices" : [
      "test_configuration",
      ".kibana"
    ],
    "state" : "SUCCESS",
    "start_time" : "2018-02-09T00:31:01.586Z",
    "start_time_in_millis" : 1518136261586,
    "end_time" : "2018-02-09T00:31:04.362Z",
    "end_time_in_millis" : 1518136264362,
    "duration_in_millis" : 2776,
    "failures" : [ ],
    "shards" : {
      "total" : 25,
      "failed" : 0,
      "successful" : 25
    }
  }
}

In this case the DELETE failed because I didn’t have a previous snapshot for the current day.
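
Because the responses only end up in the log, a failed or partial snapshot is easy to miss. One possible way to catch it is to look at the state field of the create response. This is a sketch under assumptions: snapshot_succeeded is a hypothetical helper name, and the grep pattern relies on the JSON layout shown above.

```shell
#!/bin/bash

# Hypothetical helper: read a snapshot response on stdin and succeed
# only if it reports "state": "SUCCESS" (pretty-printed or compact).
snapshot_succeeded() {
    grep -Eq '"state"[[:space:]]*:[[:space:]]*"SUCCESS"'
}

# Possible use in the backup script (variables as defined there):
#   response=$(/usr/bin/curl -XPUT "http://${target}:9200/_snapshot/${repo_name}/${snapshot_name}?wait_for_completion=true&pretty")
#   echo "$response" >> /var/log/es-backup.log
#   echo "$response" | snapshot_succeeded || echo "$(date) snapshot $snapshot_name did not succeed" >> /var/log/es-backup.log
```

Any other state (for example PARTIAL or FAILED) or an error response makes the helper return a non-zero exit code, which a cron job or a monitoring check could then pick up.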

List the available snapshots

To operate on the snapshots I made another script to list them by name, using jq. You’ll need to install it first (on CentOS 7: yum -y --enablerepo=epel install jq).

#!/bin/bash

repo_name="backup"

/usr/bin/curl -sS "http://localhost:9200/_snapshot/${repo_name}/_all" | jq '.snapshots[] | .snapshot,.end_time'

The output is just a list of snapshot names and their timestamps:

# bash list_snapshots.sh
"wednesday"
"2018-02-07T02:36:05.564Z"
"thursday"
"2018-02-08T02:37:10.403Z"
"friday"
"2018-02-09T02:31:04.362Z"

Restore a snapshot

No backup can be considered “good” without testing a restore from it. So I made another script to test how the restore would work on a separate test environment:

#!/bin/bash

repo_name='prod'
snap_name='wednesday'

# close every existing index: a restore cannot overwrite an index that is open
for index_name in $(/usr/bin/curl -sS "http://localhost:9200/_aliases" | /usr/bin/jq -r 'keys[]'); do
    /usr/bin/curl -XPOST "http://localhost:9200/${index_name}/_close"
done

# restore all the indices contained in the snapshot
/usr/bin/curl -XPOST "http://localhost:9200/_snapshot/${repo_name}/${snap_name}/_restore?pretty"

I’m pretty sure there must be a better way to do this. The script gets the list of all current indices and closes them one by one (because you can’t restore over an index that is currently open), then restores the snapshot I copied over from the other environment.

It’s pretty horrible, but it works. If you know a better way, let me know and I’ll change it; if you don’t… well, it works :)
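
One alternative that avoids closing anything is to restore the indices under a different name, using the rename_pattern and rename_replacement options of the restore API. Treat the following as an untested sketch under stated assumptions: restore_body, restore_renamed, ES_URL, CURL and the restored_ prefix are all made-up names.

```shell
#!/bin/bash

# Assumptions: both variables are illustrative and can be overridden.
ES_URL="${ES_URL:-http://localhost:9200}"
CURL="${CURL:-/usr/bin/curl}"

# Build the JSON body: restore the indices in $1 (comma-separated names or
# wildcards) and prepend the prefix $2 to each restored index name.
restore_body() {
    local indices="$1" prefix="${2:-restored_}"
    printf '{"indices":"%s","rename_pattern":"(.+)","rename_replacement":"%s$1"}\n' "$indices" "$prefix"
}

# Restore snapshot $2 from repository $1 next to the existing, open indices.
restore_renamed() {
    local repo="$1" snap="$2" indices="${3:-*}"
    "$CURL" -sS -XPOST "${ES_URL}/_snapshot/${repo}/${snap}/_restore?pretty" \
        -H 'Content-Type: application/json' \
        -d "$(restore_body "$indices")"
}
```

With the snapshot from the earlier log, restore_renamed prod wednesday would be expected to create restored_test_configuration and restored_.kibana alongside the live indices, which can then be compared with them or swapped in through aliases.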
