How to cleanup and shrink disk space usage of a Windows KVM virtual machine

We still need Windows VMs (sadly, for a few tools we’re trying to get rid of), and my VM grew so much that its image was up to 60 GB. With my laptop having only a 256 GB SSD, things were getting pretty crowded. So I set out to clean up the Windows image and shrink it down as much as possible, and I managed to get it down to 13 GB.

Since I’m not very familiar with Windows, I leveraged the knowledge of the Internet and started cleaning my system using the tips from this article: I ran CCleaner, removed old files, and uninstalled unused software. Then I moved on to the less obvious ways to free space. I opened an administrator console and proceeded to remove the shadow copies:

vssadmin delete shadows /for=c: /all

and I consolidated the Service Pack on disk, to get rid of a lot of backups from C:\windows\winsxs\:

dism /online /cleanup-image /spsuperseded

There are a few more things you can do to save space in that directory, especially if you run Windows 8.1, Server 2012 or newer; it’s worth checking this Microsoft Technet article.

Once I had cleaned up as much space as possible, I ran the Windows defrag utility to cluster up the remaining data, and then went on to fill the rest of the disk with zeroes. Think of it as doing dd if=/dev/zero of=/zero.img: you’re creating a file containing only zeroes, so that those clusters will read as “empty” during the shrinking.
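On a Linux guest, the same zero-fill idea can be sketched like this (capped at 4 MiB in /tmp so it’s safe to run; the real thing would drop count= and let dd stop only when the disk is full):

```shell
# Create a filler file of zeroes, flush it, then delete it: the freed
# clusters now contain only zeroes, just like after sdelete -z.
zf=$(mktemp /tmp/zero.XXXXXX.img)
dd if=/dev/zero of="$zf" bs=1M count=4 2>/dev/null
sync

# Sanity check: the filler file really is all zeroes
nonzero=$(tr -d '\0' < "$zf" | wc -c)
rm -f "$zf"
echo "non-zero bytes: $nonzero"
```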

On Windows, the recommended tool to zero-fill your disk seems to be SDelete. I ran it as administrator in a cmd console:

sdelete -z c:

This took a long time. Hours. Best thing would probably have been to run it overnight: learn from my mistakes! :)

Note: if you have a thin disk (for example a qcow2 image), filling it up with zeroes will actually consume space on the host, up to the maximum size of the virtual disk. In my case, the image grew from a bit more than 60G to 200G. A necessary, and temporary, sacrifice.

ls -l /var/lib/libvirt/images/
[...]
-rw-r--r-- 1 root root 200G 31 dic 16.34 win7_orig.img
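You can reproduce the apparent-versus-allocated effect with a plain sparse file, which behaves like a thin image in this respect (a hypothetical scratch file stands in for the real image under /var/lib/libvirt/images/):

```shell
# A sparse 1 GiB file: huge apparent size, (almost) no blocks allocated
img=$(mktemp /tmp/thin.XXXXXX.img)
truncate -s 1G "$img"

apparent=$(stat -c %s "$img")                # what ls -l shows
allocated=$(( $(stat -c %b "$img") * 512 ))  # what the host really pays

# Writing zeroes through the filesystem allocates real blocks:
# exactly why sdelete -z balloons a thin image up to its virtual size
dd if=/dev/zero of="$img" bs=1M count=8 conv=notrunc 2>/dev/null
sync
after=$(( $(stat -c %b "$img") * 512 ))
rm -f "$img"
echo "apparent=$apparent allocated=$allocated after_write=$after"
```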

After SDelete finished running (and everything had synced to disk), I shut down the VM and prepared for the next step: shrinking the actual disk image. Thankfully, qemu-img allows you to convert an image to the same format, discarding any empty cluster along the way (remember? we filled them with zeroes, so they are empty!).

In my case, I ran two processes in parallel, because I wanted to see how much of a difference it would make to have a compressed image versus a non-compressed image, as suggested by this Proxmox wiki page:

cd /var/lib/libvirt/images/
qemu-img convert -O qcow2 win7_orig.img win7_nocomp.img &
qemu-img convert -O qcow2 -c win7_orig.img win7_compress.img &
watch ls -l

This process didn’t take too long, less than one hour, and the result was pretty interesting:

ls -l /var/lib/libvirt/images/
[...]
-rw-r--r-- 1 root root  13G  1 gen 18.13 win7_compress.img
-rw-r--r-- 1 root root  31G 31 dic 19.09 win7_nocomp.img
-rw-r--r-- 1 root root 200G 31 dic 16.34 win7_orig.img

The compressed image is less than half the non-compressed one, but you’ll use a bit more CPU when using it. In my case this is completely acceptable, because saving disk space is more important.

How to install and use SPICE for VMs in Debian, Ubuntu or Mint

SPICE is a suite of tools for interfacing with desktop-oriented Virtual Machines. I’ve been using it for a couple of years, on Fedora and CentOS systems, mostly for Windows VMs that I required for work.

Until recently, it was fairly complicated to get SPICE working on Debian-based systems, but I’ve just installed it and got it working on Mint. Thankfully, nowadays you don’t need to recompile anything: all the patches and support are included by default, and you only need to install these packages:

# apt-get update
# apt-get install virt-manager libvirt-daemon python-spice-client-gtk qemu-kvm-spice virt-viewer spice-vdagent qemu-utils  gir1.2-spice-client-gtk-3.0 gir1.2-spice-client-gtk-2.0 gir1.2-spice-client-glib-2.0

After this, I just created a new VM with virt-manager and it had SPICE enabled by default.

For more information, I recommend checking:

RHCSA – Use grep and regular expressions to analyze text

These are some notes on using grep, based on the man page and on personal experience. The tests were run on Scientific Linux 6 (before CentOS 6 was even released).

Basic syntax (the line starting with # is a comment):

# grep pattern file(s)
grep '127.0.0.1' /etc/*
/etc/hosts:127.0.0.1 localhost localhost.localdomain
[...]

By default, grep takes stdin as its input file, so you can chain it onto the output of other commands:

# command1 | grep pattern
ifconfig | grep 'inet addr'
  inet addr:127.0.0.1  Mask:255.0.0.0

With the -e option you can specify one or more patterns:

# grep -e pattern1 -e pattern2 file(s)
grep -e Linux -e '127.0.0.1' /etc/*
/etc/hosts:127.0.0.1   localhost localhost.localdomain
[...]
/etc/redhat-release:Scientific Linux release 6.0 (Carbon)

With the -i option, case is ignored (this slows execution down considerably):

# create a file containing two lines, the first containing 'Foo'
echo 'Foo' > casetest.txt
# the second contains 'foo' (lowercase)
echo 'foo' >> casetest.txt
# with a plain grep, using 'foo' (lowercase) as the pattern, I get
grep foo casetest.txt
foo
# with the -i option instead
grep -i foo casetest.txt
Foo
foo

With the -v option the output is inverted (it prints the lines that do not match the pattern):

# reuse the file from before, adding 'Bar'
echo Bar >> casetest.txt
# grep -v pattern file(s)
grep -v foo casetest.txt
# note: we did not use -i, so 'Foo' is treated differently from 'foo'
# and, since it does not match the pattern, it gets printed
Foo
Bar

With -c, instead of printing the matching lines, grep prints a count of how many lines match the pattern in each file:

# grep -c pattern file(s)
grep -c Linux /etc/*
/etc/dhcp:0
[...]
/etc/redhat-release:1

With the -l and -L options you get lists of the files that do or do not contain a pattern:

# grep -l pattern file(s)
grep -l Linux /etc/*
/etc/grub.conf
/etc/issue
/etc/issue.net
[...]

# grep -L pattern file(s)
grep -L Linux /etc/*
/etc/dhcp
/etc/yum.conf
[...]

The -H option tells grep to always print the file name next to each match. This is the default behavior when there are multiple files (see the previous examples).
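For instance (using a scratch file to keep the example self-contained):

```shell
# Without -H, a single-file grep prints the bare line; with -H the
# file name is prefixed, as it would be with multiple files.
f=$(mktemp)
echo 'Scientific Linux' > "$f"
plain=$(grep Linux "$f")
prefixed=$(grep -H Linux "$f")
rm -f "$f"
echo "$plain"
echo "$prefixed"
```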

The -n option enables printing the line number next to each match:

# grep -n pattern file(s)
grep -n Linux /etc/*
/etc/grub.conf:14:title Scientific Linux (2.6.32-71.18.2.el6.x86_64)
[...]

With the -Z option grep will use a null byte as the file-name separator. Useful with -l or -L, in combination with other commands that can take a null-separated list of file names as input, such as xargs -0. It is needed to handle file names containing spaces or other special characters.

# grep -lZ pattern file(s) | command -option_for_null_input
grep -lZ Linux /etc/* | xargs -0 ls -lh
lrwxrwxrwx. 1 root root   22  4 feb 17:04 /etc/grub.conf -> ../boot/grub/grub.conf
-rw-r--r--. 1 root root   58 24 feb 21:12 /etc/issue
-rw-r--r--. 1 root root   57 24 feb 21:12 /etc/issue.net
-rw-r--r--. 1 root root 1,9K 23 nov 19:53 /etc/mail.rc
-rw-r--r--. 1 root root 4,9K 23 nov 22:37 /etc/oddjobd.conf
lrwxrwxrwx. 1 root root   15  4 feb 17:00 /etc/rc.sysinit -> rc.d/rc.sysinit
-rw-r--r--. 1 root root   38 24 feb 21:12 /etc/redhat-release
-rw-r--r--. 1 root root 6,4K 24 nov 07:52 /etc/smartd.conf
-rw-r--r--. 1 root root  822 24 nov 23:50 /etc/sysctl.conf
lrwxrwxrwx. 1 root root   14 21 mar 05:00 /etc/system-release -> redhat-release

The context options specify how many lines of context to extract in addition to the line matching the pattern. -A selects the lines after the match (After), -B the lines before it (Before), and -C a symmetric number of lines both before and after (Context).

# print two lines before and 5 after the match
grep -B2 -A5 Linux /etc/*
[...]
--
/etc/smartd.conf-# the 3w-xxxx driver. Start long self-tests Sundays between 1-2, 2-3, 3-4,
/etc/smartd.conf-# and 4-5 am.
/etc/smartd.conf:# NOTE: starting with the Linux 2.6 kernel series, the /dev/sdX interface
/etc/smartd.conf-# is DEPRECATED.  Use the /dev/tweN character device interface instead.
/etc/smartd.conf-# For example /dev/twe0, /dev/twe1, and so on.
/etc/smartd.conf-#/dev/sdc -d 3ware,0 -a -s L/../../7/01
/etc/smartd.conf-#/dev/sdc -d 3ware,1 -a -s L/../../7/02
/etc/smartd.conf-#/dev/sdc -d 3ware,2 -a -s L/../../7/03
--
[...]

The -r (or -R) option enables recursive search within the given paths:

# grep -r pattern path(s)
grep -r Linux /etc/*
[...]

The -z option makes grep treat its input (and output) as null-separated records rather than newline-terminated lines, handy when consuming null-separated file lists (see -Z).
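A quick illustration, feeding grep the kind of null-separated list that find -print0 (or grep -lZ itself) would produce:

```shell
# Two null-terminated records; -z makes grep split on the null byte.
# tr turns the null back into a newline just to display the result.
zmatch=$(printf 'notes.txt\0photo album.jpg\0' | grep -z 'txt' | tr '\0' '\n')
echo "$zmatch"
```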

Patterns and regular expressions

The patterns used with grep can be plain strings, or regular expressions that allow matching multiple strings.

# grep -e pattern1 -e pattern2 file(s)
grep -e 'foo' -e 'Foo' casetest.txt
Foo
foo
grep -e '[Ff]oo' casetest.txt
Foo
foo
grep -e '.oo' casetest.txt
Foo
foo

We have seen two regular-expression syntaxes: the first uses a bracket expression, i.e. a list of characters enclosed in square brackets. In the second, we used the dot (.), which matches any single character.

With bracket expressions, we can list all the letters (uppercase and lowercase) that satisfy the expression, or invert the test by putting a caret as the first character of the bracket expression:

echo aoo >> casetest.txt
grep -e '[^abc]oo' casetest.txt
Foo
foo
grep -e '.oo' casetest.txt
Foo
foo
aoo

Careful: if the caret appears at the beginning of the pattern, but not inside square brackets, it means the expression must start at the beginning of the line.

echo snafooz >> casetest.txt
grep -e '^foo' casetest.txt
foo
grep -e 'foo' casetest.txt
foo
snafooz

The opposite of the caret is the dollar sign, used to find strings at the end of a line:

grep -e 'foo$' casetest.txt
foo
grep -e 'foo' casetest.txt
foo
snafooz

In this case, foo is the only string on its line, so it sits both at the beginning and at the end of the line; to verify that, we can combine the two anchors:

grep -e '^foo$' casetest.txt
foo

Back to bracket expressions: you can use a minus sign (-) to create a range; for example, we might want to match all the numbers:

seq 100 | grep -e '[0-9]7'
17
27
37
47
57
67
77
87
97

Here seq generated the list of numbers from 1 to 100. The regular expression says to match only those containing a digit from 0 to 9 followed by the digit 7. By default seq generates the numbers below 10 without a leading zero, so the bare 7 was skipped by grep, while all the others satisfy the regular expression and were printed.
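To confirm that the missing zero-padding is what excluded the 7, try seq -w, which pads the numbers to equal width: now 07 has a digit in front of the 7 and matches too:

```shell
# seq -w zero-pads its output: 01 02 ... 10
padded=$(seq -w 10 | grep -e '[0-9]7')
echo "$padded"
```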

There are also special expressions that denote particular character classes:

  • [:alpha:] all alphabetic characters (equivalent to the double range [A-Za-z], except that the latter can be affected by the configured locale)
  • [:digit:] all digits (equivalent to the range [0-9])
  • [:lower:] and [:upper:] lowercase ([a-z]) and uppercase ([A-Z]) characters respectively
  • [:space:] whitespace characters (space, tab, newline)

See the manual for a complete list.

echo '1foo' | grep '[[:digit:]][[:alpha:]]'
1foo
echo '1foo' | grep '[[:digit:]][[:digit:]][[:alpha:]]'
echo '1foo' | grep '[[:digit:]][[:alpha:]][[:alpha:]]'
1foo

Careful: the square brackets are doubled because the outer pair opens the bracket expression, while the inner pair is part of the class name itself (e.g. [:digit:]).

Regular expressions (single characters, bracket expressions, the dot) can be followed by repetition operators, which specify how many times the preceding expression must repeat.

The first repetition operator is the question mark (?), which means the expression preceding it must be absent or present exactly once, for example:

echo forever >> casetest.txt
grep -E '[Ff]?oo' casetest.txt
Foo
foo
aoo
snafooz

As you can see, aoo matches too, because it satisfies the two consecutive o’s even though it lacks the f, whose presence is optional thanks to the question mark. Note that we used the -E option to enable extended regular expressions (ERE).

The second operator is the asterisk (*); it modifies the preceding expression to mean it may be present zero or more times:

grep -E 'sn.*z' casetest.txt
snafooz

Here we combined the wildcard . with the asterisk to express “any character, repeated any number of times”.

The plus operator (+), instead, requires the expression to appear one or more times:

grep -E 'o+' casetest.txt
Foo
foo
aoo
snafooz
forever

As you can see, both forever and foo matched.

There is one more operator that lets you specify the number of repetitions precisely: inside curly braces ({}) you can give an exact count {n}, a range {n,m}, a minimum {n,} or a maximum {,m}. For example, let’s try to find all the IP addresses in /etc/:

grep -E '([[:digit:]]+\.){3}[[:digit:]]+' /etc/*
/etc/hosts:127.0.0.1   localhost localhost.localdomain
/etc/networks:default 0.0.0.0
/etc/networks:loopback 127.0.0.0
/etc/networks:link-local 169.254.0.0

Let’s unpack the regular expression: it looks for one or more digits ([[:digit:]]+) followed by a literal dot (\.); this group (wrapped in parentheses) must repeat three times ({3}) and be followed by another run of one or more digits.

You can also create alternatives between two expressions by separating them with a pipe (|):

grep -E 'Foo|Bar' casetest.txt
Foo
Bar

Adaptec Series 6 – 6G SAS/PCIe 2 (rev 01) [9005:028b] tools

Posting this in the hope of saving some time for fellow admins out there.

I had this on a server:

# lspci | grep -i adaptec
05:00.0 RAID bus controller: Adaptec Series 6 - 6G SAS/PCIe 2 (rev 01)
# lspci -n | grep '05:00'
05:00.0 0104: 9005:028b (rev 01)

It seems impossible to see the model number; not even lshw reports it. Luckily, the tools are the same for all the controllers: you can find them by going to the Microsemi Adaptec Series 6 support page, clicking any controller, clicking the Storage Manager download link and then the Microsemi Adaptec ARCCONF Command Line Utility.

This is the link to the Microsemi Adaptec ARCCONF Command Line Utility Download Page at the moment of this writing.

Once installed, the tool lives in /usr/Arcconf; I created a symlink to it in /usr/bin.
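The symlink step, sketched with a stub binary in a scratch prefix so it runs unprivileged (the real paths on my box were /usr/Arcconf/arcconf and /usr/bin/arcconf, linked as root; your install location may differ):

```shell
# Real command would be: ln -s /usr/Arcconf/arcconf /usr/bin/arcconf (as root)
prefix=$(mktemp -d)
mkdir -p "$prefix/Arcconf" "$prefix/bin"
printf '#!/bin/sh\necho "Controllers found: 1"\n' > "$prefix/Arcconf/arcconf"
chmod +x "$prefix/Arcconf/arcconf"
ln -s "$prefix/Arcconf/arcconf" "$prefix/bin/arcconf"
link_out=$("$prefix/bin/arcconf")
rm -rf "$prefix"
echo "$link_out"
```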

# arcconf getconfig 1
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
   Controller Status                        : Optimal
   Channel description                      : SAS/SATA
   Controller Model                         : Adaptec 6405E
...

There’s more information about how to use the tools on these pages:

Test Driven Infrastructure with Goss

I was looking into tools to help write Ansible playbooks and ended up on the Molecule website. The project looks really interesting, but what caught my attention at first was the option to test your playbooks using one of three frameworks: Goss, Serverspec or Testinfra.

Goss is a very young project, but at the same time it has three features that got me interested:

  • It’s tiny and compact: one go static binary without external dependencies.
  • It’s fast.
  • It seems very easy to use without sacrificing too much power.

I had a few basic checks, created years ago for my servers, so I set out to see how much work it would take to port them from Bash (ewww, I know) to Goss.

I ended up rewriting most of them in less than 3 hours, and now:

  • they are way easier to read
  • I added a lot of functionality
  • and it only took 3 hours, even though I had to deal with a few quirks and some trial and error to bridge the gaps where the documentation was too sparse
    (but it may be better by the time you read this: I sent a pull request to the project with some improvements to the documentation)

Excited yet? Let’s see how to get started. First thing, you’ll want goss.

$ curl -L -o goss https://github.com/aelsabbahy/goss/releases/download/v0.2.4/goss-linux-amd64
$ chmod +x goss
$ ./goss --help
NAME:
   goss - Quick and Easy server validation
[...]

The README on the official website has a very nice “45 seconds introduction” that I recommend checking out if you want a quick idea of what Goss can do.

I’ll start a bit slower and talk you through some considerations I made after a few hours working with it.

Goss files are YAML or JSON files describing the tests you want to run to validate your system. Goss has a handy autoadd feature that automatically creates a few predefined tests for a given resource; let’s start from there:

# ./goss -g httpd.yaml autoadd httpd
Adding Package to 'httpd.yaml':
httpd:
  installed: true
  versions:
  - 2.2.15

Adding Process to 'httpd.yaml':
httpd:
  running: true

Adding Port to 'httpd.yaml':
tcp6:80:
  listening: true
  ip:
  - '::'

Adding Service to 'httpd.yaml':
httpd:
  enabled: true
  running: true


# cat httpd.yaml
package:
  httpd:
    installed: true
    versions:
    - 2.2.15
port:
  tcp6:80:
    listening: true
    ip:
    - '::'
service:
  httpd:
    enabled: true
    running: true
process:
  httpd:
    running: true

So, we already have a barebones test suite to make sure our webserver is up and running: it checks that the package is installed (and at which version), that something is listening on all addresses (tcp6 ::) on port 80, that the httpd service is enabled at boot time and currently running, and that the httpd process is currently listed in the process list.

Please note that “service running” gets its data from upstart/systemd, while “process running” actually checks the process list: if the httpd process is running but the service is not, then something went wrong!

Let’s try to run our basic test suite:

# ./goss -g httpd.yaml validate --format documentation
Process: httpd: running: matches expectation: [true]
Port: tcp6:80: listening: matches expectation: [true]
Port: tcp6:80: ip: matches expectation: [["::"]]
Package: httpd: installed: matches expectation: [true]
Package: httpd: version: matches expectation: [["2.2.15"]]
Service: httpd: enabled: matches expectation: [true]
Service: httpd: running: matches expectation: [true]

Total Duration: 0.015s
Count: 14, Failed: 0, Skipped: 0

All green! Our tests passed.

Now let’s say we want to make sure httpd runs with a ServerLimit of 200 clients. Goss allows us to check a file’s content using powerful regular expressions; for example, we can add this to our httpd.yaml file:

file:
  /etc/httpd/conf/httpd.conf:
    exists: true
    contains:
    - "/^ServerLimit\\s+200$/"

We’re saying that we want a line starting with ServerLimit, followed by some spaces or tabs and then 200 at the end of the line. Let’s run our suite again and see if it works:
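To convince yourself that the pattern tolerates tabs as well as spaces, here is roughly the same check done by hand with grep against a scratch config (\s in the Goss pattern corresponds to [[:space:]] in POSIX grep):

```shell
# The sample line below uses a tab between ServerLimit and 200
conf=$(mktemp)
printf 'ServerName www.example.com\nServerLimit\t200\n' > "$conf"
if grep -Eq '^ServerLimit[[:space:]]+200$' "$conf"; then
    limit_ok=yes
else
    limit_ok=no
fi
rm -f "$conf"
echo "$limit_ok"
```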

# ./goss -g httpd.yaml validate --format documentation
File: /etc/httpd/conf/httpd.conf: exists: matches expectation: [true]
File: /etc/httpd/conf/httpd.conf: contains: matches expectation: [/^ServerLimit\s+200$/]
[...]
Count: 18, Failed: 0, Skipped: 0

All green again! Our server looks in good shape. Let’s add another check, this time we want to make sure the DocumentRoot directory exists. We add another check to the list:

file:
  /etc/httpd/conf/httpd.conf:
    exists: true

service:
  httpd:
    enabled: true
    running: true

file:
  /var/www/html:
    filetype: directory
    exists: true

But if we run this suite, we’ll notice that our previous check on httpd.conf no longer runs. The reason is that the goss file describes a nested data structure, so the second file entry overwrites the first, and you’ll end up scratching your head, wondering why your first test hasn’t been run.
In JSON this would have been more obvious:

{
  "file": {
    "/etc/httpd/conf/httpd.conf": {
      "exists": true
    }
  },

  "service": {
    "httpd": {
      "enabled": true,
      "running": true
    }
  },

  "file": {
    "/var/www/html": {
      "filetype": "directory",
      "exists": true
    }
  }
}

See how the second file entry overwrites the first one? Keep that in mind!
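You can check the last-one-wins behavior directly; here python3 is used only as a convenient JSON parser to show which entry survives:

```shell
# Parse a document with a duplicated "file" key and list what survives
survivor=$(python3 - <<'EOF'
import json
doc = '{"file": {"/etc/httpd/conf/httpd.conf": {}}, "file": {"/var/www/html": {}}}'
print(list(json.loads(doc)["file"]))
EOF
)
echo "$survivor"
```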

Since you’ll probably want to keep your tests in different files, let’s talk quickly about how to manage that. For example, let’s create a new file to monitor a fileserver mount:

# ./goss -g fileserver.yaml add mount /mnt/nfs
Adding Mount to 'fileserver.yaml'
[...]
# cat fileserver.yaml
mount:
  /mnt/nfs:
    exists: true
    opts:
    - rw
    - nodev
    - noexec
    - relatime
    source: vip-nfs.stardata.lan:/data/nfs
    filesystem: nfs

If we want to check both fileserver.yaml and httpd.yaml at the same time, we’ll need to use the gossfile directive creating a new file that includes the other two:

# ./goss -g all.yaml add goss httpd.yaml
# ./goss -g all.yaml add goss fileserver.yaml
# cat all.yaml
gossfile:
  fileserver.yaml: {}
  httpd.yaml: {}

# ./goss -g all.yaml validate
.............

Total Duration: 0.016s
Count: 13, Failed: 0, Skipped: 0

If we want to get a single file containing all the tests, we can use the render command:

# ./goss -g all.yaml render
file:
  /etc/httpd/conf/httpd.conf:
    exists: true
    contains:
    - /^ServerLimit\s+200$/
package:
  httpd:
    installed: true
    versions:
    - 2.2.15
port:
  tcp6:80:
    listening: true
    ip:
    - '::'
service:
  httpd:
    enabled: true
    running: true
process:
  httpd:
    running: true
mount:
  /mnt/nfs:
    exists: true
    opts:
    - rw
    - nodev
    - noexec
    - relatime
    source: vip-nfs.stardata.lan:/data/nfs
    filesystem: nfs

This way we can easily distribute the test suite since it’s a single file.

I hope to have sparked some interest in this tool. It’s still very basic (it doesn’t support variables or loops yet, for example), but it’s great for quickly writing some tests to make sure your servers are configured and working as intended!