How to find out the PCI address for a given ethernet device

This is something I had been looking for for ages, since it can be useful when automating deployments on bare metal.

When building bond interfaces, you really want to bond devices on different hardware paths, for example one embedded ethernet port and one on a PCI expansion card.

On recent Linux systems (CentOS 6.x), you can use udevadm info to find out the PCI address for your interfaces:

# for E in eth{0..3} ; do echo $E ; udevadm info --query=path --path=/sys/class/net/${E} ; done
eth0
/devices/pci0000:00/0000:00:01.1/0000:02:00.0/net/eth0
eth1
/devices/pci0000:00/0000:00:01.1/0000:02:00.1/net/eth1
eth2
/devices/pci0000:00/0000:00:02.0/0000:04:00.0/net/eth2
eth3
/devices/pci0000:00/0000:00:02.0/0000:04:00.1/net/eth3

So on this system I want to bond eth0+eth2 and eth1+eth3.

For reference, I found the original info on this (outdated) Linux Questions post.
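If you only need the PCI bus address itself, a minimal sketch (assuming the sysfs layout shown above) is to resolve each interface's `device` symlink:

```shell
# Resolve each interface's sysfs "device" symlink to its PCI address.
for E in /sys/class/net/*; do
  dev=$(basename "$E")
  # The link target ends in the PCI address, e.g. .../0000:02:00.0
  link=$(readlink -f "$E/device") || continue
  echo "$dev $(basename "$link")"
done
```

For physical NICs, `ethtool -i ethX` reports the same address in its `bus-info` field.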


How to keep your XHgui (mongo) database in check

We have XHgui installed on a client’s VM. It profiles 1 in 100 requests, but the MongoDB database size grew rapidly, so we started looking into how to keep it in check.

First thing is to gather some data about our database (and collection):

# mongo
MongoDB shell version: 2.6.5
connecting to: test
> show dbs
local   0.078GB
xhprof  3.952GB

> use xhprof
switched to db xhprof

> show collections
results
system.indexes

> db.results.find().count()
48417

We’re dealing with about 48k records, but many of them are useless: profiles of scripts that are already very fast and unworthy of the devs’ attention. So the first thing we wanted was to remove most of the “noise” from our collection.

The document has this format:

> db.results.findOne()
{
	"_id" : ObjectId("547ef32ec2f6cd9253ec1053"),
	"profile" : {
		[...]
		"main()==>{closure}" : {
			"ct" : NumberLong(1),
			"wt" : NumberLong(27),
			"cpu" : NumberLong(0),
			"mu" : NumberLong(2952),
			"pmu" : NumberLong(0)
		},
		"main()" : {
			"ct" : NumberLong(1),
			"wt" : NumberLong(247),
			"cpu" : NumberLong(0),
			"mu" : NumberLong(8232),
			"pmu" : NumberLong(4880)
		}
	},
	[...]
}

So we want to filter by main() wt (wall time):

> db.results.find().count()
48417
> db.results.find({"profile.main().wt" : {$lt: 1000000}}).count()
48129
> db.results.find({"profile.main().wt" : {$lt: 500000}}).count()
47185
> db.results.find({"profile.main().wt" : {$lt: 100000}}).count()
41694

As you can see, the vast majority of the results are below 100 milliseconds of wall time (XHProf reports wt in microseconds). Since we’re not interested in that many results and we want to trim the database a bit, we made a backup and then performed a remove:

> db.results.remove({"profile.main().wt" : {$lt: 500000}})
WriteResult({ "nRemoved" : 47234 })
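To keep the collection from growing back, one option (a sketch, not something we ran) is a TTL index, so Mongo expires old profiles automatically. This assumes your documents carry a BSON Date field; the field name `meta.request_date` below is hypothetical, check your own documents with findOne():

```shell
# Hypothetical: expire profiles older than 30 days (30*24*3600 = 2592000 seconds).
# The field name "meta.request_date" is an assumption -- it must be a BSON Date.
echo 'db.results.ensureIndex({"meta.request_date" : 1}, {expireAfterSeconds : 2592000})' | mongo xhprof
```

Note that TTL deletion removes documents but, as with any remove, does not shrink the data files on disk.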

At this point, the database stats did look a lot more interesting:

> db.stats()
{
	"db" : "xhprof",
	"collections" : 3,
	"objects" : 1846,
	"avgObjSize" : 65844.85373781149,
	"dataSize" : 121549600,
	"storageSize" : 3583025152,
	"numExtents" : 24,
	"indexes" : 6,
	"indexSize" : 703136,
	"fileSize" : 4226809856,
	"nsSizeMB" : 16,
	"dataFileVersion" : {
		"major" : 4,
		"minor" : 5
	},
	"extentFreeList" : {
		"num" : 0,
		"totalSize" : 0
	},
	"ok" : 1
}

The dataSize parameter is the size of the actual data. The storageSize parameter is the space MongoDB allocated for the data, while the fileSize is the on-disk size of the database (including indexes, etc).

As you can learn from the MongoDB Storage FAQ, Mongo preallocates data files in increasing sizes: the first file is 64 MB, the second 128 MB, and so on, doubling up to 2 GB. Every file after that is 2 GB, as you can see in this output:

[root@ws ~]# ls -lh /var/lib/mongo/
total 4.1G
drwxr-xr-x 2 mongod mongod 4.0K Dec  3 12:19 journal
-rw------- 1 mongod mongod  64M Nov 23 22:44 local.0
-rw------- 1 mongod mongod  16M Nov 23 22:44 local.ns
-rwxr-xr-x 1 mongod mongod    6 Nov 23 22:44 mongod.lock
drwxr-xr-x 2 mongod mongod 4.0K Nov 29 23:12 _tmp
-rw------- 1 mongod mongod  64M Dec  3 12:38 xhprof.0
-rw------- 1 mongod mongod 128M Dec  3 12:38 xhprof.1
-rw------- 1 mongod mongod 256M Dec  3 12:38 xhprof.2
-rw------- 1 mongod mongod 512M Dec  3 12:38 xhprof.3
-rw------- 1 mongod mongod 1.0G Dec  3 12:38 xhprof.4
-rw------- 1 mongod mongod 2.0G Dec  3 12:38 xhprof.5
-rw------- 1 mongod mongod  16M Dec  3 12:37 xhprof.ns
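The doubling sequence accounts for the ~4.1G total above:

```shell
# Preallocated xhprof data files (in MB), plus the 16 MB .ns namespace file:
echo $(( 64 + 128 + 256 + 512 + 1024 + 2048 + 16 ))   # prints 4048 (MB), ~4 GB
```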

To reclaim the space used by empty extents, you need to perform a db repair, or drop the database altogether and rebuild it from a backup.

WARNING: the db repair operation is BLOCKING, and it requires double the amount of space taken by the files on disk (fileSize: in my case, about 4 GB).
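Before kicking off the repair, it’s worth checking that the partition actually has that much room. A quick sketch (the data path and the fileSize value are from this system, adjust for yours):

```shell
# fileSize reported by db.stats(), in bytes
need=4226809856
# Free bytes on the partition holding the data files (1024-byte POSIX blocks * 1024)
avail=$(( $(df -Pk /var/lib/mongo | awk 'NR==2 {print $4}') * 1024 ))
if [ "$avail" -ge "$need" ]; then
  echo "enough free space for repair"
else
  echo "NOT enough free space for repair"
fi
```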

After removing the records and performing the repair, we went from 4 GB down to about 200 MB:

> db.repairDatabase()
{ "ok" : 1 }
> db.stats()
{
	"db" : "xhprof",
	"collections" : 3,
	"objects" : 2079,
	"avgObjSize" : 65585.81625781626,
	"dataSize" : 136352912,
	"storageSize" : 167763968,
	"numExtents" : 13,
	"indexes" : 6,
	"indexSize" : 490560,
	"fileSize" : 201326592,
	"nsSizeMB" : 16,
	"dataFileVersion" : {
		"major" : 4,
		"minor" : 5
	},
	"extentFreeList" : {
		"num" : 0,
		"totalSize" : 0
	},
	"ok" : 1
}

A different approach is to back up the database, drop it altogether, and restore it from the backup. WARNING: this one requires downtime, too!

# echo "db.results.remove({\"profile.main().wt\" : {\$lt: 500000}})" | mongo xhprof
# mongodump -d xhprof
# echo 'db.dropDatabase()' | mongo xhprof
# sync
# mongorestore dump/xhprof

Note: MongoDB has a compact command for collections, used to defragment data (doc). This command does NOT free up disk space (the fileSize stays the same):

> db.runCommand({ compact : 'results' })
{ "ok" : 1 }
> db.stats()
{
	"db" : "xhprof",
	"collections" : 3,
	"objects" : 2079,
	"avgObjSize" : 65585.81625781626,
	"dataSize" : 136352912,
	"storageSize" : 167759872,
	"numExtents" : 13,
	"indexes" : 6,
	"indexSize" : 490560,
	"fileSize" : 4226809856,
	"nsSizeMB" : 16,
	"dataFileVersion" : {
		"major" : 4,
		"minor" : 5
	},
	"extentFreeList" : {
		"num" : 24,
		"totalSize" : 3438862336
	},
	"ok" : 1
}