Ganeti: Segregate VMs from Running on the Same Hardware
Link: https://dannyman.toldme.com/2017/02/16/ganeti-exclusion-tags/
We have been using this great VM management software called Ganeti. It was developed at Google and I love it for the following reasons:
- It is, in essence, a collection of discrete, well-documented command-line utilities
- It manages your VM infrastructure for you, in terms of what VMs to place where
- Killer feature: your VMs can all run as network-based RAID1s, with each disk mirrored on two nodes for rapid migration and failover without the need for an expensive, highly available filer
- Good tech support via the mailing list
It is frustrating that relatively few people know about and use Ganeti, especially in Silicon Valley.
I had an itch to scratch. At the recent Ganeti Conference I heard tell that one could use tags to tell Ganeti to keep instances from running on the same node. This is another excellent feature: if you have two or more web servers, for example, you don’t want them to end up getting migrated to the same hardware.
Unfortunately, the documentation is a little opaque, so I posted to the Ganeti mailing list and got the clues lined up.
First, you set a cluster exclusion tag, like so:
sudo gnt-cluster add-tags htools:iextags:role
This says “set up an exclusion tag, called role.” From then on, any instance tag that starts with role: is treated as an exclusion tag.
Then, when you create your instances, you add, for example: --tags role:prod-www
The instances created with the tag role:prod-www will be segregated onto different hardware nodes.
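The --tags flag covers new instances. For instances that already exist, a tag can be attached afterwards with gnt-instance add-tags. Here is a minimal sketch, where tag_role is a hypothetical helper name of my own and not part of Ganeti:

```shell
#!/bin/sh
# Sketch: attach a role:<name> exclusion tag to existing instances.
# tag_role is a hypothetical helper, not a Ganeti command.
tag_role() {
    role="$1"
    shift
    for instance in "$@"; do
        # gnt-instance add-tags attaches one or more tags to an instance
        sudo gnt-instance add-tags "$instance" "role:$role" || return 1
    done
}
```

Usage would look like tag_role prod-www www1 www2, where www1 and www2 are made-up instance names. As far as I understand, tagging alone does not move a running instance; the tag is only taken into account the next time instances are allocated or the cluster is rebalanced.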
I did some testing to figure this out. First, as a control, create a bunch of small test instances:
sudo gnt-instance add ... ganeti-test0
sudo gnt-instance add ... ganeti-test1
sudo gnt-instance add ... ganeti-test2
sudo gnt-instance add ... ganeti-test3
sudo gnt-instance add ... ganeti-test4
Results:
$ sudo gnt-instance list | grep ganeti-test
ganeti-test0 kvm snf-image+default ganeti06-29 running 1.0G
ganeti-test1 kvm snf-image+default ganeti06-29 running 1.0G
ganeti-test2 kvm snf-image+default ganeti06-09 running 1.0G
ganeti-test3 kvm snf-image+default ganeti06-32 running 1.0G
ganeti-test4 kvm snf-image+default ganeti06-24 running 1.0G
As expected, there is some overlap in service nodes: ganeti-test0 and ganeti-test1 both landed on ganeti06-29.
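Spotting overlap by eye gets harder with more than a handful of instances. Here is a minimal sketch of a filter that reads gnt-instance list rows on stdin and prints every node hosting more than one of them. It assumes the default column layout shown in this post, with the primary node in the fourth column, and shared_nodes is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: report nodes that host more than one of the listed instances.
# Assumes the primary node is in column 4, as in the listings in this post.
# shared_nodes is a hypothetical helper, not a Ganeti command.
shared_nodes() {
    awk '{
        count[$4]++                      # instances seen on this node
        names[$4] = names[$4] " " $1     # remember which instances
    }
    END {
        for (node in count)
            if (count[node] > 1)
                print node ":" names[node]
    }' | sort
}
```

Run it as sudo gnt-instance list | grep ganeti-test | shared_nodes. No output means every listed instance has a node to itself.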
Next, delete the test instances, set a cluster exclusion tag for “role” and re-create the instances:
sudo gnt-cluster add-tags htools:iextags:role
sudo gnt-instance add ... --tags role:ganeti-test ganeti-test0
sudo gnt-instance add ... --tags role:ganeti-test ganeti-test1
sudo gnt-instance add ... --tags role:ganeti-test ganeti-test2
sudo gnt-instance add ... --tags role:ganeti-test ganeti-test3
sudo gnt-instance add ... --tags role:ganeti-test ganeti-test4
Results?
$ sudo gnt-instance list | grep ganeti-test
ganeti-test0 kvm snf-image+default ganeti06-29 running 1.0G
ganeti-test1 kvm snf-image+default ganeti06-09 running 1.0G
ganeti-test2 kvm snf-image+default ganeti06-32 running 1.0G
ganeti-test3 kvm snf-image+default ganeti06-24 running 1.0G
ganeti-test4 kvm snf-image+default ganeti06-23 running 1.0G
Yay! The instances are allocated to five distinct nodes!
But am I sure I understand what I am doing? Nuke the instances and try another example: 2x “www” instances and 3x “app” instances:
sudo gnt-instance add ... --tags role:prod-www ganeti-test0
sudo gnt-instance add ... --tags role:prod-www ganeti-test1
sudo gnt-instance add ... --tags role:prod-app ganeti-test2
sudo gnt-instance add ... --tags role:prod-app ganeti-test3
sudo gnt-instance add ... --tags role:prod-app ganeti-test4
What do we get?
$ sudo gnt-instance list | grep ganeti-test
ganeti-test0 kvm snf-image+default ganeti06-29 running 1.0G # prod-www
ganeti-test1 kvm snf-image+default ganeti06-09 running 1.0G # prod-www
ganeti-test2 kvm snf-image+default ganeti06-29 running 1.0G # prod-app
ganeti-test3 kvm snf-image+default ganeti06-32 running 1.0G # prod-app
ganeti-test4 kvm snf-image+default ganeti06-24 running 1.0G # prod-app
Yes! The first two instances are allocated to different nodes; then, when the tag changes to prod-app, Ganeti goes back to ganeti06-29 to allocate an instance. Exclusion only applies between instances that carry the same tag, so a prod-www and a prod-app instance are free to share a node.
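To check this per role rather than per node, here is a minimal sketch that flags only the cases where two instances with the same role: tag share a primary node. It expects three whitespace-separated columns on stdin: instance name, primary node, and a comma-separated tag list. My assumption is that sudo gnt-instance list --no-headers -o name,pnode,tags produces that shape; role_collisions is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: report nodes that host two or more instances with the same role: tag.
# Input columns (assumed): name, primary node, comma-separated tags.
# role_collisions is a hypothetical helper, not a Ganeti command.
role_collisions() {
    awk '{
        n = split($3, tags, ",")
        for (i = 1; i <= n; i++)
            if (tags[i] ~ /^role:/) {
                key = tags[i] " " $2           # one bucket per (tag, node)
                count[key]++
                names[key] = names[key] " " $1
            }
    }
    END {
        for (key in count)
            if (count[key] > 1)
                print key ":" names[key]
    }' | sort
}
```

Fed the five test instances above, it should print nothing: ganeti06-29 hosts two instances, but one is prod-www and the other is prod-app, so no single role is doubled up on a node.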