One of the things I’m fortunate to have access to at MTI Technology is the Solution Centre which has all sorts of kit that can be used for demos and for consultants to play around with.
After coming back from VMworld, one of the things I really wanted to test out was how easy it would be to deploy VSAN 6.1 in a ROBO solution. Fortunately I had a pair of old Dell R810s lying around and managed to cobble together enough disks and a pair of SSDs in order to create two VSAN nodes!
VSAN ROBO allows you to deploy a 2-node VSAN cluster (rather than the standard 3-nodes) with a Witness Server located on another site – usually this would be your primary data centre (as per diagram below). It also allows several ROBO deployments to be managed from a single vCenter Server. VSAN ROBO uses the same concepts as VSAN Stretched Cluster, using Fault Domains to determine how data is distributed across the VSAN nodes. The Witness Server is uniquely designed with the sole purpose of providing cluster quorum services during failure events and to store witness objects and cluster metadata information and in so doing eliminates the requirement of the 3rd physical VSAN node.
Note: Whenever you deploy any VMware product into a production environment, make sure that you check the Hardware Compatibility List!
In my case for VSAN, neither the server nor the storage controller in the R810 was supported – but as it was only a demo environment it wasn’t of top priority.
Before I go through how I configured VSAN ROBO, there are a few things I need to state upfront which I don’t recommend you doing in a production environment:
- Using the same subnet for the VSAN network – in my demo environment I only have 1 subnet, so I’ve had to stick everything on the same VLAN. Ideally you should separate out the VSAN traffic away from the Mgmt and VM traffic.
- Using a SSD from a desktop PC for the Cache drive – ideally this should be an enterprise grade SSD as VSAN uses the SSD for caching so you really need one that has a higher endurance rate.
Also there are a few features that are not supported in the ROBO solution (but available in standard VSAN):
- SMP-FT support
- Max value for NumberOfFailuresToTolerate is 1
- Limit of 3 for the number of Fault Domains (2 physical nodes and the witness server).
- All Flash VSAN.
- VSAN ROBO licensing is purchased in packs of 25 VMs, with 1 license per site. This means a maximum of 25 VMs can be licensed per site! However, 1 pack can be used across multiple ROBO sites (so 25 VMs across 5 sites).
From a configuration perspective, configuring a VSAN Cluster for ROBO is extremely simple as it is performed through a wizard within the vSphere Web Client. From a network perspective, the two VSAN Cluster nodes are to be configured over a single layer 2 network with multicast enabled. There are a few requirements for the Witness Server:
- 1.5 Mbps connectivity between nodes and witness
- 500 milliseconds latency RTT
- Layer 3 network connectivity without multicast to the nodes in the cluster
So for my demo environment, I have 2x R810s with 1x Intel Xeon X6550 and 32GB RAM. For my SSD I found an old 240GB Micron M500 SSD (MLC NAND flash) and stuck it into a Dell HD caddy, for my HDDs I have 5x 146GB SAS drives. The Witness server resides within my main VMware environment (which runs on UCS blades and a VNX5200).
I won’t go into how I installed vSphere ESXi 6.0 u1….. however, just remember that you’ll need to install ESXi onto an SD or USB drive as you want to use all local drives for VSAN (in my case I installed ESXi onto a 8GB USB drive).
I created a new VMware cluster within my vCenter and added the 2 VSAN nodes. I then deployed the Witness Server, which in my case was the nested ESXi host within a virtual appliance. There’s actually 3 sizes for the Witness Appliance – Tiny, Medium, Large. I deployed a Medium appliance.
Once it’s deployed and configured with the relevant IP address and hostname, you can add the Witness server into your vCenter Server as just another ESXi host.
One thing that’s slightly different is the Witness Server comes with its own vSphere license and so doesn’t consume one of your own licenses. Note that the license key is censored so you can’t use it elsewhere!
This occurs because the nestled ESXi host does not have any VMFS datastores configured, the warning can be ignored, but if you’re like me and hate the exclamation mark warnings in your environment you can easily get rid of the warning by adding a small 2GB disk to the witness appliance VM (Editing the Hardware settings) and then adding a datastore on top of the new disk.
You should be able to notice that the icon for the witness appliance within the vCenter Server inventory is slightly different from your physical hosts – it’s shaded light blue to differentiate it from standard ESXi hosts.
The next step is to configure the VSAN network on the witness server. There is already a portgroup pre-defined called witnessPg. Do note remove this port group as it has special modifications to make the MAC addresses on the network adapters match the nested ESXi MAC addresses!
There should be a VMkernel port already configured in the portgroup, edit the port and tag it for VSAN traffic.
At this point, ensure that your witness server can talk to the VSAN nodes.
Note: Typically an ESXi host has a default TCP/IP stack and as a result only has a single default gateway – more often than not, this default route is associated with the management network TCP/IP stack. In a normal deployment scenario, the VLAN for the management network would be isolated from the VSAN network, as such there is no path between the two networks and no default gateway on the VSAN network. A way around this problem is to use static routes to define a routing entry which indicates which path should be used for traffic between the witness server and the VSAN nodes. I won’t go into configuring static routes, you can find more detailed information in the VSAN 6.1 Stretched Cluster Guide.
Once your witness server is talking to the VSAN nodes, it’s time to configure the VSAN ROBO solution. This is as simple as creating fault domains.
I won’t go into how to turn on the VSAN cluster and disk management as this is simple stuff and has been covered off in numerous other VSAN blogs/guides. One thing I will mention is that because I have 2 very old servers, I had to configure each individual disk as a RAID-0 set as the RAID controller in the server did not support pass-through. Once configured and detected by the ESXi host as storage devices, I then had to manual set the SSD device as a Flash Disk:
I also ended up manually claiming the disks for VSAN.
Once the 2 nodes have been configured for VSAN, next comes the creation of the Fault Domains. As previously mentioned, VSAN ROBO works by creating 2 Fault Domains and a witness server – just like you would for a VSAN stretched cluster. However, in this case only 1 server is assigned to each fault domain.
Note: You probably have noticed that the wizard still states “VSAN Stretched Cluster” on all the screens, unfortunately VMware didn’t write separate code for VSAN ROBO, so it’s still classed as a stretched cluster.
Once VSAN ROBO has been deployed you can check the health of the VSAN by selecting the cluster and Monitor->Health.
The first warning is regarding the VSAN HCL, and shows that my server and its RAID controllers are not listed in VMwares’ VSAN HCL. =)
There is already a default VSAN storage Policy, creating a VM and assigning this policy gives a Failure To Tolerate of 1. Viewing the Physical Disk Placement you can see that data is mirrored on the 2 VSAN nodes with metadata stored on the Witness Server.
Something I found very useful was the “Proactive Tests” option for VSAN which provides the ability to perform a real time test of cluster functionalities and dependencies – creating a small VM, checking network multicast between hosts plus storage IO.
Voila…. a basic VSAN ROBO deployment…..
Don’t forget to download the Storage Management Pack for vROps so you can get an in-depth view of your VSAN deployment from within vROps: