CloudHealth by VMware – Reporting & Policies

Well, this blog post has been 3 months in the works. In fact, a draft has been sitting in WordPress since September, when I collaborated with Kim Bottu on his article about Integrating CloudHealth and vROps. The plan was for me to write a companion blog which showcased the same capabilities in CloudHealth that he mentioned in his vROps blog.

2020 has been a profoundly difficult and odd year for everyone, and I found myself not wanting to do anything after a busy day of Zoom meetings and home schooling. The motivation to write a blog just wasn’t there; all I wanted to do in the evenings was chill and relax.
There’s been a fine line in everyone’s work-life balance this year, and everyone needs to find that little bit of time each day to just shut off and unwind (usually when the kids are in bed)!

Anyways, the Christmas holidays and having time off work has given me the opportunity to sit back down and finish the blog (plus Kim was saying I should publish it in order to help my vExpert application for 2021… hahahahha… lol…. – btw, you have till the 9th January 2021 to submit!)

What is CloudHealth?

I guess the best place to start this blog is to give a quick overview of what CloudHealth actually is, so here’s the elevator pitch I always give….

“The more organisations invest in public cloud, the more important it is to have a cloud management strategy for their success, and this is where CloudHealth can assist.
CloudHealth is a multi-cloud management platform designed to provide full visibility into your cloud environment – helping you to identify opportunities for cost savings and usage optimisation. We help you to easily analyse and control cloud costs, security, performance and governance all from one single platform.
We give you insight into your data centre, hybrid and public cloud spend – aligning costs and usage to users, lines of business or even projects and business initiatives. We help make cloud management simple.”

Sooooo, what does that actually mean I hear you ask!?!

In a nutshell, CloudHealth takes your cloud billing and usage data, processes it, and presents it in reports that help you visualise your costs and usage. In addition, one of their USPs is the ability to create Perspectives to help you categorise and filter your data.

Currently CloudHealth supports Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), Oracle Cloud Infrastructure (OCI) and on-premise VMware environments. They also have a beta-program for VMware Cloud on AWS support.

CloudHealth is the clear leader in multi-cloud management. They’re the largest player in the market, with 10,000+ customers and 230+ partners globally, managing over $11bn in annual cloud spend.
CloudHealth has continued to be named a Leader in The Forrester Wave: Cloud Cost Management and Optimization Report.

What is a Perspective and How are They Used in Reports?

The most common way to describe CloudHealth Perspectives is as “lenses” through which you view your infrastructure. Each role within an organisation measures and evaluates the business from a different viewpoint or ‘Perspective’.
You can create Perspectives to view and group cloud assets together in order to align them with business objectives.
They provide a framework for categorising all the assets within your cloud infrastructure. For example, you could create a Perspective to group assets into Environment, Application, Department, Function, Project, or even Cost Centre.
You can build Perspectives dynamically using cloud tags or statically using the search capabilities.
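
To make the tag-driven approach a bit more concrete, here’s a minimal Python sketch of how a dynamic Perspective could group assets by the value of a tag. It’s purely illustrative and is not CloudHealth’s implementation or API: the asset records, the `environment` tag key, the `Unallocated` catch-all group and the `build_perspective` helper are all assumptions made up for this example.

```python
from collections import defaultdict


def build_perspective(assets, tag_key, default_group="Unallocated"):
    """Group asset names by the value of one tag (hypothetical helper)."""
    groups = defaultdict(list)
    for asset in assets:
        # Assets that are missing the tag fall into a catch-all group.
        group = asset.get("tags", {}).get(tag_key, default_group)
        groups[group].append(asset["name"])
    return dict(groups)


# Made-up sample data, for illustration only.
assets = [
    {"name": "web-01", "tags": {"environment": "Production"}},
    {"name": "web-02", "tags": {"environment": "Development"}},
    {"name": "db-01", "tags": {"environment": "Production"}},
    {"name": "scratch-01", "tags": {}},
]

perspective = build_perspective(assets, "environment")
```

Each key in `perspective` plays the role of a Perspective group (for example Production), which is the kind of grouping you would then filter or categorise a report by.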

For example, the default view of a Cost History Report within CloudHealth is to show 13 months of cost data categorised by Service type (this is an example of an AWS report):

We can then take that default view and change the categorisation to a Perspective built to show Owners (this could help identify those users who spend all the company’s money on cloud!):

Or we can even change the view to categorise by a Perspective built to show Environment (IT Operation Managers are constantly looking for ways to show how much different Infrastructure Environments cost the business):

Finally, we can combine a number of Perspectives together to drill down further into our costs. In this example we’re filtering to look at just the Production Environment Perspective group, and categorising by the Owner Perspective (so helping to identify who spends the most in Production!):

Chart Types for a report can also be changed from Bar to Line – in this example we’re looking at the Cost History Report categorised by the Perspective ‘Line of Business’:

Another great Chart Type to use is the Pie Chart – as this is only 2 dimensional you will need to filter to a specific time period (eg. November 2020) and change the X-axis away from time interval (in this example I’ve used the ‘Line of Business’ Perspective):

Using CloudHealth to Generate Alerts

Now the basics of Reporting and Perspectives are out of the way…. Let’s take a look at replicating within CloudHealth what Kim configured in vROps.

In Kim’s blog, he looked at how vROps can be configured to generate alerts based on Month to Date Cloud Spend for certain assets.
We’ll take a look at how the Policy Engine works in CloudHealth to generate Alerts, and the actions that can be taken by a Policy.

A policy, at its most basic, is a set of rules that allows you to govern various aspects of your cloud infrastructure, such as cost, availability, security, performance, and usage.
The Policy Engine in CloudHealth is pretty powerful and it’s not just used to track cloud spend. For example:

  • you can track the launch of new resources
  • you can identify and terminate unused or underutilized assets
  • you can track unexpected cost spikes
  • you can track changes across the cloud infrastructure
  • you can identify resources that have been created out of compliance with specific rules (ie region location, OS type, etc)

At the core of each policy is a rule, which monitors for one or more conditions and, optionally, responds with an action. Actions could be to send an email to notify that a policy has been triggered, or to power off an EC2 instance or VM.

Creating a Policy to alert on Month to Date (MTD) Cloud Spend

One of the most common policies created by CloudHealth customers is a policy to identify increasing cloud costs over a set time period. When overall costs in your cloud environment increase suddenly, it could be an indicator of a larger problem – for example, a compromised cloud account where attackers have spun up a large number of EC2 instances and VMs.

You can create a policy that alerts someone via email whenever the Total Cost of your cloud bill increases by more than a certain percentage:

Or even by a fixed amount:

You even have the granularity to set the conditions to focus on a single Account (in this example ‘Test account name’):

Whilst these examples have a time interval of 1 day, this can be changed to 1 week or 1 month to suit your requirements.
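
If it helps to think about what such a rule is actually checking, here’s a rough Python sketch of the two conditions described above (a percentage increase or a fixed-amount increase between two time intervals). This is only my illustration of the logic, not how the CloudHealth Policy Engine is implemented; the `cost_alert` function and its parameter names are hypothetical.

```python
def cost_alert(previous_cost, current_cost, pct_threshold=None, amount_threshold=None):
    """Return True when the cost increase breaches either threshold.

    Hypothetical helper: thresholds left as None are simply not evaluated.
    """
    increase = current_cost - previous_cost
    # Fixed-amount condition, e.g. "increased by more than 500".
    if amount_threshold is not None and increase > amount_threshold:
        return True
    # Percentage condition, e.g. "increased by more than 10%".
    if pct_threshold is not None and previous_cost > 0:
        if increase / previous_cost * 100 > pct_threshold:
            return True
    return False
```

For example, `cost_alert(1000, 1150, pct_threshold=10)` would trigger because the spend went up by 15%, whereas `cost_alert(1000, 1050, pct_threshold=10)` would not. In a real policy the “alert” would then be whatever action you attached, such as sending an email.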

Most Policies allow you to filter the rule condition to focus on a specific account (eg. Test Account name), a specific service/asset type (eg. EC2 Compute), a specific Region, or even by a Perspective you’ve created (eg. Environment = Production):

Alternatively, you can create a policy for a specific resource type you may want to focus on. In the following example we’re just looking at EC2 Instances and want to be alerted if the total cost increases by 10% over 1 month. We could then take a number of different actions – email, delete EC2 instance, stop EC2 instance, etc:

CloudHealth vs vRealize Operations

Having used both CloudHealth and vROps, I would say it’s far easier to create reports, policies and alerts within CloudHealth compared to vROps – but I might be a little biased here… =)

The Cost and Usage reports are far better in CloudHealth – the added feature of being able to use filters, categorisations and Perspectives to change the viewpoint of the report visualisation is something that sets us apart from other tools! Not to mention that the visualisation changes instantly; there’s no need to wait for the graphical data to be reprocessed and rebuilt.
Within CloudHealth you also have far greater granularity to customise the policy conditions by using the filter capabilities.

One thing I constantly get asked is whether CloudHealth and vRealize overlap each other and perform the same functions.
They’re actually complementary management solutions as they are two different products providing information for different use cases within an organisation!

vRealize offers operational efficiency and automation and CloudHealth brings collaboration, governance, and optimization. 

  • vRealize focuses on driving efficient operations (i.e., provisioning, troubleshooting, capacity planning, automation) in the private and hybrid clouds, providing consistent infrastructure and operations from the data center to the cloud.
  • CloudHealth focuses on driving improved business outcomes (i.e., governance, optimization, visibility, chargeback) in the public and hybrid clouds, breaking down public cloud silos and streamlining cost, compliance and analytics operations.

It’s also worth noting that the starting point for the journey to multi-cloud can be either the enterprise data centre or the public cloud, depending on whether an enterprise is looking to expand its data centre into public cloud or vice versa.

In the data centre, infrastructure/operation teams require tools for configuration, provisioning, automation, capacity planning and governance for all their data centre assets (ie Day 2 operations). It’s also very Capex-intensive and costs are somewhat stable and predictable. This is the perfect scenario for vRealize.

In the public and multi-cloud world, developers and lines-of-business users provision resources directly themselves. It’s very Opex-intensive and resource usage can be dynamic and unpredictable. The management disciplines needed for cloud-centric, de-centralized IT include ways to govern usage, optimise costs and deal with cloud security threats and vulnerabilities. This is where CloudHealth comes into the fore.

For example, vRealize can be used to help perform capacity planning assessments and ‘What If’ scenario modelling. CloudHealth can be used to model the cost of migrations from private to public cloud.

Anyways, I’ve realised that this has been a super long post so I’m going to end here. I hope it’s been useful reading…. I’m also hoping that I’ll get the chance to blog more often on CloudHealth and its features in the coming year! =)

For now, I hope you all have a Happy New Year! Let’s pray that 2021 will bring back some normality to the world!

MTI Secure Hyper-Converged Infrastructure Webinar & Guide

Back end of February I presented a webinar with my colleague, Andrew Tang, around Key Challenges and Considerations for Securing Hyper-Converged Infrastructure.

The webinar has been uploaded for public consumption by the marketing team at MTI Technology.

As I mentioned previously in my blog, I don’t really touch upon product in this webinar as the last thing customers want is to be shoehorned into a certain vendor product… instead I hope the webinar gives enough information about what HCI is in general, why customers should be looking at HCI during their next infrastructure refresh, and more importantly what to consider when evaluating a HCI solution!

You can access the webinar recording here: https://mti.com/secure-hci-webinar-page/ (sorry, you have to fill in your details to gain access….)

Marketing has also finally released the HCI guide that both Andrew and myself put together around HCI, feel free to download that here: https://bit.ly/2qMY6qJ

Finally, if you’re interested in talking more about HCI then feel free to contact me or register for one of MTI’s HCI Discovery Workshops: https://bit.ly/2vQO3Gb

Spectre & Meltdown Update

So it seems that the microcode patches released by VMware associated with their recent Security Advisory (VMSA-2018-0004) have been pulled….
https://kb.vmware.com/s/article/52345
So that’s ESXi650-201801402-BG, ESXi600-201801402-BG, or ESXi550-201801401-BG.

The microcode patch provided by Intel was buggy and there seem to be issues when VMs access the new speculative execution control mechanism (Haswell & Broadwell processors). However, I can’t seem to find much around what these issues are…

For the time being, if you haven’t applied one of those microcode patches, VMware recommends that you don’t, and that you apply the patches listed in VMSA-2018-0002 instead.

If you have applied the latest patches you will have to edit the config files of each ESXi host and add in a line that hides the new speculative execution control mechanism and reboot the VMs on that host. Detailed information can be found in the KB above.

 

Finally William Lam has created a very handy PowerCLI script that will help provide information about your existing vSphere environment and help identify whether you have hosts that are impacted by Spectre and this new Intel Sighting issue: https://www.virtuallyghetto.com/2018/01/verify-hypervisor-assisted-guest-mitigation-spectre-patches-using-powercli.html

vCenter Server Migration Tool: vSphere 6.0 Update 2m

Last year I blogged about the vCS to vCSA converter tool that VMware Labs released as a fling and how I had used it to pretty much convert all my lab vCenters (all bar one) to vCSAs….. since then I’ve been following the releases and a few months ago I noticed the Fling was deprecated (ie you can’t download it). I didn’t think much of it as VMworld 2016 was only round the corner, so thought it might be rolled into an impending vSphere/vCenter release….. unfortunately that never quite materialised in Las Vegas, and rumours are that vSphere 6.5 might be released in Barcelona.

So I was quietly surprised when I got an email notification from VMware Blogs to inform me that a new minor update of vSphere had been released specifically for migration purposes – vSphere 6.0 Update 2m (where the ‘m’ stands for migration).

vSphere 6.0 Update 2m is an automated end to end migration tool from a Windows vCenter Server 5.5 (any update) to a vCenter Server Appliance 6.0 Update 2 (so pretty much what the Fling used to achieve).

It’s common knowledge that trying to migrate from a Windows vCenter Server (with a SQL backend) to a vCenter Server Appliance was not an easy task – in fact, with 90% of my customers I’ve just told them to start afresh rather than go through the pain of scripting a migration. However, I’m so glad that VMware have rolled out the Converter fling into an actual production release – now we have an end-to-end migration tool which takes all the pain out of the equation!

Those of you who are interested in migrating from your Windows vCenter Server 5.5 (any update) to a vCenter Server Appliance 6.0 Update 2 should download and use this release. The vSphere 6.0 Update 2m download is an ISO consisting of the Migration Tool and vCenter Server Appliance 6.0 Update 2, roughly about 2.8GB in size.

Note: you cannot use this release to deploy a new installation of vCSA! To do that you just use the vCSA 6.0 Update 2 install.

What’s Supported:

  • Previous versions of Windows vCenter Server will need to be upgraded to vCenter Server 5.5 prior to migration.
  • The best thing is that all database types currently supported with vCenter Server 5.5 will be migrated to the embedded vPostgres database in the vCSA!
  • It’s worth noting that if VMware Update Manager is installed on the same server as the Windows vCenter Server 5.5, it will need to be moved to an external server prior to starting the migration process.
  • VMware and 3rd party extension registrations are migrated, but may need to be re-registered.
  • vCenter Server 5.5 both Simple and Custom deployment types are supported.
  • Configuration, inventory, and alarm data will be migrated automatically; migrating historical and performance data (stats, tasks, events) is optional.
  • If the source was a Simple vCenter Server 5.5 install (so SSO + vCS) then it will be migrated to a vCSA with embedded PSC.
  • If the source was a Custom vCenter Server 5.5 install (so separate SSO and vCS) then it will be migrated to a vCSA with external PSC.
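
As a quick way of sanity-checking a source vCenter against the bullets above, here’s a small Python sketch of a pre-flight check. It’s just my own illustration of the rules listed here and has nothing to do with the actual Migration Tool’s pre-checks; the function names and inputs are hypothetical.

```python
# Target topology per source deployment type, taken from the bullets above.
MIGRATION_TARGET = {
    "simple": "vCSA 6.0 Update 2 with embedded PSC",
    "custom": "vCSA 6.0 Update 2 with external PSC",
}


def migration_blockers(version, deployment_type, vum_on_same_server):
    """Return a list of things to fix before migrating (hypothetical helper)."""
    blockers = []
    if not version.startswith("5.5"):
        blockers.append("Upgrade the Windows vCenter Server to 5.5 first")
    if deployment_type.strip().lower() not in MIGRATION_TARGET:
        blockers.append("Unknown deployment type: " + deployment_type)
    if vum_on_same_server:
        blockers.append("Move VMware Update Manager to an external server")
    return blockers


def migration_target(deployment_type):
    """Return the appliance topology the source will be migrated to."""
    return MIGRATION_TARGET[deployment_type.strip().lower()]
```

So a Simple 5.5 install with Update Manager on the same box would come back with one blocker (move VUM), and `migration_target("Simple")` tells you it will end up as a vCSA with embedded PSC.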

Some things that are worth mentioning prior to starting a migration are:

  • It preserves the personality of the Windows vCenter Server, which includes but is not limited to IP Address, FQDN, UUID, Certificates and MoRef IDs.
  • Changing your deployment topology during the migration process is not allowed. For example, if your vSphere 5.5 Windows vCenter was deployed using the Simple deployment option, then your Windows vCenter Server 5.5 will become an embedded vCenter Server Appliance 6.0.
  • During the migration process the source Windows vCenter Server will be shut down, so plan accordingly for downtime.
  • Because the migration tool also performs an upgrade, standard compatibility and interoperability checks still apply. Please use the interoperability matrix to make sure all VMware solutions are compatible with vSphere 6.0. Also talk to your 3rd party solution vendors to make sure those solutions are also compatible with vSphere 6.0.

 

The only annoying thing is that because I’ve used the fling previously to convert all my Windows vCenter Servers, I now don’t have anything I can test this migration tool on!! >_<”

I’m currently in the process of digging out an old vCenter Server 5.5 ISO so that I can deploy it and upgrade it using the new release!

 

Anyways, those of you who haven’t yet upgraded to vCenter Server 6.0 and to an appliance, now there’s no reason why you can’t as you have a fully supported tool from VMware!

Best of all, they’re in the process of improving the migration tool so that it can be used to migrate from a Windows vCenter Server 6.0 install to a vCenter Server Appliance 6.0. One feature I hope they will also include is the ability to migrate from an existing vCSA to another vCSA.

vCenter Server 6.0 Update 2m links:

 

Installing vShield Endpoint (vCNS Mgr 5.5.4-3)

Very quick blog entry as I’m busy tying up loose ends before jetting off on my summer hols….

It’s pretty easy to install vShield Endpoint as it’s a wizard-based OVA deployment. I’m not going to step through the process as it’s very simple (plus the install guide explains it very well). Once that’s done log into the console and run ‘setup’ to configure the IP address and DNS information.

After that, it’s a case of logging into vShield Manager and connecting to vCenter Server.

Once connected to the vCenter, you should see your datacenter and hosts in a hierarchical tree on the left menu. Select each host and install vShield Endpoint.

vShield Installation guide: http://www.vmware.com/pdf/vshield_55_install.pdf

However, I did encounter a few issues (due to prior deployments which hadn’t been cleaned up properly).

Error 1: VMKernel Portgroup present on incorrect vSwitch
This occurred because the hosts had a previous vSwitch labelled vmservice-vswitch, but the VMkernel port vmservice-vmknic-pg resided on a different vSwitch (previous deployment). To correct this I had to delete the old VMkernel port and recreate it on the correct vmservice-vswitch.

Error 2: VirtualMachine Portgroup present on incorrect vSwitch

Again this was due to a misconfiguration from a previous deployment! What should happen is that once you’ve set up the vmservice-vswitch and created the vmservice-vmknic-pg portgroup and VMkernel port, the installer will create a new portgroup on that vSwitch called vmservice-vshield-pg. Like before, this was residing on the wrong vSwitch.

In the end I just deleted the wrong vSwitch and started again by creating the vmservice-vswitch and the vmservice-vmknic-pg. After that the installation of vShield Endpoint went swimmingly!


Which goes to show that cleaning up an old deployment within your demo environment can sometimes be very handy! =)

 

Known bug with upgrading vCSA via VAMI

So there’s a known bug where upgrading vCSA via the VAMI freezes at 70%…. I was doing a mass upgrade of all my vCSAs in the demo environment at work, and all of them got stuck at 70%.


After reading the Release Notes for 6.0U1b, it turns out it’s a known issue: http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-vcenter-server-60u1b-release-notes.html

New In the vCenter Server Appliance Management Interface, the vCenter Server Appliance update status might be stuck at 70%
In the vCenter Server Appliance Management Interface, the vCenter Server Appliance update status might be stuck at 70%, although the update is successful in the back end. You can check the update status in the /var/log/vmware/applmgmt/software-packages.log file. After a successful update, a message similar to the following is seen in the log file:
Packages upgraded successfully, Reboot is required to complete the installation

Workaround: None.

Anyways, after checking the software-packages.log, I could see the packages upgraded successfully entry so just rebooted the vCSA. All up and working again!
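
If you have a lot of appliances to check, the same log check can be scripted. Below is a minimal Python sketch that looks for the success message quoted in the release notes. The log path and the message text come from the release notes above; the function names are my own, and I’m assuming you run it somewhere that can read the log file (for example in the appliance’s Bash shell, if Python is available there).

```python
from pathlib import Path

# Path and success message as quoted from the 6.0U1b release notes.
LOG_PATH = Path("/var/log/vmware/applmgmt/software-packages.log")
SUCCESS_MARKER = "Packages upgraded successfully"


def update_succeeded(log_text):
    """Return True if any line of the log contains the success message."""
    return any(SUCCESS_MARKER in line for line in log_text.splitlines())


def check_log(path=LOG_PATH):
    """Read the log file and report whether the update completed."""
    return update_succeeded(path.read_text(errors="replace"))
```

If `check_log()` returns True, the update finished in the back end and it should be safe to reboot the vCSA even though the VAMI is still showing 70%.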


If you want steps on how to upgrade your vCSA, then have a look at my previous blog entry: Upgrading vCenter Server Appliance to 6.0 update 1

Upgrading vRealize Operations to 6.2

Now that vRealize Ops 6.2 has been released, it’s time to upgrade your Ops Manager virtual appliance. So how do you do that? Well, it’s pretty simple actually!

Nearly all of VMware’s virtual appliances have a simple upgrade process where you download an upgrade PAK file and upload it to the admin page of the appliance – and once uploaded it’s just a simple “click and install”….!

  1. First up, download the 6.2 upgrade PAK files from the My VMware Portal. You will require TWO upgrade PAK files: one to upgrade the vApp’s OS, the other to upgrade the vROps product.
    For an OS upgrade, the file is: vRealize_Operations_Manager-VA-OS-xxx.pak
    For the product upgrade of virtual appliance clusters, the file is: vRealize_Operations_Manager-VA-xxx.pak
  2. Before starting the upgrade it’s probably best to either take a backup or a snapshot of your entire vRealize Operations cluster as a precaution.
    Note: The cluster can be online or offline when running the upgrade.
    Log into the master node administrator interface via your web browser:
    https://<master-node-FQDN-or-IP-address>/admin
  3. On the left navigation menu, click Software Update. Note the version that vROps is currently at (for me it was 6.1). Click Install a Software Update.
  4. Firstly perform the OS upgrade. This updates the OS on the virtual appliance and restarts each virtual machine. Follow the wizard to locate and install the OS PAK file.
    Note: If you have customised the content that vROps provides – such as alerts, symptoms, recommendations, and policies – and you want to install content updates, a best practice is to clone the content before performing the upgrade. You can then select the option to reset out-of-the-box content when you install the software update, and the update will provide new content without overwriting any customised content.
  5. Click Upload to stage the upgrade files.
  6. Once the upload has completed, a summary of what the PAK file contains is listed. Click Next and accept the EULA, then click Finish to start the upgrade process.
  7. Once the upgrade is complete, vROps will restart and you need to log back into the admin page. Navigate to Software Update and you will see a message stating which software update was previously installed.
  8. Now repeat the upload and installation process for the Product upgrade PAK file.
  9. Once again, vROps will reboot after the Product upgrade PAK file has been installed. Log back in and navigate to Software Update; you should now see that vROps has been upgraded.
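
Since the order matters (OS PAK first, then the product PAK), here’s a tiny Python sketch that sorts the downloaded files into install order based on the naming pattern shown in step 1. It only relies on the `-VA-OS-` and `-VA-` parts of the file names quoted above; the helper names are hypothetical and the `xxx` is the same placeholder used in the steps, not a real build number.

```python
def pak_kind(filename):
    """Classify a PAK file as 'os' or 'product' from its name."""
    # Check the OS pattern first, because "-VA-" also matches OS files.
    if "-VA-OS-" in filename:
        return "os"
    if "-VA-" in filename:
        return "product"
    raise ValueError("Not a recognised virtual appliance PAK: " + filename)


def install_order(filenames):
    """Return the files with the OS PAK before the product PAK."""
    return sorted(filenames, key=lambda name: 0 if pak_kind(name) == "os" else 1)


downloads = [
    "vRealize_Operations_Manager-VA-xxx.pak",
    "vRealize_Operations_Manager-VA-OS-xxx.pak",
]
ordered = install_order(downloads)
```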

 

There you go… nice and simple!

If you encounter any issues, then head over to the vROps 6.2 Release Notes: http://pubs.vmware.com/Release_Notes/en/vrops/62/vrops-62-release-notes.html

Deploying VSAN 6.1 ROBO

One of the things I’m fortunate to have access to at MTI Technology is the Solution Centre which has all sorts of kit that can be used for demos and for consultants to play around with.

After coming back from VMworld, one of the things I really wanted to test out was how easy it would be to deploy VSAN 6.1 in a ROBO solution. Fortunately I had a pair of old Dell R810s lying around and managed to cobble together enough disks and a pair of SSDs in order to create two VSAN nodes!

VSAN ROBO allows you to deploy a 2-node VSAN cluster (rather than the standard 3-nodes) with a Witness Server located on another site – usually this would be your primary data centre (as per diagram below). It also allows several ROBO deployments to be managed from a single vCenter Server. VSAN ROBO uses the same concepts as VSAN Stretched Cluster, using Fault Domains to determine how data is distributed across the VSAN nodes. The Witness Server is uniquely designed with the sole purpose of providing cluster quorum services during failure events and to store witness objects and cluster metadata information and in so doing eliminates the requirement of the 3rd physical VSAN node.

vsan-robo-wit

Note: Whenever you deploy any VMware product into a production environment, make sure that you check the Hardware Compatibility List!
In my case for VSAN, neither the server nor the storage controller in the R810 was supported – but as it was only a demo environment it wasn’t of top priority.

Before I go through how I configured VSAN ROBO, there are a few things I need to state upfront which I don’t recommend doing in a production environment:

  1. Using the same subnet for the VSAN network – in my demo environment I only have 1 subnet, so I’ve had to stick everything on the same VLAN. Ideally you should separate out the VSAN traffic away from the Mgmt and VM traffic.
  2. Using an SSD from a desktop PC for the Cache drive – ideally this should be an enterprise grade SSD as VSAN uses the SSD for caching so you really need one that has a higher endurance rate.

Also there are a few features that are not supported, and a few limits that apply, in the ROBO solution (compared to standard VSAN):

  • SMP-FT support
  • Max value for NumberOfFailuresToTolerate is 1
  • Limit of 3 for the number of Fault Domains (2 physical nodes and the witness server).
  • All Flash VSAN.
  • VSAN ROBO licensing is purchased in packs of 25 VMs, with 1 license per site. This means a maximum of 25 VMs can be licensed per site! However, 1 pack can be used across multiple ROBO sites (so 25 VMs across 5 sites).
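
To put some numbers on the licensing bullet, here’s a rough Python sketch of how I read it: packs contain 25 VMs, no single site can have more than 25 licensed VMs, and a pack can be shared across sites. That interpretation is an assumption on my part, and the helper name and site names are made up, so please check the VSAN licensing guide before sizing anything for real.

```python
import math

PACK_SIZE = 25  # VMs per ROBO license pack, as stated above


def robo_packs_needed(vms_per_site):
    """Return the number of 25-VM packs needed (hypothetical helper)."""
    for site, count in vms_per_site.items():
        # A maximum of 25 VMs can be licensed per site.
        if count > PACK_SIZE:
            raise ValueError(f"{site} has {count} VMs, the ROBO limit is {PACK_SIZE}")
    # Packs can be shared across sites, so size on the total VM count.
    return math.ceil(sum(vms_per_site.values()) / PACK_SIZE)
```

For example, five sites with 5 VMs each (`{"site1": 5, "site2": 5, "site3": 5, "site4": 5, "site5": 5}`) works out to a single pack, which matches the “25 VMs across 5 sites” example above.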

From a configuration perspective, configuring a VSAN Cluster for ROBO is extremely simple as it is performed through a wizard within the vSphere Web Client. From a network perspective, the two VSAN Cluster nodes are to be configured over a single layer 2 network with multicast enabled. There are a few requirements for the Witness Server:

  • At least 1.5 Mbps connectivity between nodes and witness
  • No more than 500 milliseconds latency RTT
  • Layer 3 network connectivity without multicast to the nodes in the cluster
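
If you’ve measured the link between a ROBO site and the witness, a quick check against the first two requirements could look like the Python sketch below. I’m reading 1.5 Mbps as a minimum and 500 ms RTT as a maximum; the function name is hypothetical and you would need to supply your own measurements.

```python
MIN_BANDWIDTH_MBPS = 1.5  # from the requirement list above
MAX_RTT_MS = 500          # from the requirement list above


def witness_link_problems(bandwidth_mbps, rtt_ms):
    """Return a list of requirement violations for a witness link."""
    problems = []
    if bandwidth_mbps < MIN_BANDWIDTH_MBPS:
        problems.append(
            f"bandwidth {bandwidth_mbps} Mbps is below {MIN_BANDWIDTH_MBPS} Mbps"
        )
    if rtt_ms > MAX_RTT_MS:
        problems.append(f"RTT {rtt_ms} ms exceeds {MAX_RTT_MS} ms")
    return problems
```

An empty list means the link meets both numbers; it doesn’t tell you anything about the Layer 3 / multicast requirement, which you still need to verify separately.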

 

So for my demo environment, I have 2x R810s with 1x Intel Xeon X6550 and 32GB RAM. For my SSD I found an old 240GB Micron M500 SSD (MLC NAND flash) and stuck it into a Dell HD caddy; for my HDDs I have 5x 146GB SAS drives. The Witness server resides within my main VMware environment (which runs on UCS blades and a VNX5200).

I won’t go into how I installed vSphere ESXi 6.0 u1….. however, just remember that you’ll need to install ESXi onto an SD or USB drive as you want to use all local drives for VSAN (in my case I installed ESXi onto a 8GB USB drive).

I created a new VMware cluster within my vCenter and added the 2 VSAN nodes. I then deployed the Witness Server, which in my case was the nested ESXi host within a virtual appliance. There are actually 3 sizes for the Witness Appliance – Tiny, Medium, Large. I deployed a Medium appliance.

I won’t step through how to deploy the OVA as it’s pretty routine stuff. If you load up the console for the Witness server, you’ll be greeted with the familiar DCUI of vSphere ESXi.

Once it’s deployed and configured with the relevant IP address and hostname, you can add the Witness server into your vCenter Server as just another ESXi host.


One thing that’s slightly different is the Witness Server comes with its own vSphere license and so doesn’t consume one of your own licenses. Note that the license key is censored so you can’t use it elsewhere!

Once the Witness Server has been added to the vCenter Server you may find that there is a warning on the host which says “No datastores have been configured”.

This occurs because the nested ESXi host does not have any VMFS datastores configured. The warning can be ignored, but if you’re like me and hate the exclamation mark warnings in your environment, you can easily get rid of it by adding a small 2GB disk to the witness appliance VM (editing the Hardware settings) and then adding a datastore on top of the new disk.

You should be able to notice that the icon for the witness appliance within the vCenter Server inventory is slightly different from your physical hosts – it’s shaded light blue to differentiate it from standard ESXi hosts.

The next step is to configure the VSAN network on the witness server. There is already a pre-defined portgroup called witnessPg. Do not remove this port group, as it has special modifications to make the MAC addresses on the network adapters match the nested ESXi MAC addresses!
There should be a VMkernel port already configured in the portgroup; edit the port and tag it for VSAN traffic.

At this point, ensure that your witness server can talk to the VSAN nodes.

Note: Typically an ESXi host has a default TCP/IP stack and as a result only has a single default gateway – more often than not, this default route is associated with the management network TCP/IP stack. In a normal deployment scenario, the VLAN for the management network would be isolated from the VSAN network, as such there is no path between the two networks and no default gateway on the VSAN network. A way around this problem is to use static routes to define a routing entry which indicates which path should be used for traffic between the witness server and the VSAN nodes. I won’t go into configuring static routes, you can find more detailed information in the VSAN 6.1 Stretched Cluster Guide.

Once your witness server is talking to the VSAN nodes, it’s time to configure the VSAN ROBO solution. This is as simple as creating fault domains.

I won’t go into how to turn on the VSAN cluster and disk management as this is simple stuff and has been covered off in numerous other VSAN blogs/guides. One thing I will mention is that because I have 2 very old servers, I had to configure each individual disk as a RAID-0 set as the RAID controller in the server did not support pass-through. Once the disks were configured and detected by the ESXi host as storage devices, I then had to manually set the SSD device as a Flash Disk.

I also ended up manually claiming the disks for VSAN.


Once the 2 nodes have been configured for VSAN, next comes the creation of the Fault Domains. As previously mentioned, VSAN ROBO works by creating 2 Fault Domains and a witness server – just like you would for a VSAN stretched cluster. However, in this case only 1 server is assigned to each fault domain.


Note: You have probably noticed that the wizard still states “VSAN Stretched Cluster” on all the screens. Unfortunately VMware didn’t write separate code for VSAN ROBO, so it’s still classed as a stretched cluster.

Once VSAN ROBO has been deployed you can check the health of the VSAN by selecting the cluster and Monitor->Health.
The first warning is regarding the VSAN HCL, and shows that my server and its RAID controllers are not listed in VMware’s VSAN HCL. =)

Next, license the VSAN ROBO cluster and note which features get switched off when licensing for VSAN ROBO.

There is already a default VSAN storage Policy; creating a VM and assigning this policy gives a Failures To Tolerate value of 1. Viewing the Physical Disk Placement you can see that data is mirrored on the 2 VSAN nodes, with metadata stored on the Witness Server.

Something I found very useful was the “Proactive Tests” option for VSAN which provides the ability to perform a real time test of cluster functionalities and dependencies – creating a small VM, checking network multicast between hosts plus storage IO.


 

 

Voila…. a basic VSAN ROBO deployment…..

Don’t forget to download the Storage Management Pack for vROps so you can get an in-depth view of your VSAN deployment from within vROps:
https://solutionexchange.vmware.com/store/products/vrealize-operations-management-pack-for-storage-devices

Unable to connect to VAMI after upgrading the vCSA

One of the plus points with upgrading your vCenter Server Appliance to 6.0 update 1 is the fact that VMware have re-introduced the Virtual Appliance Management Interface (VAMI). This was one of my bug-bears with 6.0… how any sort of administration/configuration work required you to access the vCSA shell!

Recently, after upgrading a customer’s vCSA from 6.0 to 6.0 update 1, we couldn’t access the VAMI to change the network and password policy settings. We rebooted the vCSA several times but the VAMI was still inaccessible. Within Chrome we were getting the following error:


I couldn’t work out why the VAMI service wasn’t coming online….. After several minutes of searching on Google, I came across the following VMware KB:
http://kb.vmware.com/kb/2132965

It turns out that there is a known bug with the VAMI web-service if you disable IPv6 within the vCSA console (which is what I had done as there was no requirement from the customer to use IPv6).

There is currently no resolution to this bug, and in order to solve the issue you have to edit the lighttpd configuration file.
(lighttpd is a light-weight open-source web server)

To work around this issue, set the server.use-ipv6 parameter to disable in /etc/applmgmt/appliance/lighttpd.conf:
  1. Connect to the vCenter Appliance or Platform Service Controller Appliance through SSH or console.
  2. Run this command to enable access to the Bash shell:
    shell.set --enabled true
  3. Type shell and press Enter.
  4. Open the lighttpd.conf file using a text editor:
    vi /etc/applmgmt/appliance/lighttpd.conf
  5. Search for the entry server.use-ipv6="enable"
  6. Change enable to disable.
    server.use-ipv6="disable"
  7. Start the VAMI service by running this command:
    service vami-lighttp start
  8. You should now be able to access the VAMI from a browser (https://vCSA_IP_address:5480 or https://vCSA_FQDN:5480).
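
If you need to make the same change on several appliances, the edit in steps 5 and 6 can be expressed as a simple text substitution. Here’s a minimal Python sketch of that substitution. It only operates on the config text you pass in; the sample config contents in the usage note are made up, and I’m assuming the setting appears on its own line as `server.use-ipv6` followed by `=` and `"enable"` (with or without spaces around the equals sign). Take a backup of lighttpd.conf before changing it for real.

```python
import re

# Matches the server.use-ipv6 line, with or without spaces around "=".
_IPV6_SETTING = re.compile(
    r'^([ \t]*server\.use-ipv6[ \t]*=[ \t]*)"enable"', re.MULTILINE
)


def disable_ipv6(conf_text):
    """Return the config text with server.use-ipv6 switched to "disable"."""
    return _IPV6_SETTING.sub(r'\1"disable"', conf_text)
```

For example, passing in `'server.port = 5480\nserver.use-ipv6="enable"\n'` gives back the same text with `server.use-ipv6="disable"`, and a file that already says disable is returned unchanged. You would still need to start the vami-lighttp service afterwards, as per step 7.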