Recovering the VCSA on a vSAN cluster

Disclaimer: Credit for the answer goes to John Nicholson (http://thenicholson.com/), a.k.a. lost_signal, from the VMware SABU; I added some points of my own.

While going through my physical design decisions, I came across a simple question for which I couldn’t find an immediate answer:

How can I restore my vCenter instance (VCSA) if I put it on the very same cluster it is supposed to manage? Can I restore directly on vSAN via an ESXi host?

As my Google-fu let me down, it was time to start a discussion on Reddit:

vSAN question: Restore VCSA on vSAN from vmware

 

TL;DR: The good news is: yes, you can recover it directly, and with vSAN 6.6 clusters this is straightforward with no prerequisites. See the vSAN Multicast Removal guide for the post-processing steps.

As there are other aspects you generally need to consider (not only for vSAN), I decided to summarize some basic points (for 6.6 and later clusters):

  • First things first: back up your VCSA on a regular schedule that matches your recovery objectives.
    • If you are on vSAN, check whether your selected backup product supports SPBM: the good if it does, the bad if it doesn’t
  • Create ephemeral port groups as recovery options for the VCSA and vSAN port groups
    • This is not vSAN specific but should be generally considered when you have the vCenter on the same vDS it manages
  • Make a backup of your vDS on a regular basis (or at least after changes)
  • Export your storage policies
    • Either for fallback in case you make accidental changes or for reference/auditing purposes
    • You might need them in case you are ever forced to rebuild the vCenter from scratch
  • John pointed out that with a backup product offering “boot from backup” capability (e.g. Veeam Instant restore), the initial question doesn’t arise at all, because an additional (NFS) datastore is mounted.
    • A point from myself: Verify the impact of your NIOC settings if you followed the recommended shares in the vSAN guide for the vDS. The NFS mount uses the management network VMkernel interface, which is quite restricted by those shares (note that this only applies if you have bandwidth congestion anyway).
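
To put the NIOC remark into numbers: under congestion, a traffic type gets uplink bandwidth in proportion to its shares among the traffic types that are active at that moment. A minimal sketch, assuming a 10 GbE uplink and example share values of 20 (management), 50 (vMotion), 30 (VM) and 100 (vSAN). These numbers and the function name nioc_min_mbit are illustrative assumptions, so check the values on your own vDS:

```shell
# Sketch: bandwidth a traffic type gets under congestion
#   = link speed * own shares / sum of shares of all active traffic types.
# All values below are assumed examples, not read from a real vDS.
nioc_min_mbit() {
    link_mbit="$1"     # uplink speed in Mbit/s
    own_shares="$2"    # shares of the traffic type in question
    total_shares="$3"  # sum of shares of all active traffic types
    echo $(( link_mbit * own_shares / total_shares ))
}

# Management (20) competing with vMotion (50), VM (30) and vSAN (100): total 200
nioc_min_mbit 10000 20 200   # prints 1000
```

With these assumed values, a restore running over the management VMkernel interface would be held to roughly 1 Gbit/s while all four traffic types compete for the uplink.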

I would be more than happy if anyone is willing to contribute to this.

VCSA 6.5 U1: vAPI status “yellow” and content library not started (possible fix)

As this is an error that has now affected me at multiple customer installations, it is time for a blog post 🙂

After upgrading a site to 6.5 U1 I noticed several issues:

  • The vAPI Endpoint status changed to “yellow”
  • The Content Library service would not start

The only resolution I found in the VMware KB was “restart the services”.

As that didn’t help, I kept searching and found VMware KB 2151085, which describes the cause of the error:

 

“The ts-config.properties file is deployed with the noreplace option. With this option, the ts-config.properties will no longer be overwritten, instead it is saved with the extension .rpmnew.”
and a nice hint:
“This is a known issue seen with several upgrade paths to vSphere 6.5 Update 1. Not all upgrade paths are affected, VMware is investigating affected paths this article will be updated once confirmed.”
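
A quick way to see whether an appliance might be affected is to look for leftover .rpmnew files and compare them with the live files next to them. A minimal sketch, assuming a shell on the VCSA; the function name show_rpmnew and the /etc search root are assumptions, and the actual fix should follow the KB article:

```shell
# Sketch: list every *.rpmnew file below a directory and show how it differs
# from the live file next to it. The default search root (/etc) is an assumption.
show_rpmnew() {
    root="${1:-/etc}"
    find "$root" -type f -name '*.rpmnew' 2>/dev/null | while IFS= read -r new_file; do
        live_file="${new_file%.rpmnew}"   # strip the .rpmnew suffix
        echo "== $live_file"
        # diff exits non-zero when the files differ, which is expected here
        diff -u "$live_file" "$new_file" || true
    done
}
```

Calling show_rpmnew without an argument searches /etc; pass a narrower directory if you already know where the file lives.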
I hope this helps someone else as the KB entry isn’t obvious.

Adding a second syslog server to a VCSA 6.5 (Appliance MUI)

Beware: This is probably not supported

I was asked if I could add another syslog server to an existing VCSA deployment. With the nice blog post from William Lam in mind, adding the second server should be easy. Just edit the configuration and there you are.

Except:
The UI won’t allow this.

I guess it is CLI time then. Luckily, the blog post mentions this:

“A minor change, but syslog-ng is no longer being used within the VCSA and has been replaced by rsyslog.”

So we are just looking at a matter of finding the right config file.

In the main file

  • /etc/rsyslog.conf

you can find an “include” statement pointing to the file

  • /etc/vmware-syslog/syslog.conf

The only content is our first syslog server, configured as a remote syslog target.
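
If you want to retrace this on your own appliance, the include can be located with grep. A minimal sketch, assuming the main file uses either the legacy "$IncludeConfig <path>" directive or the newer include(file="...") form; the function name list_rsyslog_includes is hypothetical:

```shell
# Sketch: print the include directives of an rsyslog main config file,
# with line numbers. Defaults to /etc/rsyslog.conf when no file is given.
list_rsyslog_includes() {
    # -i: directive names are matched case-insensitively, -n: show line numbers,
    # -E: extended regex covering "$IncludeConfig" and "include("
    grep -inE '^[[:space:]]*(\$IncludeConfig|include\()' "${1:-/etc/rsyslog.conf}"
}
```

Run list_rsyslog_includes on the appliance, then cat the file it points to in order to see the configured target.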

At this point, adding the second server is not a big deal: it only needs a second remote target entry next to the first one in this file.
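
As a minimal sketch of what this can look like, assuming the file holds one legacy rsyslog forwarding rule per target ("*.* @host:port" for UDP, "@@" instead of "@" for TCP). The host names, the port and the helper add_syslog_target are placeholders, and the demo works on a scratch file rather than on /etc/vmware-syslog/syslog.conf:

```shell
# Sketch: append a forwarding rule to an rsyslog config file unless the exact
# line is already present. Host names and rule format are assumed examples.
add_syslog_target() {
    conf_file="$1"   # e.g. /etc/vmware-syslog/syslog.conf
    rule="$2"        # e.g. '*.* @syslog2.example.com:514'
    # -x: whole-line match, -F: fixed string (no regex), -q: no output
    if ! grep -qxF -- "$rule" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$rule" >> "$conf_file"
    fi
}

# Demo on a scratch file so nothing on the appliance is touched
demo_conf="$(mktemp)"
printf '%s\n' '*.* @syslog1.example.com:514' > "$demo_conf"
add_syslog_target "$demo_conf" '*.* @syslog2.example.com:514'
add_syslog_target "$demo_conf" '*.* @syslog2.example.com:514'   # second call is a no-op
cat "$demo_conf"
```

On a real appliance, the safest approach is probably to copy the existing line as it is and change only the target, so that protocol and format options stay identical for both servers.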

Remember to reload the syslog service afterwards.
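
A minimal sketch of that step, assuming the systemd unit on the appliance is named "rsyslog" (verify the name on your build first). It runs the rsyslogd configuration check (-N1) and restarts the service only if the check passes; the helper names run and reload_rsyslog and the DRY_RUN switch are hypothetical:

```shell
# Sketch: validate the rsyslog configuration, then restart the service.
# Assumption: the systemd unit is called "rsyslog".
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reload_rsyslog() {
    # rsyslogd -N1 only checks the configuration and does not start a daemon
    run rsyslogd -N1 && run systemctl restart rsyslog
}
```

Try DRY_RUN=1 reload_rsyslog first to see what would be executed, then call reload_rsyslog on the appliance.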

 

Another gotcha (besides the lack of support):

Changing the settings via the VAMI/UI will overwrite your modifications.