
VMware vCenter Server Appliance /storage/log Full

A while back I was welcomed to the office by a vCenter Server Appliance critical health alert, specifically: ‘The /storage/log filesystem is out of disk space or inodes’. This error is usually due to a failed automated log clean-up process, so in this article I detail a temporary ‘get out of jail’ fix, followed by a more permanent fix: identifying the offending files and tidying them up.

VCSA Storage Logs Full Overall Health

Firstly, let’s take a look at the file system itself in order to confirm our UI findings. SSH onto the VCSA, enter the BASH shell, and list all available file systems via the df -h command. The below screenshot confirms the UI warning: the file system in question has been completely consumed.
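For reference, a quick check of the mount point should surface the problem; note the device name (log_vg-log) and sizes below are purely illustrative, so expect your own output to differ:

df -h /storage/log
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/log_vg-log  9.8G  9.8G     0 100% /storage/log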

VCSA Storage Logs Full Disk Consumed
Confirming the consumed file system.

The ‘Get Out of Jail’ Temporary Fix

In the unfortunate event that this issue is preventing you from accessing vCenter, we can implement a quick fix by extending the affected disk. Note, this is a quick fix intended solely to restore vCenter access; it should not be relied upon as a permanent resolution.

As we have already identified the problematic disk, jump over to the vSphere Client and extend the disk in question (the amount is your call; in my environment I added an additional 5 GB). This leaves us with the final task of initiating the extension so that the VCSA can see the additional space. Depending on your VCSA version, there are two options:

VCSA v6.0
vpxd_servicecfg storage lvm autogrow
VCSA v6.5 and 6.7
/usr/lib/applmgmt/support/scripts/autogrow.sh

Lastly, list all file systems to confirm the extension has been realised.
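On a 6.5/6.7 appliance, this step can therefore be as simple as the below (a sketch only; the autogrow script path is as per the option above, and the df check simply confirms the new capacity):

/usr/lib/applmgmt/support/scripts/autogrow.sh
df -h /storage/log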

VCSA Storage Logs Full Disk Available
The results of the extension…

Permanent Fix

So, we’re out of jail, but we still have an offending consumer. In my instance, checking within the file system identified a number of large log files which hadn’t been cleared automatically by the VCSA, so manual intervention was required: specifically, the removal of the localhost_access_log, vmware-identity-sts, and vmware-identity-sts-perf logs. These can be removed via the rm command shown below.
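Before removing anything, it’s worth confirming which files are actually consuming the space in your own environment. A quick sweep along the following lines can help; the size threshold is arbitrary and the paths are illustrative rather than definitive:

# List any individual files over 100 MB within /storage/log
find /storage/log -type f -size +100M -exec ls -lh {} \;

# Summarise usage (in MB) per directory, largest first
du -sm /storage/log/vmware/* | sort -rn | head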

rm log-file-name.*
VCSA Storage Logs - Purge Logs
Purging the offending logs…

Following the removal, another df -h shows we’re back in business.

VCSA Storage Logs Full After Cleanup
…and the results of the purge.

Lastly, in this instance, restart the Security Token Service to initiate the creation of new log files.

service vmware-stsd restart
VCSA Storage Logs - Restart vmware-stsd
Restart the Security Token Service to initiate the creation of new log files.

Further Reading

For this specific issue, please see VMware KB article 2143565; however, if in doubt, do call upon VMware Support. The team will be able to assist you in identifying the offending files/directories which can be safely removed.

Upgrade VMware vCenter Server Appliance from 6.5 to 6.7

With the release of vSphere 6.7 back in April 2018, a host of new enhancements, features, and goodies had the vCommunity going wild. With enhanced feature parity between the legacy vSphere Web Client and the new HTML5 vSphere Client, as well as the vCenter Server Appliance boasting ~2X faster vCenter operations per second, a ~3X reduction in memory usage, and ~3X faster DRS-related operations (e.g. powering on a virtual machine), these two areas alone made most of us want to upgrade. Nice.

vSphere 6.7 also boasts the new Quick Boot feature for hosts running the ESXi 6.7 hypervisor and above. This feature allows users to a) reduce maintenance time by reducing the number of reboots required during major version upgrades (Single Reboot), and b) restart the ESXi hypervisor without having to reboot the physical host (essentially skipping the time-consuming hardware initialisation). Very nice!

Continue reading → Upgrade VMware vCenter Server Appliance from 6.5 to 6.7


Reclaim VMFS Deleted Blocks via VAAI UNMAP

Since the release of vSphere 5.5 back in September 2013 we have been able to utilise ESXCLI to manually reclaim deleted blocks from VMFS datastores. Essentially, by using the VAAI UNMAP primitive, we can reclaim previously used blocks by releasing them back to the storage array, allowing them to be re-utilised by other devices/virtual machines. It wasn’t until the release of vSphere 6.5 in November 2016 that the much sought-after automation of disk space reclamation was announced with the availability of VMFS 6. More on VMFS 6 and automated reclamation in a future post.

In this article we will cover the manual process of reclaiming deleted blocks from a VMFS 5 (or earlier) file system via an esxcli UNMAP call.

Reclaim VMFS Deleted Blocks via UNMAP

Procedure

We have two options when making an UNMAP call on a VMFS volume:

Option 1 – Reference the Volume Label

1. Identify the volume/datastore label.

Reclaim VMFS Deleted Blocks via UNMAP Identify Volume Name

2. Via SSH, connect to an ESXi host which has access to the datastore in question.

3. Run the below command to perform an UNMAP call utilising the volume label.

esxcli storage vmfs unmap -l DatastoreName

Option 2 – Reference the Volume UUID

1. Identify the UUID of the datastore/volume in question.

Reclaim VMFS Deleted Blocks via UNMAP - Identify Volume UUID

2. Via SSH, connect to an ESXi host which has access to the datastore in question.

3. Run the below command to perform an UNMAP call utilising the volume UUID.

esxcli storage vmfs unmap -u 5b16dbfa-1f62fe12-25f4-000c2981428e
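Whichever option you choose, the command also accepts an optional reclaim unit, i.e. the number of VMFS blocks to unmap per iteration, which can be useful if you’re wary of the additional load placed on the array; the value below is just an example:

esxcli storage vmfs unmap -l DatastoreName -n 100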

As an example, the below screenshots detail a storage volume before and after an UNMAP call. Over time, the storage volume has experienced a high number of VM deletions and storage vMotions. Following either of the above UNMAP commands, the volume has reclaimed over 2 TB of deleted blocks.

Reclaim VMFS Deleted Blocks via UNMAP - Before
Reclaim VMFS Deleted Blocks via UNMAP – BEFORE
Reclaim VMFS Deleted Blocks via UNMAP - After
Reclaim VMFS Deleted Blocks via UNMAP – AFTER

Monitoring UNMAPs via ESXTOP

Finally, it’s nice to be able to monitor such actions and, via ESXTOP, we can. Connect to one of your hosts via SSH and launch ESXTOP. There is going to be a lot of information displayed at this point, so we’ll likely need to toggle off some of the superfluous detail. Press ‘U’ to view disks/devices, and press ‘F’ to adjust the currently displayed fields. In the below screenshot I have toggled off all columns except A, B, and O.

From the below screenshot you can see that, following a little housekeeping on two volumes in my environment, the DELETE counters display the UNMAP I/O count issued to those devices. Note, ESXTOP counters are reset with each host restart.

VAAI UNMAP Monitoring via ESXTOP - DELETE Counter displaying UNMAP I/O
VAAI UNMAP Monitoring via ESXTOP – DELETE Counter displaying UNMAP I/O
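If you’d rather capture these counters than watch them interactively, esxtop’s batch mode can dump a snapshot to CSV for later review (a single-iteration sketch; the output path is arbitrary):

esxtop -b -n 1 > /tmp/esxtop-snapshot.csv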

With VMFS 6 now available, you’ll probably want to leverage its automated reclamation capabilities; however, the only upgrade path is to create new datastores, migrate your workloads, and blow away the old VMFS 5 datastores. More on VMFS 6 in a future post.


Editing Protected VMs in vSphere

By design, certain virtual machines and/or appliances within vSphere are protected to prevent editing (this can include NSX Controllers, Edges, Logical Routers, etc.). In a live/production environment you wouldn’t normally care about editing these appliances; however, in a lab environment (especially one where resource is tight), reducing memory and/or CPU allocation can help a lot. As such, this article will cover the process of removing the lock on a protected VM in vSphere in order to enable editing.

The scenario: a customer needs to reduce the resource allocation of an NSX Controller, however, due to the VM in question being protected/locked, editing the VM’s resources is not possible via UI or PowerCLI.

The process of removing this lock is quick and easy; however, we first need to identify the virtual machine’s Managed Object Reference (moRef ID). Please note, VMware do not support or recommend this procedure in any way. As such, it should not be implemented in a production environment.

Continue reading → Editing Protected VMs in vSphere


VMware vSphere 6.5: Migration from Windows vCenter to vCenter Server Appliance

Following on from my previous posts (What’s New in vSphere 6.5 and VMware VCSA 6.5: Installation & Configuration), a major area for discussion (and excitement) is the VMware Migration Assistant which, should you wish, is able to easily migrate you away from the Windows-based vCenter Server to the Linux-based vCenter Server Appliance (VCSA).

There are pros and cons to the vCenter appliance of course, as well as a healthy number of supporters in each camp. However, if you fancy shaving some licensing costs (Windows Server and SQL Server), would like to enjoy a faster vSphere experience (as of 6.0), or would simply like to take a quick backup of vCenter without having to snapshot both the Windows and SQL Server elements (or take a full image of your environment via your backup product of choice), you might just want to take the VCSA for a spin.

This post will detail the migration process of a Windows-based vCenter 6.0.0 U2 to vCenter Server Appliance 6.5.

vSphere vCenter Server Migration Featured

Migration Process

1. Via the Windows Server hosting vCenter Server, mount the VCSA installation media, and launch the VMware Migration Assistant (\migration-assistant\VMware-Migration-Assistant.exe). It is imperative that the Migration Assistant is left running throughout the entire migration process, and not stopped at any stage. If the Migration Assistant is stopped, the migration process will need to be restarted from scratch.

2. Leave the assistant running and, via a management workstation, server, etc., mount the VCSA installation media and launch the vCenter Server Appliance Installer (path). Click Migrate to start the process.

3. Click Next.

4. Accept the EULA and click Next.

5. Enter the details and SSO credentials for the source Windows vCenter Server (i.e. – the one which is currently running the Migration Assistant…it is still running, right?) Once complete, click Next.

6. Verify the certificate thumbprint and accept by clicking Yes.

7. Specify a target ESXi host or vCenter Server and SSO credentials. Here, I have specified my vCenter Server, still managing my lab environment. Once complete, click Next.

8. Verify the certificate thumbprint and accept by clicking Yes.

9. Specify a destination VM Folder where your new vCenter Server Appliance will be created.

10. Specify the compute resource destination. Here, I have chosen a generic compute cluster, and I’ll leave the rest to DRS.

11. Configure the new target appliance with a VM name and root credentials.

12. Choose your deployment size. For my lab environment, and for this article in particular, I’ve opted for a ‘Tiny’ deployment.

13. Specify a target datastore to house the appliance, and enable thin (or not) disk provisioning.

14. Configure the network settings accordingly. Here, my VCSA will be housed on a vSphere Distributed Switch port group (vDS_VL11_Servers). The temporary TCP/IP configuration will be removed during the finalisation of the migration process, as the original IP configuration will follow the migrated appliance.

15. Review your configuration and click Finish.

16. The migration will now begin and you will be able to track the process via a number of updates.

17. Throughout the migration process, you will note the new appliance being deployed via vSphere as per below screenshots.

18. Stage 1 is now complete. To start Stage 2, click Continue.

19. Click Next.

20. Following pre-migration checks, you will be prompted to specify AD user credentials. Once complete, click Next.

21. Choose what data you wish to migrate, and click Next.

22. Opt in/out of the CEIP and click Next.

23. Review your configuration and click Finish, but ensure you have a backup of your vCenter server and its database before proceeding. You have been warned!

24. Click OK to acknowledge the Shutdown Warning.

25. Migration of the Windows Server-based vCenter Server to vCenter Server Appliance will now begin.

26. The transfer process will now begin and will progress through the below three steps. You might want to grab a cup of coffee (or three) at this stage while the migration progresses.

27. Once complete, we’re done. Log in to the vCenter Server Appliance and away you go.

VMware vSphere: Locked Disks, Snapshot Consolidation Errors, and ‘msg.fileio.lock’

A recurring issue, this one, and usually due to a failed backup. In my case, it was caused by a failed Veeam Backup & Replication disk backup job which had, effectively, failed to remove its delta disks following a backup run. As a result, a number of virtual machines reported disk consolidation alerts and, due to the locked VMDKs, I was unable to consolidate the snapshots or Storage vMotion the VMs to a different datastore. A larger and slightly more pressing concern (due to the size and number of delta disks being held) was that the underlying datastore had blown its capacity, taking a number of VMs offline.

So, how do we a) identify the locked file, b) identify the source of the lock, and c) resolve the locked VMDKs and consolidate the disks?

snapshot_consolidation_disklocked_01
Disk consolidation required.
snapshot_consolidation_disklocked_02
Manual attempts at consolidating snapshots fail with either DISKLOCKED errors…
…and/or ‘msg.fileio.lock’ errors.
snapshot_consolidation_disklocked_03
Storage vMotion attempts fail, identifying the locked file.

Identify the Locked File

As a first step, we’ll need to check the hostd.log to try and identify what is happening during the above tasks. To do this, SSH to the ESXi host hosting the VM in question and tail the hostd.log.

tail -f /var/log/hostd.log

While the log is being displayed, jump back to either the vSphere Client for Windows (C#) or vSphere Web Client and re-run a snapshot consolidation (Virtual Machine > Snapshot > Consolidate). Keep an eye on the hostd.log output while the snapshot consolidation task attempts to run, as any/all file lock errors will be displayed. In my instance, the file-lock error detailed in the Storage vMotion screenshot above is confirmed via the hostd.log output (below), and clearly shows the locked disk in question.
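If the hostd.log output proves too noisy to follow in real time, filtering it can make the lock messages easier to spot; the grep pattern below is simply an example and can be adjusted to suit:

tail -f /var/log/hostd.log | grep -iE 'lock|consolidate'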

snapshot_consolidation_disklocked_06
File lock errors, detailed via the hostd.log, should be fairly easy to identify, and will enable you to identify the locked vmdk.

Identify the Source of the Locked File

Next, we need to identify which ESXi host is holding the lock on the vmdk by using vmkfstools.

vmkfstools -D /vmfs/volumes/volume-name/vm-name/locked-vm-disk-name.vmdk

We are specifically interested in the ‘RO Owner’, which (in the below example) shows both the lock itself and the MAC address of the offending ESXi host (in this example, ending ‘f1:64:09’).

snapshot_consolidation_disklocked_04

The MAC address shown in the above output can be used to identify the ESXi host via vSphere.
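Alternatively, should you prefer to stay on the command line, you can SSH to each candidate host and compare its VMkernel MAC addresses against the vmkfstools output (exact output formatting may vary between ESXi versions):

esxcli network ip interface list | grep -i 'MAC Address'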

snapshot_consolidation_disklocked_05

Resolve the Locked VMDKs and Consolidate the Disks

Now the host has been identified, place it in Maintenance Mode and restart the Management Agent/host daemon service (hostd) via the below command.

/etc/init.d/hostd restart

snapshot_consolidation_disklocked_06

Following a successful restart of the hostd service, re-run the snapshot consolidation. This should now complete without any further errors and, once complete, any underlying datastore capacity issues (such as in my case) should be cleared.

snapshot_consolidation_disklocked_07

For more information, see the official VMware KB article covering virtual machine file locks.

VMware vCenter Server Appliance 6.5: Installation & Configuration

Following the general release of VMware vSphere 6.5 last month (see my What’s New in VMware vSphere 6.5 post), I’ll be covering a number of technical run-throughs already in discussion throughout the virtual infrastructure community.

We’ll be starting with a fresh installation of the new and highly improved vCenter Server Appliance (VCSA), followed by a migration from the Windows-based vCenter Server 6.0; the latter task made all the easier thanks to the vSphere Migration Assistant. More on this to come. Lastly, I’ll be looking at a fresh installation of the Windows-based product, however, the experience throughout all of these installation/migration scenarios has been vastly improved.

So, first up then, let’s take a quick look at a fresh installation of the new vCenter Server Appliance, the installation and configuration of which can take just 20 minutes.

1. Log on to a domain-joined server, mount the VCSA installation media, and click Install (more on the Upgrade, Migrate, and Restore options in future posts).
vcsa-6_5_installation_01

2. Click Next at the Introduction screen.
vcsa-6_5_installation_02

3. Accept the EULA and click Next.
vcsa-6_5_installation_03

4. For this installation, we will be deploying the vCenter Server with an Embedded Platform Services Controller. Once done, click Next.
vcsa-6_5_installation_04

5. Configure the Appliance Deployment Target by entering the target ESXi host, HTTPS port, and user credentials, and click Next.
vcsa-6_5_installation_06

6. Configure the appliance virtual machine by specifying a VM name and root credentials.
vcsa-6_5_installation_07

7. Select your deployment size; for this example, Tiny will suffice. Once done, click Next.
vcsa-6_5_installation_08

8. Select a suitable datastore for the new VM, and click Next.
vcsa-6_5_installation_09

9. Configure the network settings accordingly, and click Next.
vcsa-6_5_installation_10

10. Confirm the configuration and click Finish once happy.
vcsa-6_5_installation_11

11. Stage 1 of the installation (appliance deployment) will now begin.
vcsa-6_5_installation_12

12. Once installation is complete, click Continue to configure the appliance.
vcsa-6_5_installation_13

13. Click Next at the Introduction screen.
vcsa-6_5_installation_14

14. Configure NTP settings and click Next.
vcsa-6_5_installation_15

15. Complete the vCenter SSO configuration, and click Next.
vcsa-6_5_installation_16

16. Opt in/out of the VMware Customer Experience Improvement Program and click Next.
vcsa-6_5_installation_17

17. Review the Summary and click Finish.
vcsa-6_5_installation_18

18. Stage 2 of the installation (set up vCenter Server Appliance) will now begin.
vcsa-6_5_installation_19

19. Once complete, you will be presented with the FQDN for your new vCenter Server.
vcsa-6_5_installation_20

Looking at the console of the VCSA, we are presented with a very familiar grey and blue (instead of grey and yellow) interface. Appliance URLs are visible here, as well as basic management/configuration tasks.

vcsa-6_5_installation_21

The new vCenter Server Appliance can now be accessed via the default URLs and, depending on your choice of interface (either the new vSphere Client or older vSphere Web Client), there are now two URLs to remember.

  • vSphere Web Client – https://<vcenter_fqdn>/vsphere-client
  • vSphere Client – https://<vcenter_fqdn>/ui
vcsa-6_5_installation_22
A warm welcome to the fast and sleek HTML 5 vSphere Client.

Both clients will run in parallel until further notice, but do remember that the new vSphere Client is yet to offer full functionality; VMware state they are working on this area with priority, and I’ll be interested to see how quickly the day-to-day management functionality is added.

Integrating Active Directory with VMware vSphere SSO

One item I see mentioned fairly often, either in relation to personal labs or production environments, is the integration of vSphere SSO with Active Directory. Configuring vSphere’s SSO/AD integration via LDAP is a simple process, more so thanks to vSphere 6.5.

1. Log in to the VMware vSphere Web Client using the vCenter Single Sign-On user credentials configured as part of the VMware vCenter Server installation.

sso_ad_integration_01

2. Browse to Administration > Single Sign-On > Configuration and click the Identity Sources tab.

sso_ad_integration_02

3. Click the Add Identity Source icon, select Active Directory as an LDAP Server, and click Next.

sso_ad_integration_03

4. Configure the new identity source accordingly and click Next.

sso_ad_integration_04

5. Confirm the summary and click Finish.

sso_ad_integration_05

6. Select your new identity source and click the Set as Default Domain icon.

sso_ad_integration_06

Next, we’ll add an Active Directory Security Group to the vSphere Global Permissions, enabling us to test SSO functionality.

7. Browse to Administration > Access Control > Global Permissions, and click the Add Permission icon.

sso_ad_integration_07

8. Via the Add Permission wizard, click Add.

sso_ad_integration_08

9. Select your domain, recently added via the LDAP identity source, and add the required security group.

sso_ad_integration_09

10. Your added security group will now be displayed, allowing you to log out and back in utilising your domain credentials.

sso_ad_integration_10