VMware Horizon Events Database – Annual Clean-up (purge old data)

14 May

VMware Horizon doesn’t restrict the growth of the historical tables in the Horizon Events database. VMware has a detailed knowledge base article, Purging old data from the View Events Database (2150309), that describes how to purge them. However, there is a catch: if you try to delete many records in one statement, you will hit a “transaction log full” error. The procedure below helps you work around that. In our scenario, we purge the records once every year.

use HZNLOG
select count(*) from [dbo].[POD1_event_data_historical] where EventID in (select EventID from [dbo].[POD1_event_historical] where Time < '2021-01-31 00:00:00.000')
select count(*) from [dbo].[POD1_event_historical] where Time < '2021-01-31 00:00:00.000'

In the above example, HZNLOG is the name of the database, POD1 is the table prefix of the Horizon Events Database (check it in the Horizon Admin console), and 2021-01-31 is the cut-off date in YYYY-MM-DD format (show me all records before 31st Jan 2021).

No. of older records in Events DB

If we use the delete statements from the knowledge base article as-is, we get the following error: “The transaction log for database ‘HZNLOG’ is full due to ‘LOG_BACKUP’”. Of course, the number of records we are trying to delete in our case is relatively high (millions).

Error during deletion “Log is full”

You can shorten the date range of the above query to approx. 30 or 15 days at a time, but in our scenario one would still have to run the delete query more than 15 times to perform the annual clean-up. After searching around, I came across a blog post, Deleting millions of records from a table without blowing the transaction log (a big thank you to Merill for sharing his knowledge). I tweaked it for my use case of Horizon Events DB clean-up and, with a single query, could perform the yearly clean-up within 20 minutes without the transaction log filling up. Essentially, it performs the clean-up in batches of 10,000 rows, committing after each batch.

DECLARE @cutoff DATETIME = '2021-01-31 00:00:00.000'
DECLARE @batch INT = 10000
DECLARE @rowcount INT = 1

-- Pass 1: delete the child rows first, while their parent events still exist,
-- so that no event data rows are left orphaned
WHILE @rowcount > 0
BEGIN
    PRINT GETDATE()
    BEGIN TRANSACTION
    DELETE TOP (@batch) FROM [dbo].[POD1_event_data_historical] WHERE EventID IN (SELECT EventID FROM [dbo].[POD1_event_historical] WHERE Time < @cutoff)
    SET @rowcount = @@ROWCOUNT
    COMMIT
END

-- Pass 2: delete the parent event rows
SET @rowcount = 1
WHILE @rowcount > 0
BEGIN
    PRINT GETDATE()
    BEGIN TRANSACTION
    DELETE TOP (@batch) FROM [dbo].[POD1_event_historical] WHERE Time < @cutoff
    SET @rowcount = @@ROWCOUNT
    COMMIT
END

The output will look something like below:

Entire deletion in batches of 10K rows

After running the above deletion query, re-run the select query to check whether any records dated before 31st Jan 2021 remain. We now have 0 records.

Zero records found
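The batching pattern itself is not specific to SQL Server. Below is a minimal sketch in Python using the built-in sqlite3 module, purely to illustrate the idea of deleting child rows first and committing after every small batch. The table names (event, event_data), the demo data, and the helper names are hypothetical and are not part of the Horizon schema.

```python
import sqlite3


def purge_in_batches(conn, cutoff, batch_size=10000):
    """Delete events older than cutoff in small, separately committed batches.

    Child rows (event_data) are removed first so none are left orphaned.
    Returns (child_rows_deleted, event_rows_deleted).
    """
    statements = [
        # SQLite has no DELETE TOP (n), so a rowid subquery with LIMIT is used.
        "DELETE FROM event_data WHERE rowid IN ("
        " SELECT d.rowid FROM event_data d"
        " JOIN event e ON e.EventID = d.EventID"
        " WHERE e.Time < ? LIMIT ?)",
        "DELETE FROM event WHERE rowid IN ("
        " SELECT rowid FROM event WHERE Time < ? LIMIT ?)",
    ]
    totals = []
    for sql in statements:
        total = 0
        while True:
            cur = conn.execute(sql, (cutoff, batch_size))
            conn.commit()  # one short transaction per batch
            if cur.rowcount == 0:
                break
            total += cur.rowcount
        totals.append(total)
    return tuple(totals)


def build_demo_db():
    """Create an in-memory database with 20 old and 5 recent demo events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event (EventID INTEGER PRIMARY KEY, Time TEXT)")
    conn.execute("CREATE TABLE event_data (EventID INTEGER, Name TEXT)")
    for i in range(1, 26):
        ts = "2020-06-01 00:00:00" if i <= 20 else "2021-03-01 00:00:00"
        conn.execute("INSERT INTO event VALUES (?, ?)", (i, ts))
        conn.execute("INSERT INTO event_data VALUES (?, ?)", (i, "a"))
        conn.execute("INSERT INTO event_data VALUES (?, ?)", (i, "b"))
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(purge_in_batches(demo, "2021-01-31 00:00:00", batch_size=7))
```

Keeping each batch in its own transaction is what keeps the amount of log held by any single transaction small; the batch size is a tuning knob, not a fixed rule.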

I hope you will find this SQL query helpful to perform the Horizon Events Database clean-up in a jiffy. If you enhance the query further or make it more creative, I hope you will share it back with me.

Thanks,
Aresh Sarkari

Upgrade VMware Identity Manager 3.3 to VMware Workspace ONE Access 20.01

28 Apr

I had the opportunity to work on an upgrade from VMware Identity Manager 3.3 (VIDM) to the product under its new name, VMware Workspace ONE Access 20.01 (WoA), and I would like to share the entire experience with you. There is guidance available in the VMware documentation and a few blogs. The idea here is not to provide a step-by-step guide; instead, I want to provide guidance on best practices, insights on active/passive sites, change timings, and an end-to-end mind map of the activities/steps involved in carrying out a successful upgrade.

Environment Overview
Let’s take a look at the environment details to provide a high-level overview:
Active Site

  • 3 VMware Identity Manager 3.3 Linux Appliances
  • 2 VMware Identity Manager 3.3 Connector Linux Appliances (Used for Authentication & VMware Horizon Sync)
  • SQL Database on Microsoft SQL 2016 Always-on
  • The 3 Manager Appliances are behind an NSX Load balancer

Passive Site

  • 3 VMware Identity Manager 3.3 Linux Appliances (Read-only mode)
  • 2 VMware Identity Manager 3.3 Connector Linux Appliances (Used for Authentication & VMware Horizon Sync)
  • SQL Database on Microsoft SQL 2016 Always-on (Replica DB’s)
  • The 3 Manager Appliances are behind an NSX Load balancer

The offline upgrade method was chosen for its convenience and ease of setup/configuration, without exposing the appliances to the internet through a proxy. During both version upgrades, the offline package was kept in the /tmp directory, which is cleared after the reboot.

Downtime Window (Choice)

We had an option of performing the entire upgrade of the above components in a single day change, or we could split the upgrade into two days as we had to go from version 3.3 –> 19.03 –> 20.01.

VIDM TO WoA Upgrade Approach

Initially, we tried a single downtime change window of 16 hours and hit hiccups, about which I plan to write a separate blog post. We then split the change across two consecutive days: Day 1, upgrade from VIDM 3.3 to 19.03, and Day 2, upgrade from VIDM 19.03 to WoA 20.01. This gave us the ability to do a partial rollback instead of starting from scratch again.

High-Level Upgrade Architecture Overview

Disable one IDM manager node at a time under the NSX load balancer and upgrade the manager nodes one by one. After all the manager nodes are upgraded to the desired version, move on to the connector nodes, again one by one. In our scenario, this had to be repeated for the 19.03 to 20.01 Access node upgrade.

High-Level VIDM to WoA Upgrade Architecture
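The one-node-at-a-time flow can be sketched as a small loop. This is only an illustration of the ordering under stated assumptions: the three hooks (drain, upgrade, restore) are hypothetical placeholders for whatever you use to disable a pool member on the load balancer, run the offline upgrade, and re-enable the member. They are not VMware or NSX APIs.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    drain: Callable[[str], None],
    upgrade: Callable[[str], bool],
    restore: Callable[[str], None],
) -> List[str]:
    """Upgrade nodes one at a time and stop at the first failure.

    Returns the list of nodes that were upgraded and put back in service.
    """
    done: List[str] = []
    for node in nodes:
        drain(node)            # hypothetical hook: take the node out of the load balancer
        if not upgrade(node):  # hypothetical hook: run the offline upgrade, True on success
            break              # leave the failed node drained for investigation
        restore(node)          # hypothetical hook: put the node back behind the load balancer
        done.append(node)
    return done


if __name__ == "__main__":
    log = []
    result = rolling_upgrade(
        ["manager-1", "manager-2", "manager-3"],
        drain=lambda n: log.append(("drain", n)),
        upgrade=lambda n: log.append(("upgrade", n)) or True,
        restore=lambda n: log.append(("restore", n)),
    )
    print(result)
    print(log)
```

Stopping at the first failure keeps the remaining nodes on the old version and in service, which is what makes a partial rollback possible.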

Observations from the upgrade

  • Check the VMware Product Interoperability Matrix and Product release notes at least two times before working upon the upgrade.
  • Before you begin the upgrade, suspend data movement on your SQL Always On database.
  • There is no downtime observed when you perform an upgrade on one manager at a time. Make sure you disable the node on the load balancer (no traffic flows to the node).
  • No downtime is observed when the connector upgrades are carried out one by one. There were four connectors for redundancy (3 connectors performing the authentication function and 1 connector performing sync and authentication). However, the connector chosen for the AD sync was the last one to be upgraded, and we had planned for downtime in our change.
  • The System Dashboard health of the cluster (active/passive) may flip between green and red because the Elasticsearch services take time to stabilize after the reboots.
  • If you have hotfixes provided by VMware engineering due to previous issues, please check with support whether the fixes have been incorporated into the newer version, or else ask for the hot patch for the newer version. #ProTIP – Install those hotfixes before the final reboot of the upgrade to avoid an additional service restart dedicated to the hotfix.

End to end mind map of the entire Upgrade

I have included a PDF version of the mind map so you can zoom in and read the details.

Upgrade VIDM to WoA mind map.
Offline Upgrade VIDM to WoA – Mind Map

I hope you will find this information helpful to plan and succeed in a VMware Workspace ONE Access upgrade. A big thanks to Jishan T S, my teammate, for his continuous contributions to making this a big success and for trying all the steps in the development setup multiple times.

Thanks,
Aresh Sarkari

VMware App Volumes – Volumes were not mounted due to an issue with your Writable Volume

18 Mar

Random floating desktop pools within our environment would exhibit an issue wherein end-users would log in to their desktop and be presented with a black screen with the message: Volumes were not mounted due to an issue with your Writable Volume. Please try logging in again, or contact your administrator.

Error

When this issue surfaced, neither the AppStacks nor the Writable Volumes would mount to the end-user desktop, and if the end-user clicked OK the session would log off.

Environment Details

VMware Horizon 7.11
VMware App Volumes 2.18.5
VMware Dynamic Environment Manager 9.10
Windows 10 1909 Enterprise

Process of elimination

  • The App Volumes (AV) agent is able to communicate to the AV Manager on port 443 without any issues.
  • There were no SSL errors or load balancing issues communicating with the Agent/Manager.
  • We thought a particular Writable Volume (WV) was causing the issue. We deleted and re-created the WV, but the issue persisted.
  • The issue would happen randomly for a few users again and again.

Resolution

My team opened a VMware GSS case, handled by Sanjay SP (a very helpful support engineer), who mentioned there were quite a few cases opened with a similar pattern. Following were the assessments from our logs:

  • During the first startup of Instant Clones, the App Volumes Agent queries the registry key below to learn the customization status and updates the manager with it
    • [HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\ViewComposer\ga\AgentIntegration]
      • “CustomizationState”=dword:1
  • This task has a timeout of 300 seconds, and if it times out the App Volumes Manager will fail to create a unique identity for the VM in its database
  • In the App Volumes Agent logs, we see the respective timeout
    • [2021-03-09 07:11:34.009 UTC] [svservice:P1564:T1976] HandleNGVC: Waiting for NGVC to complete (count 299)
    • [2021-03-09 07:11:34.009 UTC] [svservice:P1564:T1976] Timed out waiting on NGVC after 300 seconds, disabling
  • The customization itself is working fine, and we do see the registry entries getting updated with appropriate values. However, it’s not completed within 300 seconds.
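If you want to check quickly whether a desktop hit this timeout, a small log scan helps. The sketch below is a minimal example in Python that looks for the “Timed out waiting on NGVC” line in the format shown above. The function name is hypothetical, and you would feed it lines read from wherever your App Volumes Agent log is stored.

```python
import re

# Matches agent log lines such as:
# [2021-03-09 07:11:34.009 UTC] [svservice:P1564:T1976] Timed out waiting on NGVC after 300 seconds, disabling
NGVC_TIMEOUT = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\[(?P<src>[^\]]+)\]\s+"
    r"Timed out waiting on NGVC after (?P<secs>\d+) seconds"
)


def find_ngvc_timeouts(lines):
    """Return (timestamp, seconds) tuples for every NGVC timeout line found."""
    hits = []
    for line in lines:
        match = NGVC_TIMEOUT.match(line.strip())
        if match:
            hits.append((match.group("ts"), int(match.group("secs"))))
    return hits


if __name__ == "__main__":
    sample = [
        "[2021-03-09 07:11:34.009 UTC] [svservice:P1564:T1976] HandleNGVC: Waiting for NGVC to complete (count 299)",
        "[2021-03-09 07:11:34.009 UTC] [svservice:P1564:T1976] Timed out waiting on NGVC after 300 seconds, disabling",
    ]
    print(find_ngvc_timeouts(sample))
```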

Fix

  • The delay in ClonePrep customization was not seen with IPv6 disabled on the primary NIC. Since we don’t use IPv6, the recommendation was to disable it in the NIC adapter properties.
Disable IPv6 in the network adapters

I hope you will find this information useful if you encounter the issue. If you manage to tweak or improvise further on this solution, please don’t forget to keep me posted.

Thanks,
Aresh Sarkari

Internet Explorer crashing on Windows Server 2016 – Remote Desktop Session Host

18 Feb

We encountered a strange issue on the Windows Server 2016 Remote Desktop Session Host (RDSH) used for VMware Horizon Application Publishing. Internet Explorer would launch, get into a “Not Responding” state, and eventually the process would close without any errors.

IE Opening and Crashing

Process of elimination

  • We suspected that a Windows cumulative update had introduced the issue, as it was working fine earlier.
  • There were no errors in the Windows Event Viewer (Application, System or Internet Explorer)
  • We used the Deployment Image Servicing and Management (DISM) command line tool to disable and re-enable Internet Explorer, without any luck.
    • dism /online /Disable-Feature /FeatureName:Internet-Explorer-Optional-amd64
    • dism /online /Enable-Feature /FeatureName:Internet-Explorer-Optional-amd64
  • Procmon showed IE trying to launch the process multiple times, but the sub-process kept failing, and IE finally gave up
IE Process launching multiple times
  • We were running out of troubleshooting ideas

Resolution

My team ended up opening a Microsoft Support case, and they could see a “Name Not Found” result for ieproxy.dll, which is due to ieproxy.dll registration issues. Support confirmed they had seen similar instances in the past.

Open a command prompt with admin rights and re-register the DLL from the System32 and SysWOW64 folders.

%SystemRoot%\System32\regsvr32 ieproxy.dll

%SystemRoot%\Syswow64\regsvr32 ieproxy.dll

I hope you will find this information useful if you encounter the issue. If you manage to tweak or improvise further on this solution, please don’t forget to keep me posted.

Thanks,
Aresh Sarkari

Horizon VDI – Calculator – Photos – Edge Not launching for end-users – Windows 10

8 Feb

In a Windows 10 1909 image optimized with the VMware OS Optimization Tool (OSOT), end-users reported that they could not open the following three built-in UWP Windows applications.

  • Microsoft Calculator
  • Microsoft Photos
  • Microsoft Edge browser

When the end-users try to open any of the three applications, nothing would happen – No error messages or pop-ups. The application doesn’t launch.

Environment Details

VMware Horizon 7.11
VMware App Volumes 2.18.5
VMware Dynamic Environment Manager 9.10

Process of elimination

  • The AppX packages for Calculator, Photos and Edge did exist in the base operating system
  • We could launch all three applications within the optimized golden image template.
  • We were running the VMware OSOT tool with the default VMware Windows 10 template. No additional customization or options were selected.
  • One thing was evident: the base template was working fine. The suspicion was that either App Volumes AppStacks (we disabled the AppStacks/Writable delivery and observed the same issue) or Dynamic Environment Manager was preventing the applications from launching
  • We were running out of troubleshooting ideas

Resolution

Upon searching, I came across this community page – https://communities.vmware.com/t5/Horizon-Desktops-and-Apps/Windows-10-UWP-Applications-and-Taskbar/m-p/523086 and it outlined a solution of re-registering the UWP AppX packages for the built-in applications. We tried the fix in the DEV environment and it worked. It was then rolled out to the production setup.

Step 1: A PowerShell script to register the AppX packages

Get-AppxPackage -AllUsers *windowscalculator* | ForEach-Object {Add-AppxPackage -DisableDevelopmentMode -Register "$($_.InstallLocation)\AppXManifest.xml"}
Get-AppxPackage -AllUsers *windows.photos* | ForEach-Object {Add-AppxPackage -DisableDevelopmentMode -Register "$($_.InstallLocation)\AppXManifest.xml"}
Get-AppxPackage -AllUsers *edge* | ForEach-Object {Add-AppxPackage -DisableDevelopmentMode -Register "$($_.InstallLocation)\AppXManifest.xml"}

Step 2: Create a Dynamic Environment Manager Logon Task

We chose to put the PowerShell script in the UEM share, as the end-users have read access to it.

DEM - Logon Task
DEM-LogonTasks

I hope you will find this information useful if you encounter the issue. If you manage to tweak or improvise further on this solution, please don’t forget to keep me posted.

Thanks,
Aresh Sarkari

Script to create a read-only account for monitoring VMware Unified Access Gateway

23 Sep

We have been using VMware Unified Access Gateway (UAG) for quite a few years. To monitor the appliance using vROPS, other monitoring tools, or API scripts, you need a read-only monitoring account created in the console under “Account Settings”.

Account Settings - UAG
Read-only account for monitoring

In our deployment we have 14 UAG appliances (internal/external). Yes, we tunnel internal connections too. After the upgrade, we had to re-create the read-only account for API call monitoring on all 14 appliances. I wrote the following script to create the read-only account on a UAG server. Just change the IP to point it at another UAG and create the account there.

####################################################################
# Create read-only account in the VMware Unified Access Gateway Appliance
# for monitoring purposes using vROPS or API etc.
# Author - Aresh Sarkari (@askaresh)
# Version - V5.0
####################################################################


# Ignore UAG cert errors (self signed or 

add-type @"
    using System.Net;
    using System.Security.Cryptography.X509Certificates;
    public class TrustAllCertsPolicy : ICertificatePolicy {
        public bool CheckValidationResult(
            ServicePoint srvPoint, X509Certificate certificate,
            WebRequest request, int certificateProblem) {
            return true;
        }
    }
"@
[System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'


##API Call to make the initial connection to the UAG Appliance##

$Uri = "https://10.0.0.1:9443/rest/v1/config/adminusers/logAdminUserAction/LOGIN"
$Username = "admin"
$Password = "adminpassword"

$Headers = @{ Authorization = "Basic {0}" -f [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $Username,$Password))) }

Invoke-RestMethod -SessionVariable DaLogin -Uri $Uri -Headers $Headers


###API Call to create the user account with read-only access under VMware Unified Access Gateway##

$body = @{
  name = "UAG_vRops"
  password= "typeyourpassword"
  enabled=$true
  roles = @("ROLE_MONITORING")
  noOfDaysRemainingForPwdExpiry=0
} | ConvertTo-Json

$output = Invoke-RestMethod -WebSession $DaLogin -Method Put -Uri "https://10.0.0.1:9443/rest/v1/config/adminusers" -Body $body -ContentType "application/json"

Write-Output $output

GitHub https://github.com/askaresh/scripts/blob/master/uagreadonlyacct
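Since the same calls have to be repeated per appliance, the per-host pieces can be generated in a loop. The following is a minimal Python sketch that only builds the Basic authorization header, the JSON body, and the per-host URLs used in the script above. It does not send anything, and the function names and the returned dictionary layout are hypothetical. Note that the script above logs in first and re-uses the web session for the PUT.

```python
import base64
import json


def basic_auth_header(username, password):
    """Build the same Basic authorization header as the PowerShell script."""
    token = base64.b64encode(f"{username}:{password}".encode("ascii")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def monitoring_account_requests(hosts, new_user, new_password):
    """Describe the login and account-creation calls for each UAG host."""
    body = json.dumps({
        "name": new_user,
        "password": new_password,
        "enabled": True,
        "roles": ["ROLE_MONITORING"],
        "noOfDaysRemainingForPwdExpiry": 0,
    })
    return [
        {
            "login_url": f"https://{host}:9443/rest/v1/config/adminusers/logAdminUserAction/LOGIN",
            "create_url": f"https://{host}:9443/rest/v1/config/adminusers",
            "method": "PUT",
            "body": body,
        }
        for host in hosts
    ]


if __name__ == "__main__":
    print(basic_auth_header("admin", "adminpassword"))
    for request in monitoring_account_requests(["10.0.0.1", "10.0.0.2"], "UAG_vRops", "typeyourpassword"):
        print(request["method"], request["create_url"])
```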

I hope you will find this script useful to create the UAG read-only accounts, so you don’t have to create them manually on multiple appliances. If you enhance the script further or make it more creative, I hope you will share it back with me.

Thanks,
Aresh Sarkari

Unable to uninstall/upgrade VMware Horizon Client within the VMware App Volumes AppStack

22 Jul

We had a very long ongoing issue wherein we couldn’t uninstall or upgrade the VMware Horizon Client within the AppStack. We had successfully installed the Horizon Client within the AppStack. However, when it was time to upgrade to the latest version or uninstall, it would fail during a reboot with the following error.

Unknown HardError

We initially saw the issue on App Volumes 2.14. While we were troubleshooting for an extended period, we upgraded to App Volumes 2.18.1, and both versions exhibited the same failure during uninstall or upgrade.

Process to reproduce the error

  • Upgrade Horizon Client –> reboot –> hard error
  • Uninstall Horizon Client –> reboot –> hard error
  • Uninstall Horizon Client –> install Horizon Client –> reboot –> hard error
  • Upgrade Horizon Client –> complete provisioning without reboot –> completes successfully –> during the next update of the AppStack it crashes with a hard error
  • Uninstall Horizon Client –> complete provisioning without reboot –> completes successfully –> during the next update of the AppStack it crashes with a hard error

Environment Details

VMware Horizon 7.11
VMware App Volumes 2.18.1
VMware Dynamic Environment Manager 9.10
VMware Horizon Client 5.x

Process of elimination

  • Upgraded the Horizon Client to various 5.x versions to rule out any version-specific client issues
  • We didn’t have Antivirus running on the AppStack capturing template
  • We could build the AppStack from scratch with the newer version of Horizon Client but only upgrade/uninstall would fail
  • We were honestly running out of troubleshooting ideas

Resolution

After trying out all the usual steps, and to avoid re-creating the AppStack every single time during life cycle management, we opened a VMware GSS case, handled by Karan Ahuja (a very helpful support engineer), which ended up being worked on by the engineering team (Art Rothstein, a champ in the AV Eng Team). Note that quite a lot of logs and Procmon traces were exchanged from the problematic application capturing VM template. Finally, the fix was determined to be an AppStack snapvol.cfg exclusion. After putting this exclusion into the AppStack on the app capturing VM during provisioning, we could upgrade or uninstall the Horizon Client.

exclude_registry=\REGISTRY\MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileService
Path exclusion in AppStack snapvol.cfg

Disclaimer – Due to the nature of the issue and the time taken to resolve it, we decided to move the Horizon Client from the AppStack into the base image. However, the fix is validated and 100% working with the exclusion in place.

I hope you will find this information useful if you encounter the issue. A big thanks to Manivannan Arul, my teammate, for his continuous effort while troubleshooting with GSS over a period of 4+ months.

Thanks,
Aresh Sarkari

Black screen when re-connect to VMware Horizon virtual desktop

27 May

We had an issue after we upgraded our EUC stack, especially VMware App Volumes 2.14 to 2.18.1. Quite a few end-users started reporting a black screen when they tried to re-connect to their desktops after the original session launch. This covers re-connects after breaks, endpoint screen locks, next-working-day re-connections, etc.

EUC Environment Details:
VMware Horizon 7.11
VMware App Volumes 2.18.1
VMware Dynamic Environment Manager 9.10
VMware Horizon Client 5.x
VMware Workspace One 3.3

Process of elimination

  • If we re-created the writable volumes of the problematic end-users, the black screen issue would go away. This gave us a clue that the problem lay with VMware App Volumes Writable Volumes
  • No errors/failures observed within the VMware DEM/Horizon logs
  • Upgraded the Horizon Client to the latest 5.x version to rule out any client-related issues
  • We already had the necessary anti-virus exclusion based on VMware Antivirus Considerations in a VMware Horizon 7 Environment

Resolution
After trying out all the usual steps, and to avoid re-creating writable volumes for the problematic end-users, we opened a VMware GSS case, handled by Karan Ahuja (a very helpful support engineer), which ended up being worked on by the engineering team (Art Rothstein, a champ in the AV Eng Team). Note that quite a lot of logs, memory dumps, and Procmon traces were exchanged from the problematic VM using various remote gathering techniques. Finally, the fix was determined to be a writable volume snapvol.cfg exclusion. (In our case, the problem was caused by smss.exe using a copy of winlogon.exe that is on the writable volume.) After putting this exclusion into the writable volumes of all problematic end-users, they stopped seeing black screen issues upon re-connect.

exclude_path=%SystemRoot%\System32\winlogon.exe
Path exclusion in writable volumes snapvol.cfg

In this blog, I am not outlining the steps to add the snapvol.cfg exclusion, as my ex-colleague Daniel Bakshi explains how to do it step by step in a VMware blog post. I hope you will find this information useful if you encounter intermittent black screen issues.

Thanks,
Aresh Sarkari

Swagger-UI and Postman Collection for VMware Unified Access Gateway

6 May

I wanted to perform particular VMware Unified Access Gateway (UAG) tasks programmatically. Mark Benson gave me some guidance and introduced me to the Swagger-UI that is available within the product.

To access the Swagger-UI on UAG open the following URL within the browser and enter your username and password.

https://uagnameorip:9443/swagger-ui/index.html
Swagger-UI – UAG API Calls

One can do a lot within the Swagger-UI to make various GET, POST, PUT calls. However, my preferred tool is Postman. I needed a way to get everything in the Swagger-UI converted to Postman. Upon searching, I came across this method mentioned here.

To fetch all the swagger JSON output, go to this URL on the VMware UAG Appliance.

https://uagnameorip:9443/rest/swagger.json

We have two options here. #Option1 – Copy all the data from the webpage and paste it into Postman under Import – Paste Raw Text. You will have all the VMware Unified Access Gateway REST APIs listed. #Option2 – Paste the above URL into Postman under Import – Import from link (this didn’t work for me, maybe because authentication was required).

Postman – Import
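Before importing, it can be handy to see which operations the swagger JSON actually describes. The sketch below is a minimal Python example that walks the standard "paths" section of a Swagger 2.0 document and lists method, path, and summary. The sample document in it is a hypothetical, cut-down stand-in and is not the real UAG swagger.json.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def list_operations(swagger):
    """Return (METHOD, path, summary) tuples from a Swagger 2.0 document."""
    operations = []
    for path, item in sorted(swagger.get("paths", {}).items()):
        for method, detail in item.items():
            # Path items may also hold non-operation keys such as "parameters"
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path, detail.get("summary", "")))
    return operations


if __name__ == "__main__":
    # Hypothetical sample; load the real document with json.load() instead
    sample = {
        "swagger": "2.0",
        "paths": {
            "/v1/config/adminusers": {
                "get": {"summary": "List admin users"},
                "put": {"summary": "Create or update an admin user"},
            },
        },
    }
    for operation in list_operations(sample):
        print(operation)
```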

Please find attached the Postman export for the VMware Unified Access Gateway Appliance 3.9.1. (Note: I believe the Swagger-UI has been available from UAG 3.7 onwards.)

Postman – API Calls UAG

I hope you will find this post useful to start using the Swagger-UI and Postman collections with the UAG appliance. If you create interesting scripts or perform cool activities with the UAG appliance, I hope you will share them back with me.

Thanks,
Aresh Sarkari

Report all VMware App Volumes Writable Volumes with Status Disabled and Orphaned

22 Apr

Often within the App Volumes Manager, there are Writable Volumes that show up with Status “Orphaned”; essentially, that can be caused by Active Directory user accounts that have been disabled in AD.

Writable Status = Orphaned

There is also a Status called “Disabled” and that can be caused when an App Volumes administrator decides to disable the Writable Volumes.

Writable Status = Disabled

Now, if you have an enterprise environment with 1000’s of users, it’s hard to perform this check from the UI. I have created a script that reports on the “Orphaned” and “Disabled” statuses and sends you the output as a *.csv report on a daily/weekly basis, as per your needs.

####################################################################
# Get List of Writable Volumes from AppVolumes Manager for Status Disabled and Orphaned
# Author - Aresh Sarkari (@askaresh)
# Version - V2.0
####################################################################

# Run at the start of each script to import the credentials
$Credentials = IMPORT-CLIXML "C:\Scripts\Secure-Creds\SCred_avmgr.xml"
$RESTAPIUser = $Credentials.UserName
$RESTAPIPassword = $Credentials.GetNetworkCredential().Password


$body = @{
    username = "$RESTAPIUser"
    password = "$RESTAPIPassword"
}

Invoke-RestMethod -SessionVariable DaLogin -Method Post -Uri "https://avolmanager.askaresh.com/cv_api/sessions" -Body $body

$output = Invoke-RestMethod -WebSession $DaLogin -Method Get -Uri "https://avolmanager.askaresh.com/cv_api/writables" -ContentType "application/json"

# Build the report path once so the export and the email attachment use the same file
$reportpath = "D:\Aresh\Orphaned.Disabled-Writables.$(Get-Date -Format "yyyyMMddHHmm").csv"

$output.datastores.writable_volumes | Select-Object owner_name, owner_upn, title, status | Where-Object {[string]$_.status -match "Orphaned" -and $_.title -match "(disabled)"} | Export-Csv -NoTypeInformation -Append $reportpath

#send an email (provided the smtp server is reachable from where ever you are running this script)
$emailfrom = 'writablevolumes@askaresh.com'
$emailto = 'email1@askaresh.com', 'email2@askaresh.com'
$emailsub = 'Writable Volumes with status Orphaned and Disabled - Weekly'
$emailbody = 'Attached CSV file from App Volumes Manager. The attachment includes the API response for all the Writable Volumes which are Orphaned and Disabled in the UI'
$emailattach = $reportpath
$emailsmtp = 'smtp.askaresh.com'

Send-MailMessage -From $emailfrom -To $emailto -Subject $emailsub -Body $emailbody -Attachments $emailattach -Priority High -DeliveryNotificationOption OnFailure -SmtpServer $emailsmtp

GitHub – https://github.com/askaresh/scripts/blob/master/wrtiable-orph-disa
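The filtering step can also be tried offline against a saved API response. Here is a minimal Python sketch that picks writable volumes by status from the same response shape the script reads (datastores, then writable_volumes) and renders them as CSV. It is a variant rather than a port: it selects any volume whose status is in a list you pass, whereas the PowerShell filter above requires status “Orphaned” together with “(disabled)” in the title. The sample data and function names are hypothetical.

```python
import csv
import io

FIELDS = ["owner_name", "owner_upn", "title", "status"]


def select_writables(response, statuses=("Orphaned", "Disabled")):
    """Return the report fields for volumes whose status is in statuses."""
    volumes = response.get("datastores", {}).get("writable_volumes", [])
    wanted = {status.lower() for status in statuses}
    return [
        {field: volume.get(field, "") for field in FIELDS}
        for volume in volumes
        if str(volume.get("status", "")).lower() in wanted
    ]


def to_csv(rows):
    """Render the selected rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample response
    sample = {"datastores": {"writable_volumes": [
        {"owner_name": "User One", "owner_upn": "user1@example.com", "title": "vol1", "status": "Orphaned"},
        {"owner_name": "User Two", "owner_upn": "user2@example.com", "title": "vol2", "status": "Enabled"},
    ]}}
    print(to_csv(select_writables(sample)))
```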

Depending upon the output, you can have your service desk get in touch with the Active Directory team to remove the affected end-users from the groups entitled to App Volumes writable volumes, and then proceed with the clean-up of their writable volumes if there are no legal hold requirements.

I hope you will find this script useful to get a report of all writable volumes with status Orphaned and Disabled. If you enhance the script further or make it more creative, I hope you will share it back with me.

Thanks,
Aresh Sarkari