VCF 9.x Upgrade Stuck on vRNI / Aria Operations for Networks SSL Thumbprint Validation


During a VMware Cloud Foundation upgrade, you may hit a situation where the upgrade workflow fails on validation of the vRNI / Aria Operations for Networks certificate thumbprint.

Even after replacing the certificate directly on the vRNI appliance, clicking Retry in SDDC Manager may continue to fail with the old certificate thumbprint.

This can be confusing because the certificate on the vRNI side is already correct, but SDDC Manager still validates against the previous thumbprint.

Root Cause

The root cause is that SDDC Manager caches the SSL thumbprint either in its internal database, platformdb, or in the LCM / Domain Manager service memory when the upgrade task is first initialized.

As a result, even if the certificate is replaced on the vRNI / Aria Operations for Networks appliance, the Retry button does not automatically rediscover the new certificate thumbprint.

Instead, the retry operation may continue to use the old cached value.

To resolve this, the thumbprint stored in the SDDC Manager inventory database must be updated manually.

Warning:
This procedure modifies the internal SDDC Manager database. Use it only when you fully understand the impact. Always take a backup or snapshot of the SDDC Manager appliance before making manual database changes. In production environments, validate with VMware/Broadcom support first.


Step 1: Extract the New Certificate Thumbprint from vRNI

Log in to the SDDC Manager appliance via SSH.

Usually this means logging in as vcf and then switching to root:

su -

Now retrieve the SHA-256 fingerprint of the currently installed certificate on the vRNI / Aria Operations for Networks appliance:

echo -n | openssl s_client -connect <VRNI_FQDN_OR_IP>:443 2>/dev/null | openssl x509 -noout -fingerprint -sha256

Example output:

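sha256 Fingerprint=XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX

Copy only the thumbprint value (32 colon-separated hex bytes for SHA-256), without the prefix:

XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX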
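As a minimal sketch of this copy step, the helpers below strip the prefix and sanity-check the result before it goes anywhere near the database. The names extract_thumbprint and is_sha256_thumbprint are illustrative only and are not part of any VMware tooling; the check assumes a SHA-256 fingerprint is printed as 32 colon-separated hex bytes.

```shell
# Hypothetical helpers (not part of SDDC Manager tooling).

# Read `openssl x509 -noout -fingerprint -sha256` output on stdin and
# print only the value after "Fingerprint=".
extract_thumbprint() {
  sed -n 's/^[^=]*Fingerprint=//p' | head -n 1
}

# Succeed only if $1 is 32 colon-separated hex bytes (a SHA-256 digest).
is_sha256_thumbprint() {
  printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){31}[0-9A-Fa-f]{2}$'
}

# Usage against a live appliance (replace the placeholder first):
#   tp=$(echo -n | openssl s_client -connect <VRNI_FQDN_OR_IP>:443 2>/dev/null \
#        | openssl x509 -noout -fingerprint -sha256 | extract_thumbprint)
#   is_sha256_thumbprint "$tp" && echo "$tp"
```

A value that fails the second check (too short, or carrying a stray leading character) should not be written to the database.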

Step 2: Check the Current Thumbprint in SDDC Manager Database

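Connect to the SDDC Manager PostgreSQL database (the database name can differ between releases; psql -h localhost -U postgres -l lists the ones available):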

psql -h localhost -U postgres -d platformdb

Now locate the vRNI / Aria Operations for Networks resource record:

SELECT id, type, status, ssl_thumbprint
FROM resource
WHERE type LIKE '%VRNI%'
OR type LIKE '%ARIA%';

Identify the row that belongs to your vRNI / Aria Operations for Networks appliance.

You should see that the ssl_thumbprint column still contains the old thumbprint, for example:

EF:0B:A2:15:...

Step 3: Update the Stored Thumbprint

Update the resource record with the new thumbprint:

UPDATE resource
SET ssl_thumbprint='<NEW_THUMBPRINT>'
WHERE id='<COMPONENT_ID>';

Example:

UPDATE resource
SET ssl_thumbprint='=XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX'
WHERE id='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx';

Verify the change:

SELECT id, type, status, ssl_thumbprint
FROM resource
WHERE id='<COMPONENT_ID>';

Exit PostgreSQL:

\q
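Because a single stray character inside the quoted value (a leading =, for instance) would store a thumbprint that can never match, it can help to generate the statement rather than type it. The sketch below is one way to do that. build_thumbprint_update is a hypothetical helper; it assumes the resource table and ssl_thumbprint column used above, assumes the id is a UUID, and only prints SQL for review rather than executing anything.

```shell
# Hypothetical helper: print (not run) the UPDATE statement for review.
# $1 = new thumbprint, $2 = resource id (UUID) taken from the Step 2 query.
build_thumbprint_update() {
  tp=$1
  id=$2
  # Reject anything that is not 32 colon-separated hex bytes.
  printf '%s\n' "$tp" | grep -Eq '^([0-9A-Fa-f]{2}:){31}[0-9A-Fa-f]{2}$' || {
    echo "invalid thumbprint: $tp" >&2
    return 1
  }
  # Reject anything that is not a UUID, which also rules out quote characters.
  printf '%s\n' "$id" | grep -Eq '^[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$' || {
    echo "invalid id: $id" >&2
    return 1
  }
  printf "UPDATE resource SET ssl_thumbprint='%s' WHERE id='%s';\n" "$tp" "$id"
}
```

After reviewing the printed statement, it can be pasted into the psql session from Step 2.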

Step 4: Restart LCM and Domain Manager Services

SDDC Manager may still cache inventory data in memory, so restart the relevant services:

systemctl restart lcm
systemctl restart domainmanager

Wait a few minutes until the services are fully initialized.

You can monitor the LCM service log with:

tail -f /var/log/vmware/vcf/lcm/lcm.log
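Instead of waiting a fixed number of minutes, the restart can be followed by a small polling loop. This is a sketch under assumptions: wait_until is a made-up helper, and systemctl is-active only confirms that systemd considers the unit active, not that the application inside has finished initializing, so the log is still worth a look.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# $1 = max attempts, $2 = seconds between attempts, rest = command to run.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage on the SDDC Manager appliance:
#   wait_until 30 10 systemctl is-active --quiet lcm
#   wait_until 30 10 systemctl is-active --quiet domainmanager
```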

Step 5: Reset a Stuck Upgrade Task if Needed

In some cases, the upgrade task may remain stuck in an IN_PROGRESS state or the Retry button may stay unavailable.

If this happens, check the active execution tasks in the SDDC Manager database.

Connect again to PostgreSQL:

psql -h localhost -U postgres -d platformdb

Find tasks that are still marked as running:

SELECT id, status, action
FROM execution_task
WHERE status='IN_PROGRESS';

Identify the specific stuck task related to the failed upgrade validation.

Then manually mark it as failed:

UPDATE execution_task
SET status='FAILED'
WHERE id='<TASK_ID>';

Exit PostgreSQL:

\q

Step 6: Resume the Upgrade

Return to the SDDC Manager UI and refresh the upgrade page.

The upgrade workflow should now allow you to click Retry again.

This time, SDDC Manager should read the corrected thumbprint from the database, validate it against the current vRNI / Aria Operations for Networks certificate, and continue with the VCF 9.x upgrade.


Summary

If a VCF 9.x upgrade continues to fail on vRNI / Aria Operations for Networks certificate validation even after the certificate has been replaced, the issue may not be the certificate itself.

The problem can be caused by a stale SSL thumbprint cached in SDDC Manager.

The fix is to:

  1. Extract the new SHA-256 certificate thumbprint from vRNI.
  2. Update the corresponding ssl_thumbprint value in platformdb.
  3. Restart the lcm and domainmanager services.
  4. Reset the stuck execution task if required.
  5. Retry the upgrade from the SDDC Manager UI.

This is a useful recovery procedure when the UI retry mechanism continues to use stale inventory data instead of the actual certificate currently installed on the vRNI appliance.

Holodeck 9.0.2 with VCF 9.0.2.0 Stuck at Install-VcfInstallerBundles


While deploying VMware Cloud Foundation 9.0.2.0 with Holodeck 9.0.2, I hit an interesting issue during the bundle download phase.

The deployment did not fail with a clear error. Instead, it stalled indefinitely at:

Install-VcfInstallerBundles

At first glance, everything looked fine. The VCF Installer depot UI showed all required components as downloaded successfully. However, the Holodeck deployment kept waiting forever.

The root cause turned out to be a hardcoded bundle count check inside the Holodeck PowerShell module.


Environment

The issue was observed with the following setup:

Holodeck:       9.0.2
HoloRouter OVA: 9.0.2.0424
VCF Installer:  9.0.2.0
Target VCF:     9.0.2.0
Deployment:     Full VCF, ManagementOnly
Depot:          Online depot

Important detail: this was full VCF, not VVF.


Symptom

During deployment, the log repeatedly showed:

SddcMgmtDomain[<pid>]: [INFO] Received Bundles. Checking if all VCF 9 bundles are available
SddcMgmtDomain[<pid>]: [INFO] Didn't receive all bundles. Received 8 bundle details. Trying again after 10 seconds

This message repeated every 10 seconds.
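To confirm you are looking at this specific stall, the count can be pulled out of the log text directly. The helper below is a sketch: last_received_bundle_count is a made-up name, and it relies only on the message wording quoted above. Feed it whatever log output your deployment produces.

```shell
# Hypothetical helper: read log lines on stdin and print the bundle count
# from the most recent "Didn't receive all bundles" message, if any.
last_received_bundle_count() {
  sed -n "s/.*Didn't receive all bundles\. Received \([0-9][0-9]*\) bundle details.*/\1/p" | tail -n 1
}
```

A printed value higher than the count the module expects, repeated over many iterations, points to the count mismatch described in this post rather than to a slow download.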

At the same time, the VCF Installer UI showed all visible components as successfully downloaded:

SDDC Manager 9.0.2.0
VMware Cloud Foundation Automation 9.0.2.0
VMware Cloud Foundation Operations 9.0.2.0
VMware Cloud Foundation Operations Collector 9.0.2.0
VMware Cloud Foundation Operations fleet management 9.0.2.0
VMware NSX 9.0.2.0
VMware vCenter 9.0.2.0

So from the UI perspective, everything looked complete. However, Holodeck was still waiting.


Root Cause

The problematic logic is inside the Holodeck PowerShell module on the deployed HoloRouter:

/root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1

The affected function is:

Install-VcfInstallerBundles

Holodeck queries the VCF Installer API:

https://${HostName}/v1/bundles/download-status?imageType=INSTALL

Then it filters bundles matching the selected VCF version:

}elseif($Version -eq "9.0.2.0"){
$vcf9_bundle_details = $bundle_details.elements | Where-Object {$_.version -match "9\.0\.2\.0\.*"}
}

The problem is the final count check:

elseif($vcf9_bundle_details.count -eq 7){
Write-Log -Message "Received all Bundle Details"
$bundle_api_response = $true
}
else{
Write-Log -Message "Didn't receive all bundles. Received $($vcf9_bundle_details.count) bundle details. Trying again after 10 seconds"
Start-Sleep -Seconds 10
}

For VCF 9.0.2.0, the API returns 8 matching bundle entries, not 7.

That means this condition never becomes true:

$vcf9_bundle_details.count -eq 7

Holodeck receives 8 bundles, but waits for exactly 7.

Result: an infinite loop.


Why Does the API Return 8 Bundles?

In VCF 9.0.2.0, the bundle structure changed compared to earlier VCF 9 versions.

The depot UI shows 7 visible rows, but the API response contains 8 entries matching:

9.0.2.0.*

The 9.0.2 BOM appears to split some components more granularly, for example around VCF Operations, Operations Collector, and Fleet Management. The UI abstracts this nicely, but the API exposes one additional bundle-level entry.

The important part is this:

VCF Installer UI: 7 visible downloaded components
VCF Installer API: 8 matching bundle entries
Holodeck logic: expects exactly 7

That mismatch is enough to block the deployment.


Workaround

Edit the Holodeck module directly on the HoloRouter.

Change this:

}elseif($vcf9_bundle_details.count -eq 7){

To this:

}elseif($vcf9_bundle_details.count -ge 7){

One-liner:

sed -i 's/\$vcf9_bundle_details\.count -eq 7/$vcf9_bundle_details.count -ge 7/' \
/root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1

This changes the check from “exactly 7 bundles” to “at least 7 bundles”.
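To see why the relaxed comparison unblocks the loop, here is the same pair of checks mimicked in shell, whose test operators happen to share the -eq and -ge names with PowerShell. It illustrates the comparison only; ready_exact and ready_at_least are made-up names, not functions from the Holodeck module.

```shell
# Illustration only: mimic the exact-count check and the relaxed check.
ready_exact()    { [ "$1" -eq 7 ]; }   # original comparison
ready_at_least() { [ "$1" -ge 7 ]; }   # patched comparison

for count in 7 8; do
  if ready_exact "$count"; then e=yes; else e=no; fi
  if ready_at_least "$count"; then a=yes; else a=no; fi
  echo "count=$count exact=$e at_least=$a"
done
# prints:
#   count=7 exact=yes at_least=yes
#   count=8 exact=no at_least=yes
```

With 8 entries the exact check can never pass, while the relaxed one passes for both 7 and 8.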


Restart the Running PowerShell Session

The currently running pwsh process already has the old function loaded in memory.

After patching the file, kill the running PowerShell process:

pkill -9 -f pwsh

Then start a fresh PowerShell session and resume the deployment:

Import-HoloDeckConfig -ConfigID <id>

New-HoloDeckInstance -Version 9.0.2.0 -InstanceID <same-as-before> <original flags>

For example:

New-HoloDeckInstance -Version 9.0.2.0 -InstanceID 1 -ManagementOnly

Use the same flags you used in the original deployment.


What Happens After the Patch?

After applying the workaround, Holodeck resumes at the existing deployment state.

In my case, the state engine resumed at:

Install-VcfInstallerBundles

The patched function immediately accepted the 8 returned bundle entries and logged:

Received all Bundle Details

The deployment then moved forward.

Some bundle download calls may return:

BUNDLE_DOWNLOAD_ALREADY_DOWNLOADED

That is expected because the bundles are already present in the depot. The existing try/catch handling allows the phase to complete quickly.

After that, the deployment advanced to the management-domain phase.


Suggested Proper Fix

The quick fix is:

- }elseif($vcf9_bundle_details.count -eq 7){
+ }elseif($vcf9_bundle_details.count -ge 7){

A version-specific fix would also work, for example:

VCF 9.0.0.0 / 9.0.1.0 -> expect 7
VCF 9.0.2.0 -> expect 8

However, that would likely reintroduce the same type of bug in a future VCF BOM revision.

A better long-term approach would be to avoid a hardcoded count completely and validate the actual bundle download state instead.

For example, Holodeck should check that all required bundles for the selected deployment type and version are present and successfully downloaded, instead of assuming that the bundle count is always static.
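The idea can be sketched in a few lines of shell. Everything about the data format here is an assumption made for the example: the input is a flat list of "component|status" lines, and SUCCESSFUL is a placeholder status string. Neither is taken from the real API response, and all_required_downloaded is a hypothetical name.

```shell
# Sketch only. stdin: one "<component>|<status>" line per bundle
# (a made-up flat format, not the real API response).
# args: the component names this deployment type requires.
all_required_downloaded() {
  statuses=$(cat)
  for required in "$@"; do
    # -F literal match, -x whole line, -q quiet
    printf '%s\n' "$statuses" | grep -Fqx "$required|SUCCESSFUL" || return 1
  done
  return 0
}
```

A check of this shape does not care whether the depot returns 7, 8, or 20 entries, only that each required one is present and complete.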

Still, as an immediate workaround, changing -eq 7 to -ge 7 is enough to unblock the deployment.


Important Notes

This is an unofficial workaround.

The affected PowerShell module is not part of the public Holodeck documentation repository. It is bundled inside the HoloRouter OVA distributed through the Broadcom Support Portal.

Before editing vendor-supplied files, it is always a good idea to make a backup:

cp /root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1 \
/root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1.bak

Then apply the patch.
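Backup, patch, and verification can be combined into one step. The function below is a sketch, assuming GNU sed (for -i without a suffix argument); patch_bundle_check is a made-up name, and the path is passed in so it can be tried on a scratch copy first.

```shell
# Hypothetical wrapper: back up a module file, apply the -eq 7 -> -ge 7 patch,
# and confirm the old comparison is gone. $1 = path to SddcMgmtDeployment.psm1.
patch_bundle_check() {
  file=$1
  # Keep an untouched copy next to the original (never overwrite an existing one).
  [ -e "$file.bak" ] || cp "$file" "$file.bak"
  # \$ and \. force a literal dollar sign and dot in the sed pattern.
  sed -i 's/\$vcf9_bundle_details\.count -eq 7/$vcf9_bundle_details.count -ge 7/' "$file"
  # Fail if any exact-count check survived the edit.
  ! grep -Fq '$vcf9_bundle_details.count -eq 7' "$file"
}
```

If the function returns non-zero, restore the .bak copy and inspect the file by hand before retrying.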


Summary

This is a good example of a small hardcoded assumption causing a deployment to stall without an obvious fatal error.

The VCF Installer API was returning valid data. The depot was populated. The UI showed the bundles as successfully downloaded. But Holodeck was waiting for an exact number of bundle entries that no longer matched the VCF 9.0.2.0 BOM.

For affected Holodeck 9.0.2 users deploying VCF 9.0.2.0, the key symptom is:

Didn't receive all bundles. Received 8 bundle details. Trying again after 10 seconds

If you see this message, check the bundle count logic in:

SddcMgmtDeployment.psm1

The workaround is small, but it can save a lot of troubleshooting time.


Quick Reference

cp /root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1 \
/root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1.bak

sed -i 's/\$vcf9_bundle_details\.count -eq 7/$vcf9_bundle_details.count -ge 7/' \
/root/.local/share/powershell/Modules/HoloDeck/Modules/SddcMgmtDeployment.psm1

pkill -9 -f pwsh

Then resume:

Import-HoloDeckConfig -ConfigID <id>
New-HoloDeckInstance -Version 9.0.2.0 -InstanceID <same-as-before> <original flags>

KB 406901: Fix “Certificate authorities update failed” in the VCF Management CA Wizard (VCF 9.0)

When integrating a Microsoft Certificate Authority (CA) with VMware Cloud Foundation (VCF) Operations / Fleet Management in VCF 9.0, you may hit a frustrating blocker: the “Configure Certificate Authority for VCF Management” wizard fails with:

“Certificate authorities update failed”

This is documented in Broadcom KB 406901 and, importantly, it’s not always a connectivity or permissions problem—it can be a password character parsing issue.


What you’ll see

UI symptom

In the Configure Certificate Authority for VCF Management wizard, the validation/update step fails with:

  • Certificate authorities update failed

Log symptom (Fleet Management / VCF Operations appliance)

On the VCF Operations appliance, you’ll typically find a 401 Unauthorized in:

  • /var/log/vrlcm/vmware_vrlcm.log

Example (as shown in the KB):

  • Exception occurred while trying to validate Microsoft CA
  • HttpClientErrorException$Unauthorized: 401 Unauthorized
  • 401 - Unauthorized: Access is denied due to invalid credentials.

At first glance, this looks like wrong credentials or insufficient permissions. But KB 406901 highlights a very specific trigger.


Root cause (the “gotcha”)

This is a known issue with special characters in the CA service account password, specifically:

  • # or &

Even if the username/password are correct, the wizard’s CA validation request can fail in a way that surfaces as a 401 Unauthorized.


Resolution / Workaround (what to do now)

1) Reset the service account password

Change the Microsoft CA service account password to a value that does NOT contain:

  • #
  • &

Use a “safe” password character set (letters + numbers is the simplest) to avoid re-triggering the issue.
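A candidate password can be screened for the two characters before it is set on the service account. This is a small sketch; has_problem_chars is a made-up name, and it checks only for the # and & characters named in the KB.

```shell
# Hypothetical pre-check: succeed if the candidate password contains
# one of the characters named in the KB (# or &).
has_problem_chars() {
  case $1 in
    *'#'* | *'&'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (read the candidate without echoing it to the terminal):
#   stty -echo; read -r candidate; stty echo
#   if has_problem_chars "$candidate"; then echo "choose another password"; fi
```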

2) Re-run (or re-save) the CA configuration in the wizard

Go back to the Configure Certificate Authority for VCF Management wizard, enter the updated credentials, and run the validation/update again.

Link: Configure Certificate Authority for VCF Management fails with error, “Certificate authorities update failed”

Highlights from VMware Explore 2024 Barcelona

General Session Insights:

  • Private Cloud as the Future of Innovation: Broadcom’s President and CEO, Hock Tan, emphasized that private cloud has become the foundation for enterprise innovation.
  • Balancing AI with Compliance: Chris Wolf highlighted the importance of aligning AI advancements with organizational privacy and compliance needs.
  • Community Acknowledgments: Joe Baguley celebrated the vibrant VMware community, from 1,600+ vExperts to 150,000 VMUG members, for their global contributions.
  • Agile IT for Business Success: Paul Turner emphasized the need for IT agility to rapidly deliver applications and services, crucial for achieving business goals.
  • VeloRAIN Architecture: Sanjay Uppal introduced the VeloRAIN (Robust AI Networking) architecture, leveraging AI/ML to enhance distributed AI workloads’ performance and security.

For more details, check the general session recap.

Notable Sessions and Presentations:

  • Distributed Security Simplified: Dive into the intricacies of distributed security in VMware Cloud Foundation.
  • VMware vSAN ESA: Explore how VMware vSAN ESA serves as a robust storage platform for VMware Cloud Foundation.
  • Demystifying DPUs and GPUs: Understand the role of DPUs and GPUs in VMware Cloud Foundation for advancing AI and data workloads.
  • AI Without GPUs: Discover innovative ways to harness CPU power for AI workloads in GPU-limited environments.
  • Data Unlocking with VMware and NVIDIA: Broadcom and NVIDIA offer deep insights into unlocking data potential through AI-powered solutions.
  • VMware Fusion and Workstation: Exciting news: VMware Fusion and Workstation are now available for free.

For a complete list of VMware Explore 2024 Barcelona presentations, visit this link.

An inspiring evening of networking and meaningful conversations at the Community Leadership Reception at Explore in Barcelona.
The VMware Community truly is the heart of Explore.
It was an honor to meet CEO Hock Tan and engage with incredible leaders like Corey Romero, the #vExpert leader, and Josef Zach, our #VMUG Czech leader.

VMware Cloud Foundation (VCF) Brownfield Deployments

VMware Cloud Foundation (VCF) provides a unified platform for managing hybrid clouds, but the deployment process differs between Greenfield (new) and Brownfield (existing) environments. Brownfield deployment involves integrating pre-existing infrastructure into the VCF framework.

Preparing to Use the VCF Import Tool

The VCF Import Tool is essential for transitioning existing infrastructure into the VCF framework. Here’s a step-by-step guide to preparing the tool:

  1. Download the Necessary Files:
    • SDDC Manager OVA: The foundation for managing VCF.
    • VCF Import Tool: Enables import and integration of existing infrastructure.
    • NSX Install Bundle: Configures the networking components for VCF.
  2. Deploy SDDC Manager:
    • This step is necessary for “convert” use cases to establish centralized management within VCF.
  3. Extract the Import Tool:
    • Transfer and configure the import scripts within the SDDC Manager.
  4. Copy NSX Bundle:
    • Ensure the NSX configuration is uploaded for seamless network integration.

Convert Workflow: Transitioning Infrastructure to VCF

The Convert Workflow addresses the challenge of adapting existing environments to align with VCF’s architecture. Follow these steps:

  1. Verify Prerequisites:
    • Confirm that SDDC Manager is running version 5.2 or later.
    • Ensure all required files (Import Tool, NSX bundles) are uploaded.
  2. Run Pre-Check Scripts:
    • Validate the current environment using the Import Tool’s pre-check capabilities. This step identifies configuration issues or incompatibilities.
  3. Create NSX JSON:
    • Generate a JSON file to map the existing network configurations into VCF’s NSX environment.
  4. Convert Management Domain:
    • This final step transitions the management domain to align with VCF’s integrated control and automation.

Import Workflow: Integrating Existing Components

For specific components or domains, the Import Workflow provides a framework to incorporate them into VCF:

  1. Check Prerequisites:
    • Confirm readiness by ensuring the infrastructure meets the required configurations.
  2. Generate NSX JSON:
    • Map existing NSX configurations into a JSON format suitable for VCF integration.
  3. Import Workload Domains:
    • Import and integrate vSphere and NSX components into the VCF ecosystem.

Sync Workflow: Maintaining Infrastructure Alignment

The Sync Workflow ensures continued alignment between the existing infrastructure and VCF:

  1. Verify Prerequisites:
    • Confirm that SDDC Manager is operational and all required scripts are present.
  2. Sync Workload Domain:
    • Synchronize the workload domains with VCF’s management systems, ensuring consistency and reliability.

VCF Import Tool Options and Parameters

Below is an overview of the key actions and parameters available in the VCF Import Tool:

1. Help and Version Commands

  • -h, --help
    Displays the help menu for the VCF Import Tool, outlining available commands and their usage.
  • -v, --version
    Shows the current version of the VCF Import Tool.

2. Core Actions for Brownfield Deployments

  • convert
    Converts an existing vSphere infrastructure into a management domain within SDDC Manager.
  • check
    Validates if a vCenter is suitable for import as a workload domain in SDDC Manager.
  • import
    Imports an existing vCenter as a VI workload domain into SDDC Manager.

3. Sync and Deployment Operations

  • sync
    Synchronizes the configuration of an imported VI workload domain, or of a workload domain deployed from SDDC Manager, with SDDC Manager. This helps manage configuration drift between vCenter Server and SDDC Manager.
  • deploy-nsx
    Deploys NSX Manager as a standalone operation. This is useful for preparing networking configurations for workload domains.
  • precheck
    Runs validation checks on a vCenter to identify any potential issues before starting the import or conversion process.

Demystifying DPUs and GPUs in VMware Cloud Foundation

At VMware Explore EU 2024, the session “Demystifying DPUs and GPUs in VMware Cloud Foundation” provided deep insights into how these advanced technologies are transforming modern data centers. Presented by Dave Morera and Peter Flecha, the session highlighted the integration and benefits of Data Processing Units (DPUs) and Graphics Processing Units (GPUs) in VMware Cloud Foundation (VCF).

Key Highlights:

  1. Understanding DPUs:
    • Offloading and Acceleration: DPUs enhance performance by offloading network and communication tasks from the CPU, allowing more efficient resource usage and better performance for data-heavy operations.
    • Enhanced Security: By isolating security tasks, DPUs contribute to a stronger zero-trust security model, essential for protecting modern cloud environments.
    • Dual DPU Support: This feature offers high availability and increased network offload capacity, simplifying infrastructure management and boosting resilience.
  2. Leveraging GPUs:
    • Accelerated AI and ML Workloads: GPUs in VMware environments significantly speed up data-intensive tasks like AI model training and inference.
    • Optimized Resource Utilization: VMware’s vSphere enables efficient GPU resource sharing through virtual GPU (vGPU) profiles, accommodating various workloads, including graphics, compute, and machine learning.
  3. Distributed Services Engine:
    • This engine simplifies infrastructure management and enhances performance by integrating DPU-based services, creating a more secure and efficient data center architecture.

Getting started with VCF 4.0 Part 3 – vSphere with Kubernetes in a Workload Domain

At this point, we have a fully configured workload domain which includes an NSX-T Edge deployment. Check here for the previous VCF 4.0 deployment steps. We are now ready to go ahead and deploy vSphere with Kubernetes, formerly known as Project Pacific. Via SDDC Manager in VMware Cloud Foundation 4.0, we ensure that an NSX-T Edge is available, and we also ensure that the Workload Domain is sufficiently licensed to enable vSphere with Kubernetes. Disclaimer: “To be clear, this post is based…


VMware Social Media Advocacy

Getting started with VCF 4.0 Part 2 – Commission hosts, Create Workload Domain, Deploy NSX-T Edge

Now that a VCF 4.0 Management Domain has been deployed, we can move on to creating our very first VCF 4.0 Virtual Infrastructure Workload Domain (VI WLD). We will require a VI WLD with an NSX-T Edge cluster before we can deploy Kubernetes on vSphere (formerly known as Project Pacific). Not too much has changed in the WLD creation workflow since version 3.9. We still have to commission ESXi hosts before we can create the WLD. But something different to previous versions of VCF is that today in…



Getting started with VMware Cloud Foundation (VCF) 4.0

On March 10th, VMware announced a range of new and updated products and features. One of these was VMware Cloud Foundation (VCF) version 4.0. In the following series of blogs, I am going to show you the steps to deploy VCF 4.0. We will begin with the deployment of a Management Domain. Once this is complete, we will commission some additional hosts and build our first workload domain (WLD). After that, we will deploy an NSX-T 3.0 Edge Cluster to our Workload Domain. The great news here is that…

