Are your CPUs memory-starved while your infrastructure struggles with underutilization and growing costs? Enter Memory Tiering with NVMe — a groundbreaking feature in vSphere 9.0 that promises up to 40% lower TCO by intelligently managing your memory resources.
What Is Memory Tiering?
Memory tiering allows ESXi to use NVMe devices as a secondary memory tier, extending beyond traditional DRAM. By classifying memory pages as hot, warm, cold, or very cold, vSphere can dynamically move less frequently used pages to NVMe-backed memory. This unlocks better VM consolidation, more predictable performance, and optimized CPU usage.
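As a mental model, the hot/warm/cold classification above can be sketched in Python. This is purely illustrative: the thresholds, names, and data structures are assumptions for the sketch, not ESXi internals.

```python
from dataclasses import dataclass

# Temperature buckets, hottest to coldest, as described above.
TIERS = ["hot", "warm", "cold", "very_cold"]

@dataclass
class Page:
    idle_seconds: float  # time since the page was last accessed

def classify(page: Page) -> str:
    """Bucket a page by how long it has been idle (thresholds are made up)."""
    if page.idle_seconds < 10:
        return "hot"
    if page.idle_seconds < 60:
        return "warm"
    if page.idle_seconds < 600:
        return "cold"
    return "very_cold"

def placement(page: Page) -> str:
    """Hot/warm pages stay in DRAM; cold and very cold pages move to NVMe."""
    return "DRAM" if classify(page) in ("hot", "warm") else "NVMe"
```

The point of the sketch is the policy shape: placement is a pure function of access recency, so frequently touched pages never leave DRAM.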
Key Benefits
Cost Efficiency: Offloads cold pages from expensive DRAM to more affordable NVMe.
Better Utilization: Frees up to 30% of CPU cores for actual workloads.
Advanced Observability: Gain detailed visibility into DRAM and NVMe usage.
Resilient Architecture: Supports RAID, vMotion, DRS, and encryption at both VM and host level.
Who Should Use It?
Ideal for general workloads and tiered VMs, but not supported for latency-sensitive or passthrough-based VMs. Ensure your NVMe meets Broadcom’s vSAN compatibility requirements and configure the DRAM:NVMe ratio wisely (default is 1:1).
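For sizing, the DRAM:NVMe ratio translates directly into addressable capacity. A tiny sketch (hypothetical helper, using the default 1:1 ratio mentioned above):

```python
def tiered_capacity_gb(dram_gb: float, nvme_ratio: float = 1.0) -> float:
    """Total addressable memory with NVMe tiering enabled.

    nvme_ratio is the NVMe:DRAM ratio; the default is 1:1, so a host
    with 512 GB DRAM exposes roughly 1 TB of tiered memory.
    """
    return dram_gb * (1 + nvme_ratio)

# A host with 512 GB DRAM at the default 1:1 ratio:
print(tiered_capacity_gb(512))  # 1024.0
```

A more conservative ratio (e.g. 1:0.5) trades capacity for a smaller share of workload pages landing on the slower tier.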
Summary
Memory tiering isn’t just a cool buzzword — it’s a strategic shift that aligns your infrastructure with modern performance and cost demands. Whether you’re scaling your VDI environment or looking to cut memory costs without compromising on performance, NVMe Memory Tiering in vSphere 9.0 is a game changer.
ESXCLI Commands for NVMe Memory Tiering – Commands Recap
Check maintenance mode: esxcli system maintenanceMode get
List storage devices: esxcli storage core device list
Create NVMe tier device: esxcli system tierdevice create -d <device>
List tier devices: esxcli system tierdevice list
Enable kernel memory tiering: esxcli system settings kernel set -s MemoryTiering -v TRUE
Verify tiering status: esxcli system settings kernel list -o MemoryTiering
Are you looking to maximize AI/ML performance in your virtualized environment? At VMware Explore 2025, I attended a compelling session — INVB1158LV: Accelerating AI Workloads: Mastering vGPU Management in VMware Environments — that unpacked how to effectively configure and scale GPUs for AI workloads in vSphere.
This blog post shares key takeaways from the session and outlines how to use vGPU, MIG, and Passthrough to achieve optimal performance for AI inference and training on VMware Cloud Foundation 9.0.
vGPU Configuration Options in VMware vSphere
🔹 1. DirectPath I/O (Passthrough)
A dedicated GPU is assigned to a single VM or containerized workload.
Ideal for maximum performance and full GPU access (e.g., LLM training).
No sharing or resource fragmentation.
🔹 2. NVIDIA vGPU – Time Slicing Mode
Shares one physical GPU across multiple VMs.
Each VM gets 100% of GPU cores for a slice of time, while memory is statically partitioned.
Supported on all NVIDIA GPUs.
Useful for efficient GPU sharing, especially for model inference and dev/test setups.
✅ Example profiles: grid_a100-8c, grid_a100-4-20c
🔹 3. Multi-Instance GPU (MIG)
Available on NVIDIA Ampere & Hopper (e.g., A100, H100).
Splits GPU into isolated hardware slices (compute + memory).
Offers deterministic performance and better isolation.
Best for multi-tenant AI inference, production-grade deployments.
✅ Example profiles: MIG 1g.5gb, MIG 2g.10gb, MIG 3g.20gb
✅ Assignable via vSphere UI with profiles like grid_a100-3-20c
Time Slicing vs. MIG – When to Use What?
Time Slicing: model inference, dev/test environments (time-shared)
MIG: production inference, multitenancy (spatial, hardware-partitioned)
Passthrough: maximum performance for a single workload (not shared)
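The decision logic behind these modes can be condensed into a small helper. The heuristics are illustrative only, distilled from the descriptions above, not an NVIDIA or VMware API:

```python
def pick_gpu_mode(workload: str, tenants: int, gpu_supports_mig: bool) -> str:
    """Suggest a GPU sharing mode (illustrative heuristics only).

    - Single-tenant training -> Passthrough (full GPU, maximum performance)
    - Multi-tenant production inference -> MIG, if the GPU supports it
      (Ampere/Hopper, e.g. A100/H100)
    - Everything else -> time-sliced vGPU (works across NVIDIA GPUs)
    """
    if tenants == 1 and workload == "training":
        return "passthrough"
    if workload == "inference" and tenants > 1 and gpu_supports_mig:
        return "mig"
    return "time-slicing"
```

Note the asymmetry: MIG requires specific hardware, so the fallback is always time slicing rather than the other way around.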
Smarter vMotion for AI Workloads in VCF 9.0
One of the standout improvements presented during session INVB1158LV was the vMotion optimization for VMs using vGPUs. With vSphere 8.0 U3 and VMware Cloud Foundation 9.0, the way vMotion handles GPU memory has been completely reengineered to minimize downtime (stun time) during live migration.
Instead of migrating all GPU memory during the VM stun phase, 70% of the vGPU cold data is now pre-copied in the pre-copy stage, and only the final 30% is checkpointed during stun. This greatly accelerates live migration even for massive LLM workloads running on multi-GPU systems.
📊 Example results with Llama 3.1 models:
Migrating a VM using 2×H100 GPUs (144 GB vGPU memory) saw stun time drop from 24.5s to just 6.3s.
Migrating a large model on 8×H100 (576 GB) now completes in 21s, compared to 325s for a power-off-and-reload approach — that’s a 15× improvement.
These enhancements make zero-downtime AI infrastructure upgrades and scaling possible, even for large language model deployments.
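The 70/30 split above implies most vGPU memory moves while the VM is still running, so stun time scales with only the remaining fraction. A back-of-envelope model (the transfer rate is an assumption for illustration, not a measured VMware figure):

```python
def estimated_stun_seconds(vgpu_mem_gb: float,
                           checkpoint_gbps: float = 6.0,
                           stun_fraction: float = 0.30) -> float:
    """Rough vMotion stun-time estimate for a vGPU VM.

    With VCF 9.0, ~70% of cold vGPU data is pre-copied during the
    pre-copy stage, so only stun_fraction (~30%) must be checkpointed
    while the VM is stunned. checkpoint_gbps is a hypothetical
    effective transfer rate chosen for the example.
    """
    return (vgpu_mem_gb * stun_fraction) / checkpoint_gbps

# 2x H100 with 144 GB of vGPU memory at the assumed rate gives ~7.2 s,
# in the same ballpark as the 6.3 s reported in the session.
```

The model also shows why the old approach was so slow: checkpointing 100% of 144 GB at the same rate would take roughly 24 s, matching the pre-9.0 figure quoted above.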
I had the pleasure of attending the excellent session “Deploying Minimal VMware Cloud Foundation 9.0 Lab” by Alan Renouf and William Lam at VMware Explore 2025. It was packed with practical advice, hardware insights, and field-tested tips on how to stand up a fully functional VCF environment—even on a tight budget.
Whether you’re a home lab enthusiast, enterprise architect, or just VCF-curious, here’s a recap of the key takeaways.
Key Changes: VCF 5.x vs VCF 9.x
VCF 5.x:
Required 4+ ESXi hosts
Monolithic installer
vSAN required
3-node NSX cluster
10GbE NICs mandatory
VCF 9.x:
More modular design
Only 2–3 ESXi hosts required
1 x 10GbE NIC sufficient
Support for singleton appliances
Flexible storage (vSAN ESA, FC, NFS)
VCF 9.0 Tips & Tricks (with real CLI guidance)
Here’s the juicy part—real-world deployment tips and overrides:
1. Minimum ESXi Host Requirements
For vSAN/FC: 3 ESXi hosts
For NFS: 2 ESXi hosts
⚠️ You can install VCF Installer + SDDC Manager even on a single ESXi host (great for nested labs!)
This setup is small, powerful, and flexible enough for a complete VCF 9.0 deployment.
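The host minimums above are easy to get wrong when planning hardware, so here they are as a small lookup helper (hypothetical code, values taken straight from the session notes):

```python
# Minimum ESXi host counts per principal storage type in VCF 9.x,
# per the session: vSAN and FC need 3 hosts, NFS only 2.
MIN_HOSTS = {"vsan": 3, "fc": 3, "nfs": 2}

def min_hosts(storage: str) -> int:
    """Return the minimum supported ESXi host count for a storage choice."""
    try:
        return MIN_HOSTS[storage.lower()]
    except KeyError:
        raise ValueError(f"unknown storage type: {storage}") from None
```

(As noted above, a single host is still enough for the VCF Installer plus SDDC Manager in a nested lab, even though it is below these supported minimums.)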
Deployment Walkthrough – TL;DR
Here’s the summarized 8-step flow:
Install ESXi (kickstart from USB)
Deploy VCF Installer VM
Connect to Offline Depot
Run Installer with JSON
Configure vSAN ESA
Deploy vCenter
Update Storage Policies
Deploy SDDC Manager, NSX, Fleet Manager, Automation, etc.
Summary
This session truly showcased how far VCF has come in terms of flexibility and accessibility. More info: VMware Cloud Foundation (VCF) 9.x in a Box. All trademarks belong to their respective owners.