Unlocking Memory Efficiency with NVMe Memory Tiering in vSphere 9.0

Are your CPUs memory-starved while your infrastructure struggles with underutilization and growing costs? Enter Memory Tiering with NVMe — a groundbreaking feature in vSphere 9.0 that promises up to 40% lower TCO by intelligently managing your memory resources.

What Is Memory Tiering?

Memory tiering allows ESXi to use NVMe devices as a secondary memory tier, extending beyond traditional DRAM. By classifying memory pages as hot, warm, cold, or very cold, vSphere can dynamically move less frequently used pages to NVMe-backed memory. This unlocks better VM consolidation, more predictable performance, and optimized CPU usage.
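
To make the hot/warm/cold idea concrete, here is a toy sketch. It is not ESXi's actual algorithm, which this post does not describe. It simply buckets pages by how long ago they were last touched. The thresholds and the names `classify_page` and `demotion_candidates` are hypothetical and chosen purely for illustration.

```python
# Toy illustration only: these thresholds are made up, and ESXi's real
# page-classification logic is not described in this post.
HOT_MAX_AGE_S = 1.0
WARM_MAX_AGE_S = 30.0
COLD_MAX_AGE_S = 300.0


def classify_page(seconds_since_last_access):
    """Return 'hot', 'warm', 'cold', or 'very cold' for a single page."""
    if seconds_since_last_access < 0:
        raise ValueError("age cannot be negative")
    if seconds_since_last_access <= HOT_MAX_AGE_S:
        return "hot"
    if seconds_since_last_access <= WARM_MAX_AGE_S:
        return "warm"
    if seconds_since_last_access <= COLD_MAX_AGE_S:
        return "cold"
    return "very cold"


def demotion_candidates(page_ages):
    """Return the page IDs whose class suggests they could move to the NVMe tier."""
    return [
        page_id
        for page_id, age in page_ages.items()
        if classify_page(age) in ("cold", "very cold")
    ]
```

With these made-up thresholds, a page untouched for 45 seconds counts as cold and one untouched for 15 minutes as very cold, so both would be candidates for the NVMe tier while recently used pages stay in DRAM.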

Key Benefits

  • Cost Efficiency: Offloads cold pages from expensive DRAM to more affordable NVMe.
  • Better Utilization: Frees up to 30% of CPU cores for actual workloads.
  • Advanced Observability: Gain detailed visibility into DRAM and NVMe usage.
  • Resilient Architecture: Supports RAID, vMotion, DRS, and encryption at both VM and host level.

Who Should Use It?

Ideal for general workloads and tiered VMs, but not supported for latency-sensitive or passthrough-based VMs. Ensure your NVMe meets Broadcom’s vSAN compatibility requirements and configure the DRAM:NVMe ratio wisely (default is 1:1).
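
As a rough sizing aid, the ratio can be turned into capacity numbers with simple arithmetic. The sketch below assumes the ratio is read as "parts of DRAM to parts of NVMe tier". The function name `tiered_memory_gb` is hypothetical, and how ESXi actually sizes the tier partition is not covered here.

```python
def tiered_memory_gb(dram_gb, dram_parts=1, nvme_parts=1):
    """Return (nvme_tier_gb, total_gb) for a DRAM:NVMe ratio of dram_parts:nvme_parts.

    Pure arithmetic on the ratio; this is an assumption about how to read it,
    not a description of ESXi internals.
    """
    if dram_gb <= 0 or dram_parts <= 0 or nvme_parts < 0:
        raise ValueError("sizes and ratio parts must be positive")
    nvme_tier_gb = dram_gb * nvme_parts / dram_parts
    return nvme_tier_gb, dram_gb + nvme_tier_gb


print(tiered_memory_gb(512))        # default 1:1 -> (512.0, 1024.0)
print(tiered_memory_gb(512, 4, 1))  # a more conservative 4:1 -> (128.0, 640.0)
```

Under this reading, a host with 512 GB of DRAM at the default 1:1 ratio would present roughly 1 TB of memory in total.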

Summary

Memory tiering isn’t just a cool buzzword — it’s a strategic shift that aligns your infrastructure with modern performance and cost demands. Whether you’re scaling your VDI environment or looking to cut memory costs without compromising on performance, NVMe Memory Tiering in vSphere 9.0 is a game changer.

ESXCLI Commands for NVMe Memory Tiering – Commands Recap

Description                     Command
Check maintenance mode          esxcli system maintenanceMode get
List storage devices            esxcli storage core adapter device list
Create NVMe tier device         esxcli system tierdevice create -d <device> <vendor> <id>
List tier devices               esxcli system tierdevice list
Enable kernel memory tiering    esxcli system settings kernel set -s MemoryTiering -v TRUE
Verify tiering status           esxcli system settings kernel list -o MemoryTiering
Reboot ESXi                     reboot


    Accelerating AI Workloads: Mastering vGPU Management in VMware Environments

    Explore 2025 Session Recap – INVB1158LV

    Are you looking to maximize AI/ML performance in your virtualized environment? At VMware Explore 2025, I attended a compelling session — INVB1158LV: Accelerating AI Workloads: Mastering vGPU Management in VMware Environments — that unpacked how to effectively configure and scale GPUs for AI workloads in vSphere.

    This blog post shares key takeaways from the session and outlines how to use vGPU, MIG, and Passthrough to achieve optimal performance for AI inference and training on VMware Cloud Foundation 9.0.


    vGPU Configuration Options in VMware vSphere

    🔹 1. DirectPath I/O (Passthrough)

    • A dedicated GPU is assigned to a single VM or containerized workload.
    • Ideal for maximum performance and full GPU access (e.g., LLM training).
    • No sharing or resource fragmentation.

    🔹 2. NVIDIA vGPU – Time Slicing Mode

    • Shares one physical GPU across multiple VMs.
    • Each VM gets 100% of GPU cores for a slice of time, while memory is statically partitioned.
    • Supported on all NVIDIA GPUs that support vGPU.
    • Useful for efficient GPU sharing, especially for model inference and dev/test setups.

    ✅ Example profile: grid_a100-8c (a name such as grid_a100-4-20c, with a slice count before the memory size, follows the MIG-backed naming pattern)
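
A quick back-of-the-envelope sketch of what time slicing means for density. It assumes a 40 GB A100, that every VM on the GPU uses the same profile, and that busy VMs share GPU time equally. All three are assumptions for illustration, and the function name `time_slice_plan` is hypothetical.

```python
def time_slice_plan(gpu_memory_gb, profile_memory_gb):
    """Estimate VM density and worst-case time share for a time-sliced GPU.

    Memory is statically partitioned, so it caps the VM count; compute is
    shared over time, so each busy VM gets all cores for a fraction of the time.
    """
    if gpu_memory_gb <= 0 or profile_memory_gb <= 0:
        raise ValueError("memory sizes must be positive")
    max_vms = gpu_memory_gb // profile_memory_gb
    if max_vms == 0:
        raise ValueError("profile does not fit on this GPU")
    return {
        "max_vms": max_vms,
        # Assumes equal sharing with every VM busy at once.
        "time_share_when_all_busy": 1.0 / max_vms,
    }


print(time_slice_plan(40, 8))  # 8 GB profile on an assumed 40 GB GPU
```

For the 8 GB profile this gives five VMs per GPU, each seeing the full GPU for about a fifth of the time when all of them are busy, and more when neighbours are idle.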

    🔹 3. Multi-Instance GPU (MIG)

    • Available on NVIDIA Ampere & Hopper (e.g., A100, H100).
    • Splits GPU into isolated hardware slices (compute + memory).
    • Offers deterministic performance and better isolation.
    • Best for multi-tenant AI inference, production-grade deployments.

    ✅ Example profiles: MIG 1g.5gb, MIG 2g.10gb, MIG 3g.20gb
    ✅ Assignable via vSphere UI with profiles like grid_a100-3-20c
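
The profile names above encode their own sizing, which is handy when scripting inventory checks. The parser below is a minimal sketch based only on the naming pattern visible in these examples (`grid_<gpu>-<memory><series>` and `grid_<gpu>-<slices>-<memory><series>`). The function name `parse_vgpu_profile` is hypothetical, and other profile families may not fit this pattern.

```python
import re

# Pattern inferred from the example names in this post; treat it as an assumption.
_PROFILE_RE = re.compile(
    r"^grid_(?P<gpu>[a-z0-9]+)-(?:(?P<slices>\d+)-)?(?P<memory_gb>\d+)(?P<series>[a-z])$"
)


def parse_vgpu_profile(name):
    """Split a vGPU profile name into GPU model, slice count, memory, and series."""
    match = _PROFILE_RE.match(name.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized profile name: {name!r}")
    slices = match.group("slices")
    return {
        "gpu": match.group("gpu"),
        "compute_slices": int(slices) if slices else None,
        "memory_gb": int(match.group("memory_gb")),
        "series": match.group("series"),
        "mig_backed": slices is not None,  # a slice count implies a MIG-backed profile
    }


for profile in ("grid_a100-8c", "grid_a100-3-20c"):
    print(profile, parse_vgpu_profile(profile))
```

Read this way, grid_a100-3-20c lines up with the MIG 3g.20gb slice (3 compute slices, 20 GB), while grid_a100-8c carries no slice count and is a time-sliced 8 GB profile.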


    Time Slicing vs. MIG – When to Use What?

    Mode            Best For                                   Sharing Type
    Time Slicing    LLM training, dev/test environments        Time-shared
    MIG             Production inference, multitenancy         Spatial (hardware)
    Passthrough     Maximum performance for single workload    Not shared

    Smarter vMotion for AI Workloads in VCF 9.0

    One of the standout improvements presented during session INVB1158LV was the vMotion optimization for VMs using vGPUs. With vSphere 8.0 U3 and VMware Cloud Foundation 9.0, the way vMotion handles GPU memory has been completely reengineered to minimize downtime (stun time) during live migration.

    Instead of migrating all GPU memory during the VM stun phase, 70% of the vGPU cold data is now copied during the pre-copy stage, and only the final 30% is checkpointed during stun. This greatly accelerates live migration, even for massive LLM workloads running on multi-GPU systems.

    📊 Example results with Llama 3.1 models:

    • Migrating a VM using 2×H100 GPUs (144 GB vGPU memory) saw stun time drop from 24.5s to just 6.3s.
    • Migrating a large model on 8×H100 (576 GB) now completes in 21s, compared to 325s for a power-off-and-reload approach, which is roughly a 15× improvement.
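
The figures above can be sanity-checked with a few lines of arithmetic. This sketch assumes the 70/30 split applies to the total vGPU memory figure, which is a simplification. The helper names `precopy_split_gb` and `speedup` are hypothetical.

```python
def precopy_split_gb(vgpu_memory_gb, precopy_fraction=0.70):
    """Return (GB copied before stun, GB left for the stun phase)."""
    if not 0.0 <= precopy_fraction <= 1.0:
        raise ValueError("precopy_fraction must be between 0 and 1")
    precopied = vgpu_memory_gb * precopy_fraction
    return precopied, vgpu_memory_gb - precopied


def speedup(before_s, after_s):
    """Return how many times faster the new duration is than the old one."""
    return before_s / after_s


precopied, at_stun = precopy_split_gb(144)
print(f"2xH100: {precopied:.1f} GB pre-copied, {at_stun:.1f} GB at stun")
print(f"stun time speedup: {speedup(24.5, 6.3):.1f}x")
print(f"vs. power-off-and-reload: {speedup(325, 21):.1f}x")
```

Under that assumption, only about 43 GB of the 144 GB has to move while the VM is stunned, and the quoted timings work out to roughly 3.9× and 15.5× respectively.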

    These enhancements make near-zero-downtime AI infrastructure upgrades and scaling practical, even for large language model deployments.

    Deploying a Minimal VCF 9.0 Lab – Insights from Explore 2025

    I had the pleasure of attending the excellent session “Deploying Minimal VMware Cloud Foundation 9.0 Lab” by Alan Renouf and William Lam at VMware Explore 2025. It was packed with practical advice, hardware insights, and field-tested tips on how to stand up a fully functional VCF environment—even on a tight budget.

    Whether you’re a home lab enthusiast, enterprise architect, or just VCF-curious, here’s a recap of the key takeaways.


    Key Changes: VCF 5.x vs VCF 9.x

    VCF 5.x:

    • Required 4+ ESXi hosts
    • Monolithic installer
    • vSAN required
    • 3-node NSX cluster
    • 10GbE NICs mandatory

    VCF 9.x:

    • More modular design
    • Only 2–3 ESXi hosts required
    • 1 x 10GbE NIC sufficient
    • Support for singleton appliances
    • Flexible storage (vSAN ESA, FC, NFS)

    VCF 9.0 Tips & Tricks (with real CLI guidance)

    Here’s the juicy part—real-world deployment tips and overrides:

    1. Minimum ESXi Host Requirements

    • For vSAN/FC: 3 ESXi hosts
    • For NFS: 2 ESXi hosts
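    • ⚠️ You can install VCF Installer + SDDC Manager even on a single ESXi host (great for nested labs!). Set the property shown below in /home/vcf/feature.properties, then restart the SDDC Manager services: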
    > cat /home/vcf/feature.properties
    
    feature.vcf.internal.single.host.domain = true
    
    > echo 'y' | /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh

    2. NIC Validation Bypass

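    If your ESXi host doesn’t have a 10GbE NIC, disable the NIC speed validation by setting the property below in application.properties, then restart the services: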

    > cat /etc/vmware/vcf/domainmanager/application.properties
    
    enable.speed.of.physical.nics.validation = false
    
    > echo 'y' | /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh

    3. vSAN HCL Override

    VCF Installer will fail validation if your SSD or controller is not on the vSAN ESA HCL. Install a “mock” VIB to bypass:

    esxcli software vib install -v /tmp/vsan-mock.vib
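
The first three tips can be folded into a small pre-flight check for a lab plan. This is a minimal sketch: the host minimums and property names come from the tips above, while the function `lab_overrides` and its parameters are hypothetical and not part of any VCF tooling.

```python
MIN_HOSTS = {"vsan": 3, "fc": 3, "nfs": 2}  # minimums listed in tip 1


def lab_overrides(host_count, storage, fastest_nic_gbps, on_vsan_esa_hcl=True):
    """Return the overrides from tips 1-3 that a given lab setup would need."""
    storage = storage.lower()
    if storage not in MIN_HOSTS:
        raise ValueError(f"unknown storage type: {storage!r}")
    if host_count < 1:
        raise ValueError("need at least one ESXi host")

    overrides = []
    if host_count == 1:
        overrides.append("feature.vcf.internal.single.host.domain = true")
    elif host_count < MIN_HOSTS[storage]:
        overrides.append(f"add hosts: {storage} needs {MIN_HOSTS[storage]}")
    if fastest_nic_gbps < 10:
        overrides.append("enable.speed.of.physical.nics.validation = false")
    if storage == "vsan" and not on_vsan_esa_hcl:
        overrides.append("install the vSAN mock VIB")
    return overrides


# A single nested host with a 2.5GbE NIC and non-HCL NVMe needs all three.
for item in lab_overrides(1, "vsan", 2.5, on_vsan_esa_hcl=False):
    print(item)
```

A three-host NFS lab with 10GbE networking returns an empty list, meaning none of these overrides should be required.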

    4. Offline Depot HTTPS Requirement

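    By default, the VCF Installer requires the offline depot to be served over HTTPS. To use plain HTTP instead, set the property below and restart the LCM service: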

    cat /opt/vmware/vcf/lcm/lcm-app/conf/application-prod.properties
    
    lcm.depot.adapter.httpsEnabled=false
    
    systemctl restart lcm

    5. Basic Auth Requirement

    The offline depot must sit behind HTTP basic authentication, but you don’t need a full-blown web server. A small Python script is enough:

    python http_server_auth.py --bind 192.168.1.100 --user myuser --password mysecurepassword --port 443 --directory /myrepo
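
For readers curious what such a script involves, here is a minimal sketch of a static file server with basic authentication, built only on the Python standard library. It is not the presenters' http_server_auth.py. The names `is_authorized`, `BasicAuthHandler`, and `make_depot_server` are hypothetical, and it serves plain HTTP, so credentials travel unencrypted and it is only suitable for an isolated lab.

```python
import base64
import functools
import hmac
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def is_authorized(auth_header, user, password):
    """Check an HTTP Authorization header against one expected user/password."""
    scheme, _, token = (auth_header or "").partition(" ")
    expected = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return scheme.lower() == "basic" and hmac.compare_digest(
        token.strip().encode("utf-8"), expected
    )


class BasicAuthHandler(SimpleHTTPRequestHandler):
    """Serve static files only to clients that present the expected credentials."""

    user = ""      # overridden per server in make_depot_server()
    password = ""

    def _serve_if_authorized(self, serve):
        if is_authorized(self.headers.get("Authorization"), self.user, self.password):
            serve()
            return
        # Ask the client to retry with credentials.
        self.send_response(401)
        self.send_header("WWW-Authenticate", 'Basic realm="depot"')
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        self._serve_if_authorized(super().do_GET)

    def do_HEAD(self):
        self._serve_if_authorized(super().do_HEAD)


def make_depot_server(directory, user, password, host="127.0.0.1", port=8080):
    """Build (but do not start) a server that exposes `directory` behind basic auth."""
    handler_cls = type(
        "DepotHandler", (BasicAuthHandler,), {"user": user, "password": password}
    )
    handler = functools.partial(handler_cls, directory=directory)
    return ThreadingHTTPServer((host, port), handler)
```

To mirror the command above, one would call `make_depot_server("/myrepo", "myuser", "mysecurepassword", host="192.168.1.100", port=443).serve_forever()`. Binding to port 443 typically requires elevated privileges.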

    Reference Hardware for Minimal Lab

    Here’s an example BOM shared by the presenters:

    • MinisForum MS-A2 w/ AMD Ryzen 9 7945HX (16c/32t)
    • 128GB DDR5 (2x64GB SODIMM)
    • 3x M.2 NVMe SSDs
    • 10GbE SFP+ NIC + 2.5GbE onboard
    • MikroTik 5-port 10GbE switch (for under $200)

    This setup is small, powerful, and flexible enough for a complete VCF 9.0 deployment.


    Deployment Walkthrough – TL;DR

    Here’s the summarized 8-step flow:

    1. Install ESXi (kickstart from USB)
    2. Deploy VCF Installer VM
    3. Connect to Offline Depot
    4. Run Installer with JSON
    5. Configure vSAN ESA
    6. Deploy vCenter
    7. Update Storage Policies
    8. Deploy SDDC Manager, NSX, Fleet Manager, Automation, etc.

    Summary

    This session truly showcased how far VCF has come in terms of flexibility and accessibility. More info: VMware Cloud Foundation (VCF) 9.x in a Box.
    All trademarks belong to their respective owners.

    Data Services Manager (DSM) 9.0 Microsoft SQL…

    Data Services Manager 9.0 introduces support for a new Data Service, namely Microsoft SQL Server. This is currently tech preview and should be treated as non-production until full support is available. Customers can use this integration to deploy both MS SQL Server Instances and MS SQL Server […]


    Broadcom Social Media Advocacy

    Updated Nested ESXi 8.x & 9.0 Virtual Appliance

    Happy Sunday! Before the wave of announcements starts rolling out from VMware Explore Las Vegas, which starts tomorrow, I wanted to share a quick update. 😅 I have been pretty swamped for the past couple of months, so it has taken a bit more time to get the latest Nested ESXi Virtual Appliances […]


    Broadcom Social Media Advocacy

    VMware Explore 2025 in Las Vegas | Day 2 Recap

    From key announcements to community connections—Day 2 at VMware Explore 2025 in Las Vegas has delivered on big Ideas and bold Innovations. Learn more in the recap.

    Welcome to VMware Explore 2025, where thousands of explorers have gathered at The Venetian Convention and Expo Center to harness the power of private cloud. Reporting live from Nevada, here’s your recap of Day 2.


    Broadcom Social Media Advocacy

    How to deploy VVF/VCF 9.0 using VMUG Advantage & VCP-VCF Certification Entitlement

    VMUG Advantage members who have recently earned their VCP-VCF certification are entitled to non-commercial VMware Cloud Foundation (VCF) licenses for personal use. When logging into the Broadcom Certification Portal, you may have noticed that only perpetual licenses for VCF 5.x are available, […]


    Broadcom Social Media Advocacy