Daniel Micanek virtual Blog – Like normal Dan, but virtual.
Category: NSX
The “VMware NSX” blog category focuses on VMware NSX, an advanced network virtualization and security platform. It offers detailed guides, tutorials, technical analyses, and the latest updates.
We’re excited to announce the general availability of VMware NSX-T 3.2, one of the largest NSX releases so far. NSX-T 3.2 includes key innovations across multi-cloud security and scale-out networking for containers, VMs, and physical workloads. It also delivers simplified operations that help enterprises achieve a one-click, public cloud experience wherever their workloads are deployed. Strong Multi-Cloud Security: NSX-T 3.2 provides strong, multi-cloud, easy-to-operationalize network defenses…
Well here we are again – another VMworld has come around. As most of you will know, VMworld 2021 is going to be a fully virtual event. Here are my Top 10 Sessions:
Simplify Network Consumption and Automation for Day 1 and Day 2 Operations [NET2185]
Apps are the lifeblood of the business in today’s digital economy. Can you envision seamless connectivity between your apps & networking technologies?
Learn why apps are the lifeblood of the business in today’s digital economy
See how seamless connectivity between your apps and networking technologies can deliver end-to-end network configuration, compliance, and automation with a single API call
Find out how to simplify network consumption with network automation for Day 1 and Day 2 operations
With the expansion of VMware’s security portfolio, you may be wondering how all of the pieces fit together into a cohesive multi-cloud enterprise design
Learn about VMware’s expanding security portfolio and how all the tools fit together
See how cyberattacks occur, including a look into the most famous attack of all time!
Gain a better understanding of how to design a policy-driven, defense-in-depth strategy for the modern enterprise
After you finish the prerequisites for upgrading, your next step is to update the upgrade coordinator to initiate the upgrade process.
Trending support issues in VMware NSX (2131154) – This article provides information regarding trending issues and important Knowledge Base articles regarding VMware NSX Data Center for vSphere 6.x and VMware NSX-T Data Center 3.x.
The upgrade coordinator runs in the NSX Manager. It is a self-contained web application that orchestrates the upgrade process of hosts, NSX Edge cluster, NSX Controller cluster, and Management plane.
After the upgrade coordinator has been upgraded, based on your input, the upgrade coordinator updates the NSX Edge cluster, hosts, and the Management plane. Edge upgrade unit groups consist of NSX Edge nodes that are part of the same NSX Edge cluster. You can reorder Edge upgrade unit groups and enable or disable an Edge upgrade unit group from the upgrade sequence.
The upgrade sequence upgrades the Management Plane at the end. When the Management Plane upgrade is in progress, avoid any configuration changes from any of the nodes.
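If you want to follow what the upgrade coordinator is doing outside the UI, the NSX Manager REST API also exposes the overall upgrade status. A minimal sketch, assuming the /api/v1/upgrade/status-summary endpoint from the NSX-T Data Center API guide and admin credentials (the manager hostname is a placeholder; verify the exact path for your release):
# Query the upgrade coordinator for the overall upgrade status
curl -k -u admin "https://nsx-manager.lab.local/api/v1/upgrade/status-summary"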
NSX-T Data Center Upgrade Checklist
Task: Review the known upgrade problems and workarounds documented in the NSX-T Data Center release notes.
Instructions: See the NSX-T Data Center Release Notes.
Task: Follow the system configuration requirements and prepare your infrastructure (a quick CLI sanity check is sketched below).
Instructions: See the system requirements section of the NSX-T Data Center Installation Guide.
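As part of preparing the infrastructure, it is worth confirming the current version and the health of the NSX Manager cluster before you upload the upgrade bundle. A minimal sketch using standard NSX Manager CLI commands (run on any manager node):
get version
get cluster status
get version should report the From-version you expect, and get cluster status should show a stable cluster before you continue.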
The NSX Edge nodes are upgraded in serial mode so that when the upgrading node is down, the other nodes in the NSX Edge cluster remain active to continuously forward traffic.
Enter the NSX Edge cluster upgrade plan details (an API sketch for reviewing the Edge upgrade unit groups follows this list):
Serial: Upgrade all the Edge upgrade unit groups consecutively.
Parallel: Upgrade all the Edge upgrade unit groups simultaneously.
When an upgrade unit fails to upgrade: You can fix an error on the Edge node and continue the upgrade. You cannot deselect this setting.
After each group completes: Select to pause the upgrade process after each Edge upgrade unit group finishes upgrading.
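If you prefer to review the Edge upgrade unit groups from a script before starting, the upgrade coordinator also exposes them over the REST API. A minimal sketch, assuming the /api/v1/upgrade/upgrade-unit-groups endpoint with a component_type filter (check the API guide for your version; the hostname is a placeholder):
curl -k -u admin "https://nsx-manager.lab.local/api/v1/upgrade/upgrade-unit-groups?component_type=EDGE"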
Click Start to upgrade the NSX Edge cluster.
Monitor the upgrade process.
You can view the overall upgrade status and progress details of each Edge upgrade unit group. The upgrade duration depends on the number of Edge upgrade unit groups you have in your environment.
Alternatively, you can follow the progress from the NSX Edge node CLI with get upgrade progress-status:
IN_PROGRESS:
tanzu-nsx-edge> get upgrade progress-status
*********************************************************************
Node Upgrade has been started. Please do not make any changes, until
the upgrade operation is complete. Run "get upgrade progress-status"
to show the progress of last upgrade step.
*********************************************************************
Tue May 18 2021 UTC 21:51:51.235
Upgrade info:
From-version: 3.1.1.0.0.17483065
To-version: 3.1.2.0.0.17883603
Upgrade steps:
11-preinstall-enter_maintenance_mode [2021-05-18 21:45:34 - 2021-05-18 21:46:45] SUCCESS
download_os [2021-05-18 21:47:34 - 2021-05-18 21:49:01] SUCCESS
install_os [2021-05-18 21:49:02 - ] IN_PROGRESS
SUCCESS / Upgrade is not in progress:
tanzu-nsx-edge> get upgrade progress-status
Node Upgrade has been started. Please do not make any changes, until
the upgrade operation is complete. Run "get upgrade progress-status"
to show the progress of last upgrade step.
Tue May 18 2021 UTC 21:55:55.278
Upgrade info:
From-version: 3.1.1.0.0.17483065
To-version: 3.1.2.0.0.17883603
Upgrade steps:
11-preinstall-enter_maintenance_mode [2021-05-18 21:45:34 - 2021-05-18 21:46:45] SUCCESS
download_os [2021-05-18 21:47:34 - 2021-05-18 21:49:01] SUCCESS
install_os [2021-05-18 21:49:02 - 2021-05-18 21:51:58] SUCCESS
switch_os [2021-05-18 21:52:01 - 2021-05-18 21:52:24] SUCCESS
reboot [2021-05-18 21:52:28 - 2021-05-18 21:53:53] SUCCESS
migrate_users [2021-05-18 21:55:26 - 2021-05-18 21:55:31] SUCCESS
tanzu-nsx-edge> get upgrade progress-status
Tue May 18 2021 UTC 21:59:41.501
Upgrade is not in progress
Click Run Post Checks to verify whether the Edge upgrade unit groups were successfully upgraded.
You can customize the upgrade sequence of the hosts, disable certain hosts from the upgrade, or pause the upgrade at various stages of the upgrade process.
Serial: Upgrade all the host upgrade unit groups consecutively. This selection is useful to maintain the step-by-step upgrade of the host components.
Parallel: Upgrade all the host upgrade unit groups simultaneously. You can upgrade up to five hosts simultaneously.
When an upgrade unit fails to upgrade: Select to pause the upgrade process if any host upgrade fails.
After each group completes: Select to pause the upgrade process after each host upgrade unit group finishes upgrading.
Manage Host Upgrade Unit Groups
Hosts in an ESXi cluster appear in one host upgrade unit group in the upgrade coordinator. You can move these hosts from one host upgrade unit group to another.
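The host upgrade unit groups can also be listed over the REST API, which is handy for checking which hosts landed in which group before you reorder them in the UI. Again a sketch against the assumed /api/v1/upgrade/upgrade-unit-groups endpoint (the hostname is a placeholder):
curl -k -u admin "https://nsx-manager.lab.local/api/v1/upgrade/upgrade-unit-groups?component_type=HOST"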
Upgrade Hosts
Click Start to upgrade the hosts.
Click Refresh and monitor the upgrade process.
Click Run Post Checks to make sure that the upgraded hosts and NSX-T Data Center do not have any problems.
Before the upgrade, verify the version of the NSX-T Data Center packages installed on the hosts.
More information is available in the NSX-T Data Center Administration Guide.
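A quick way to do this is to list the installed NSX VIBs directly on an ESXi host. A minimal sketch using standard esxcli in an SSH session on the host:
# Show the NSX packages installed on the host, with their versions
esxcli software vib list | grep -i nsx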
Click Start to upgrade the Management plane
Accept the upgrade notification
Monitor the upgrade progress from the NSX Manager CLI for the orchestrator node.
tanzu-nsx> get upgrade progress-status
Node Upgrade has been started. Please do not make any changes, until
the upgrade operation is complete. Run "get upgrade progress-status"
to show the progress of last upgrade step.
Tue May 18 2021 UTC 22:38:49.533
Upgrade info:
From-version: 3.1.1.0.0.17483186
To-version: 3.1.2.0.0.17883600
Upgrade steps:
download_os [2021-05-18 22:35:57 - 2021-05-18 22:38:04] SUCCESS
shutdown_manager [2021-05-18 22:38:15 - ] IN_PROGRESS
Status: cloud-init status command returned output active.
Some appliance components are not functioning properly.
Component health: MANAGER:UNKNOWN, POLICY:UNKNOWN, SEARCH:UNKNOWN, NODE_MGMT:UP, UI:UP.
Error code: 101
tanzu-nsx> get upgrade progress-status
Node Upgrade has been started. Please do not make any changes, until
the upgrade operation is complete. Run "get upgrade progress-status"
to show the progress of last upgrade step.
Tue May 18 2021 UTC 23:09:58.930
Upgrade info:
From-version: 3.1.1.0.0.17483186
To-version: 3.1.2.0.0.17883600
Upgrade steps:
download_os [2021-05-18 22:35:57 - 2021-05-18 22:38:04] SUCCESS
shutdown_manager [2021-05-18 22:38:15 - 2021-05-18 22:56:22] SUCCESS
install_os [2021-05-18 22:56:22 - 2021-05-18 22:58:34] SUCCESS
migrate_manager_config [2021-05-18 22:58:34 - 2021-05-18 22:58:44] SUCCESS
switch_os [2021-05-18 22:58:44 - 2021-05-18 22:59:06] SUCCESS
reboot [2021-05-18 22:59:06 - 2021-05-18 23:00:04] SUCCESS
run_migration_tool [2021-05-18 23:03:11 - 2021-05-18 23:03:16] SUCCESS
start_manager [2021-05-18 23:03:16 - ] IN_PROGRESS
tanzu-nsx> get upgrade progress-status
Tue May 18 2021 UTC 23:21:33.659
Upgrade is not in progress
I upgraded vCenter to version 7 successfully, but failed when it came to updating my hosts from 6.7 to 7.0.
I got a warning stating that some PCI devices were incompatible, but tried anyway. It turned out that failed: my Mellanox ConnectX-2 wasn’t showing up as an available physical NIC.
First, it was necessary to find the VID/DID device codes for the MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]:
Partner: Mellanox
Product: MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]
Driver: mlx4_core
VID: 15b3
DID: 6750
We can check the whole table here, or search for “mlx” to see the full list of Mellanox cards.
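If you are not sure which VID/DID your own card reports, you can read it straight from the ESXi host. A minimal sketch using commands available in the ESXi shell (vendor ID 15b3 is Mellanox):
# List PCI devices with their vendor:device IDs
vmkchdev -l | grep 15b3
# Cross-check by device name
lspci | grep -i mellanox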
Deprecated devices supported by VMKlinux drivers
Devices that were only supported in 6.7 or earlier by a VMKlinux inbox driver are no longer supported, because all support for VMKlinux drivers and their devices has been completely removed in 7.0. The workaround I used was to add the ConnectX-2 device ID (15b3:6750) to the native nmlx4_core driver's device map:
*********************************************************************
/vmfs/volumes/ISO/tmp-network/etc/vmware/default.map.d/nmlx4_core.map
*********************************************************************
regtype=native,bus=pci,id=15b301f6..............,driver=nmlx4_core
regtype=native,bus=pci,id=15b301f8..............,driver=nmlx4_core
regtype=native,bus=pci,id=15b31003..............,driver=nmlx4_core
regtype=native,bus=pci,id=15b31004..............,driver=nmlx4_core
regtype=native,bus=pci,id=15b31007..............,driver=nmlx4_core
regtype=native,bus=pci,id=15b3100715b30003......,driver=nmlx4_core
regtype=native,bus=pci,id=15b3100715b30006......,driver=nmlx4_core
regtype=native,bus=pci,id=15b3100715b30007......,driver=nmlx4_core
regtype=native,bus=pci,id=15b3100715b30008......,driver=nmlx4_core
regtype=native,bus=pci,id=15b3100715b3000c......,driver=nmlx4_core
regtype=native,bus=pci,id=15b3100715b3000d......,driver=nmlx4_core
regtype=native,bus=pci,id=15b36750..............,driver=nmlx4_core
------------------------->Last Line is FIX
Then add HW ID support in the nmlx4_core.ids file:
**************************************************************************************
/vmfs/volumes/FreeNAS/ISO/tmp-network/usr/share/hwdata/default.pciids.d/nmlx4_core.ids
**************************************************************************************
#
# This file is mechanically generated. Any changes you make
# manually will be lost at the next build.
#
# Please edit <driver>_devices.py file for permanent changes.
#
# Vendors, devices and subsystems.
#
# Syntax (initial indentation must be done with TAB characters):
#
# vendor vendor_name
# device device_name <-- single TAB
# subvendor subdevice subsystem_name <-- two TABs
15b3 Mellanox Technologies
01f6 MT27500 [ConnectX-3 Flash Recovery]
01f8 MT27520 [ConnectX-3 Pro Flash Recovery]
1003 MT27500 Family [ConnectX-3]
1004 MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
1007 MT27520 Family [ConnectX-3 Pro]
15b3 0003 ConnectX-3 Pro VPI adapter card; dual-port QSFP; FDR IB (56Gb/s) and 40GigE (MCX354A-FCC)
15b3 0006 ConnectX-3 Pro EN network interface card 40/56GbE dual-port QSFP(MCX314A-BCCT )
15b3 0007 ConnectX-3 Pro EN NIC; 40GigE; dual-port QSFP (MCX314A-BCC)
15b3 0008 ConnectX-3 Pro VPI adapter card; single-port QSFP; FDR IB (56Gb/s) and 40GigE (MCX353A-FCC)
15b3 000c ConnectX-3 Pro EN NIC; 10GigE; dual-port SFP+ (MCX312B-XCC)
15b3 000d ConnectX-3 Pro EN network interface card; 10GigE; single-port SFP+ (MCX311A-XCC)
6750 Mellanox ConnectX-2 Dual Port 10GbE
-------->Last Line is FIX
After the reboot I could see support for MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s].
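A quick way to confirm the card is now visible as a physical NIC is to list the NICs from the ESXi shell; a minimal sketch with standard esxcli (the vmnic number will differ per host):
# List all physical NICs with their bound driver and link state
esxcli network nic list
# Show details of one NIC (vmnic2 is a placeholder name)
esxcli network nic get -n vmnic2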
The only remaining issue is an ALERT: Failed to verify signatures of the following vib(s): [nmlx4-core].
2020-XX-XXTXX:XX:44.473Z cpu0:2097509)ALERT: Failed to verify signatures of the following vib(s): [nmlx4-core]. All tardisks validated
2020-XX-XXTXX:XX:47.909Z cpu1:2097754)Loading module nmlx4_core ...
2020-XX-XXTXX:XX:47.912Z cpu1:2097754)Elf: 2052: module nmlx4_core has license BSD
2020-XX-XXTXX:XX:47.921Z cpu1:2097754)<NMLX_INF> nmlx4_core: init_module called
2020-XX-XXTXX:XX:47.921Z cpu1:2097754)Device: 194: Registered driver 'nmlx4_core' from 42
2020-XX-XXTXX:XX:47.921Z cpu1:2097754)Mod: 4845: Initialization of nmlx4_core succeeded with module ID 42.
2020-XX-XXTXX:XX:47.921Z cpu1:2097754)nmlx4_core loaded successfully.
2020-XX-XXTXX:XX:47.951Z cpu1:2097754)<NMLX_INF> nmlx4_core: 0000:05:00.0: nmlx4_core_Attach - (nmlx4_core_main.c:2476) running
2020-XX-XXTXX:XX:47.951Z cpu1:2097754)DMA: 688: DMA Engine 'nmlx4_core' created using mapper 'DMANull'.
2020-XX-XXTXX:XX:47.951Z cpu1:2097754)DMA: 688: DMA Engine 'nmlx4_core' created using mapper 'DMANull'.
2020-XX-XXTXX:XX:47.951Z cpu1:2097754)DMA: 688: DMA Engine 'nmlx4_core' created using mapper 'DMANull'.
2020-XX-XXTXX:XX:49.724Z cpu1:2097754)<NMLX_INF> nmlx4_core: 0000:05:00.0: nmlx4_ChooseRoceMode - (nmlx4_core_main.c:382) Requested RoCE mode RoCEv1
2020-XX-XXTXX:XX:49.724Z cpu1:2097754)<NMLX_INF> nmlx4_core: 0000:05:00.0: nmlx4_ChooseRoceMode - (nmlx4_core_main.c:422) Requested RoCE mode is supported - choosing RoCEv1
2020-XX-XXTXX:XX:49.934Z cpu1:2097754)<NMLX_INF> nmlx4_core: 0000:05:00.0: nmlx4_CmdInitHca - (nmlx4_core_fw.c:1408) Initializing device with B0 steering support
2020-XX-XXTXX:XX:50.561Z cpu1:2097754)<NMLX_INF> nmlx4_core: 0000:05:00.0: nmlx4_InterruptsAlloc - (nmlx4_core_main.c:1744) Granted 38 MSIX vectors
2020-XX-XXTXX:XX:50.561Z cpu1:2097754)<NMLX_INF> nmlx4_core: 0000:05:00.0: nmlx4_InterruptsAlloc - (nmlx4_core_main.c:1766) Using MSIX
2020-XX-XXTXX:XX:50.781Z cpu1:2097754)Device: 330: Found driver nmlx4_core for device 0xxxxxxxxxxxxxxxxxxxxxxx
Some 10 Gbps tuning tests look great between two ESXi 7.0 hosts, each with an MT26448.
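For reference, this is the kind of minimal throughput test I run between two test VMs, one per host (it assumes iperf3 installed in the guests; the IP address is a placeholder):
# On the VM on the first host: start the server
iperf3 -s
# On the VM on the second host: 4 parallel streams for 30 seconds
iperf3 -c 192.168.10.11 -P 4 -t 30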
RoCEv2 is only supported on newer cards, from the Mellanox ConnectX-3 Pro up.
We can see RoCEv2 options in the nmlx4_core driver, but when I enabled enable_rocev2, it did NOT work:
[root@esxi~] esxcli system module parameters list -m nmlx4_core
Name Type Value Description
---------------------- ---- ----- -----------
enable_64b_cqe_eqe int Enable 64 byte CQEs/EQEs when the the FW supports this
enable_dmfs int Enable Device Managed Flow Steering
enable_qos int Enable Quality of Service support in the HCA
enable_rocev2 int Enable RoCEv2 mode for all devices
enable_vxlan_offloads int Enable VXLAN offloads when supported by NIC
log_mtts_per_seg int Log2 number of MTT entries per segment
log_num_mgm_entry_size int Log2 MGM entry size, that defines the number of QPs per MCG, for example: value 10 results in 248 QP per MGM entry
msi_x int Enable MSI-X
mst_recovery int Enable recovery mode(only NMST module is loaded)
rocev2_udp_port int Destination port for RoCEv2
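For completeness, this is how the parameter would be switched on (a sketch with standard esxcli; the module reads the new value after a host reboot):
esxcli system module parameters set -m nmlx4_core -p "enable_rocev2=1"
# After the reboot, verify the value
esxcli system module parameters list -m nmlx4_core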
It is officially NOT supported. Use it only in your home lab. But we could save some money on new 10 Gbps network cards.