Security Enhancements in Cisco UCS Release 4.3(5a): Key Vulnerability Fixes

With the Cisco UCS 4.3(5a) release, Cisco addresses multiple critical security vulnerabilities impacting UCS Manager, Fabric Interconnects, and compute nodes. These fixes are essential for maintaining secure infrastructure because they resolve known vulnerabilities in third-party software dependencies such as OpenSSL, OpenSSH, and libexpat. Here’s a rundown of the significant security fixes provided in this release.


1. CSCwb81661: Vulnerabilities in OpenSSL

Cisco UCS Manager now includes critical fixes for three vulnerabilities related to OpenSSL, which, if exploited, could impact cryptographic functions, certificate parsing, and script handling. Here’s a closer look at each vulnerability:

  • CVE-2021-4160: A carry-propagation bug in OpenSSL’s squaring procedure for MIPS32 and MIPS64 platforms could allow attackers to mount attacks against Diffie-Hellman (DH) key exchange in scenarios where DH private keys are shared among multiple clients. Cisco has mitigated this vulnerability by upgrading to fixed OpenSSL versions, addressing this issue in OpenSSL 1.1.1m, 3.0.1, and later releases.
  • CVE-2022-0778: The BN_mod_sqrt() function in OpenSSL could enter an infinite loop when parsing certain elliptic curve certificates, leading to a denial of service (DoS). This could be triggered in scenarios involving TLS and certificate handling. Cisco’s update mitigates this risk with fixes provided in OpenSSL 1.1.1n, 3.0.2, and later versions.
  • CVE-2022-1292: A vulnerability in the c_rehash script could allow attackers to inject commands through improperly sanitized shell metacharacters. The script is considered obsolete and should be replaced by OpenSSL’s rehash command-line tool; the fixed OpenSSL releases (1.1.1o, 3.0.3, and later) included in this update remove this risk.

2. CSCwk62264: OpenSSH Security Regression (CVE-2024-6387)

Cisco UCS 6400 and 6500 Series Fabric Interconnects, when operated in UCS Manager mode, were vulnerable to a security regression in the OpenSSH server (sshd). This flaw, a reintroduction of an older signal-handler race condition (CVE-2006-5051), could allow unauthenticated remote attackers to exploit sshd signal handling and potentially execute arbitrary code.

Cisco’s Solution: Cisco has patched OpenSSH within the UCS software stack, reinforcing the security of SSH operations in Fabric Interconnects to eliminate this race condition vulnerability.
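A quick way to sanity-check which OpenSSH build a Fabric Interconnect presents is to read its SSH banner from a management host. This is only a rough indicator, since Cisco backports fixes without necessarily changing the upstream version string; the hostname below is a placeholder.

nc -w 3 fi-a.example.com 22 </dev/null | head -n 1
# Typical output: SSH-2.0-OpenSSH_x.y – compare it with the fixed build listed in the release notes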


3. CSCwk62723: OpenSSH Vulnerability in Serial over LAN (SOL) on Blade Servers and Compute Nodes

The UCS B-Series Blade Servers and X-Series Compute Nodes were also affected by the same OpenSSH race condition issue. This vulnerability could impact serial-over-LAN connections, which are essential for remote console access.

Cisco’s Solution: The update fixes this vulnerability by integrating the latest OpenSSH patch, safeguarding remote management sessions on UCS blade servers and compute nodes.


4. CSCwk75392: libexpat Denial of Service Vulnerability (CVE-2023-52425)

This vulnerability in libexpat could lead to a denial of service (DoS) attack due to resource-intensive parsing of large tokens. Cisco UCS Manager, which relies on libexpat for XML parsing, could be affected in scenarios that involve extensive parsing requirements.

Cisco’s Solution: The upgraded libexpat version now included in UCS Manager resolves this DoS vulnerability, ensuring that XML parsing operations are resilient against such resource exhaustion attacks.

Links: Release Notes for Cisco UCS Manager, Release 4.3

Cisco UCS Release 4.3(4a) Update: PSU View Displays “Power: Error” (Bug CSCwj01478)

Cisco’s Unified Computing System (UCS) continues to innovate in providing a powerful platform for data center infrastructure, but even with frequent updates, occasional bugs arise. In the recent Cisco UCS release 4.3(4a), a notable issue surfaced with the deployment of the Cisco UCSX-9508 chassis, leading to a power status error in the Cisco UCS Manager (UCSM) interface.

This issue, identified as bug CSCwj01478, causes the PSU view in UCSM to display “Power: Error” and “Input Source: Unknown” even though these readings do not reflect the actual physical status of the hardware.

Symptom Overview

Users deploying the Cisco UCSX-9508 chassis, integrated with Cisco UCS 6454 Fabric Interconnects (FI) and the Cisco UCS 9108 25G Intelligent Fabric Module (IFM), may encounter the following symptoms:

  • Power Display Issue: In UCSM, the chassis power status shows as “Power: Error” and “Input Source: Unknown,” even though the physical server PSU LEDs are green, indicating normal operation.
  • Persistent Faults: Major faults associated with the power status do not resolve within UCSM, even after attempts to decommission and re-acknowledge the chassis.

While these symptoms do not affect the physical functionality of the power supply, the discrepancy in UCSM’s display can complicate administrative monitoring and fault management workflows.
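To confirm that the fault is cosmetic, it can help to compare the UCSM fault list with the PSU inventory from the UCS Manager CLI. A minimal sketch, assuming chassis 1 is the affected chassis (verify the command set against the UCSM CLI reference for your release):

ssh admin@<ucsm-vip>
UCS-A# show fault
UCS-A# scope chassis 1
UCS-A /chassis # show psu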

Summary: As a personal workaround, I set the fan policy to MAX, which temporarily resolved the issue. However, release 4.3(4a) now provides an official fix.

Links: Release Notes for Cisco UCS Manager, Release 4.3

Cisco UCS Manager Release 4.3(4a): New Optimized Adapter Policies for VIC Series Adapters

Starting with Cisco UCS Manager release 4.3(4a), Cisco has introduced optimized adapter policies for Windows, Linux, and VMware operating systems, including a new policy for VMware environments called “VMware-v2.” This update affects the Cisco UCS VIC 1400, 14000, and 15000 series adapters, promising improved performance and flexibility.

This release is particularly interesting for those managing VMware infrastructures, as many organizations—including ours—have been using similar settings for years. However, one notable difference is that the default configuration in the new policy sets Interrupts to 11, while in our environment, we’ve historically set it to 12.

Key Enhancements in UCS 4.3(4a)

  1. Optimized Adapter Policies: The new “VMware-v2” policy is tailored to enhance performance in VMware environments, specifically for the Cisco UCS VIC 1400, 14000, and 15000 adapters. It adjusts parameters such as the number of interrupts, queue depths, and receive/transmit buffers to achieve better traffic handling and lower latency.
  2. Receive Side Scaling (RSS): A significant feature available on the Cisco UCS VIC series is Receive Side Scaling (RSS). RSS is crucial for servers handling large volumes of network traffic as it allows the incoming network packets to be distributed across multiple CPU cores, enabling parallel processing. This distribution improves the overall throughput and reduces bottlenecks caused by traffic being handled by a single core. In high-performance environments like VMware, this can lead to a noticeable improvement in network performance. RSS is enabled on a per-vNIC basis, meaning administrators have granular control over which virtual network interfaces benefit from the feature. Given the nature of modern server workloads, enabling RSS on vNICs handling critical traffic can substantially improve performance, particularly in environments with multiple virtual machines.
  3. Maximizing Ring Size: Another important recommendation for administrators using the VIC 1400 adapters is to set the ring size to the maximum, which for these adapters is 4096. The ring size determines how much data can be queued by the NIC (Network Interface Card) before being handled by the CPU. A larger ring size allows for better performance, especially when dealing with bursts of high traffic. In environments where high throughput and low latency are critical, setting the ring size to its maximum value ensures that traffic can be handled more efficiently, reducing the risk of packet drops or excessive buffering. A quick way to verify the effective values from the ESXi side is shown in the sketch after this list.
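The adapter policy pushes these values from the UCSM side; on the ESXi host you can verify what the nenic driver actually negotiated. A minimal sketch, assuming vmnic0 is a VIC 1400 uplink (adjust the name to your environment):

esxcli network nic get -n vmnic0               # driver name and version for the uplink
esxcli network nic ring preset get -n vmnic0   # maximum supported RX/TX ring sizes
esxcli network nic ring current get -n vmnic0  # ring sizes currently in use
esxcli network nic ring current set -n vmnic0 -r 4096   # raise the RX ring to the VIC 1400 maximum if the policy has not already done so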

Links:

Downfall CVE-2022-40982 – Intel’s Downfall Mitigation

Security Fixes in Release 4.3(2b)

Defect ID – CSCwf30468

Cisco UCS M5 C-Series servers are affected by vulnerabilities identified by the following Common Vulnerabilities and Exposures (CVE) IDs:

  • CVE-2022-40982—Information exposure through microarchitectural state after transient execution in certain vector execution units for some Intel® Processors may allow an authenticated user to potentially enable information disclosure through local access
  • CVE-2022-43505—Insufficient control flow management in the BIOS firmware for some Intel® Processors may allow a privileged user to potentially enable denial of service through local access

https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/gather-data-sampling.html

Workaround EVC Intel “Broadwell” Generation or gather_data_sampling

wget https://raw.githubusercontent.com/speed47/spectre-meltdown-checker/master/spectre-meltdown-checker.sh
sh spectre-meltdown-checker.sh --variant downfall --explain

EVC Intel “Skylake” Generation

CVE-2022-40982 aka 'Downfall, gather data sampling (GDS)'
> STATUS:  VULNERABLE  (Your microcode doesn't mitigate the vulnerability, and your kernel doesn't support mitigation)
> SUMMARY: CVE-2022-40982:KO

EVC Intel “Broadwell” Generation

CVE-2022-40982 aka 'Downfall, gather data sampling (GDS)'
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)
> SUMMARY: CVE-2022-40982:OK

Mitigation with an updated kernel

When a microcode update is not available via a firmware update package, you may update the kernel to a version that can shut off AVX instruction set support as a software mitigation. This is achieved by adding the following kernel command-line parameter:

gather_data_sampling=force
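A minimal sketch for adding the parameter persistently on a GRUB2-based distribution that ships the grubby tool (on other distributions, edit the GRUB configuration directly):

grubby --update-kernel=ALL --args="gather_data_sampling=force"
reboot
cat /sys/devices/system/cpu/vulnerabilities/gather_data_sampling   # confirm the mitigation state after the reboot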

Mitigation Options

When the mitigation is enabled, there is additional latency before the results of the gather load can be consumed. Although the performance impact to most workloads is minimal, specific workloads may show performance impacts of up to 50%. Depending on their threat model, customers can decide to opt out of the mitigation.
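For hosts where this penalty is unacceptable and the threat model allows it, the same kernel parameter can be used to opt out; a sketch mirroring the grubby example above:

grubby --update-kernel=ALL --args="gather_data_sampling=off"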

Intel® Software Guard Extensions (Intel® SGX)

There will be an Intel SGX TCB Recovery for those Intel SGX-capable affected processors. This TCB Recovery will only attest as up-to-date when the patch has been FIT-loaded (for example, with an updated BIOS), Intel SGX has been enabled by BIOS, and hyperthreading is disabled. In this configuration, the mitigation will be locked to the enabled state. If Intel SGX is not enabled or if hyperthreading is enabled, the mitigation will not be locked, and system software can choose to enable or disable the GDS mitigation.

Links:

Deprecation of legacy BIOS support in vSphere 8.0 (84233) + Booting vSphere ESXi 8.0 may fail with “Error 10 (Out of resources)” (89682)

UCSX-TPM2-002 Trusted Platform Module 2.0 for UCS servers

    Personally, here are my recommendations for new ESXi 8.0 installations:

    • VMware supports only UEFI boot for new installations
    • When purchasing new servers, include a TPM 2.0 module (UCSX-TPM2-002)
    • When upgrading to ESXi 8.0, verify that UEFI boot is enabled

    Booting vSphere ESXi 8.0 may fail with “Error 10 (Out of resources)” (89682)

    • Hardware machine is configured to boot in legacy BIOS mode.
    • Booting stops early in the boot process with messages displayed in red on black with wording similar to “Error 10 (Out of resources) while loading module”, “Requested malloc size failed”, or “No free memory”.

    VMware’s recommended workaround is to transition the machine to UEFI boot mode permanently, as discussed in KB article 84233. There will not be a future ESXi change to allow legacy BIOS to work on this machine again.

    Deprecation of legacy BIOS support in vSphere (84233)

    This KB outlines VMware’s plans to deprecate support for legacy BIOS in server platforms.

    If you upgrade a server that was certified and running successfully with legacy BIOS to a newer release of ESXi, it is possible the server will no longer function with that release. For example, some servers may fail to boot with an “Out of resources” message because the newer ESXi release is too large to boot in legacy BIOS mode. Generally, VMware will not provide any fix or workaround for such issues besides either switching the server to UEFI or remaining on an older ESXi release.

    Motivation

    UEFI provides several advantages over legacy BIOS and aligns with VMware’s goal of being “secure by default”. UEFI enables:

    • UEFI Secure Boot, a security standard that helps ensure that the server boots using only software that is trusted by the server manufacturer.
    • Automatic update of the system boot order during ESXi installation.
    • Persistent memory
    • TPM 2.0
    • Intel SGX Registration
    • Upcoming support for DPU/SmartNIC
    Securing ESXi Hosts with Trusted Platform Module
    vSphere 6.7 Support for ESXi and TPM 2.0

    How to Boot ESXi 7.0 on UCS-M2-HWRAID Boot-Optimized M.2 RAID Controller

    VMware strongly advises that you move away completely from using SD card/USB as a boot device option on any future server hardware.

    SD cards can continue to be used for the bootbank partition provided that a separate persistent local device to store the OSDATA partition (32GB min., 128GB recommended) is available in the host.
    Preferably, the SD cards should be replaced with an M.2 or another local persistent device as the standalone boot option.

    vSphere 7 – ESXi System Storage Changes

    Please refer to the following blog:
    https://core.vmware.com/resource/esxi-system-storage-changes

    How to set up ESXi boot on UCS-M2-HWRAID?

    Create Disk Group Policies – Storage / Storage Policies / root / Disk Group Policies / M.2-RAID1

    Create Storage Profile – Storage / Storage Profiles / root / Storage Profile M.2-RAID1

    Create Local LUNs – Storage / Storage Profiles / root / Storage Profile M.2-RAID1

    Modify Storage Profile inside Service Profile

    Change Boot Order to Local Disk
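    After the service profile is associated and ESXi is installed, the RAID 1 volume presented by the UCS-M2-HWRAID controller should appear as a single local disk. A minimal check from the ESXi shell (device naming varies by firmware, so treat this as a sketch):

    esxcli storage core device list
    # Look for one local volume roughly the size of the mirrored M.2 modules ("Is Local: true") – that is the LUN defined in the storage profile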

    Links

    NSX-T Edge design guide for Cisco UCS

    How do you design NSX-T Edge inside Cisco UCS? I couldn’t find it in the Cisco Design Guide, but I found a useful topology in Dell EMC VxBlock™ Systems, the VMware® NSX-T Reference Design, and the NSX-T 3.0 Edge Design Step-by-Step UI workflow. Thanks DELL and VMware …

    VMware® NSX-T Reference Design

    • VDS Design update – New capability of deploying NSX on top of VDS with NSX
    • VSAN Baseline Recommendation for Management and Edge Components
    • VRF Based Routing and other enhancements
    • Updated security functionality
    • Design changes that go with VDS with NSX
    • Performance updates

    NSX-T 3.0 Edge Design Step-by-Step UI workflow

    This informal document walks through the step-by-step deployment and configuration workflow for the NSX-T Edge Single N-VDS Multi-TEP design. It uses the NSX-T 3.0 UI to show the workflow, which is broken down into the following three sub-workflows:

    1. Deploy and configure the Edge node (VM & BM) with Single-NVDS Multi-TEP.
    2. Preparing NSX-T for Layer 2 External (North-South) connectivity.
    3. Preparing NSX-T for Layer 3 External (North-South) connectivity.

    NSX-T Design with EDGE VM

    • Under Teamings – Add 2 Teaming Policies: one with the Active Uplink as “uplink-1” and the other with “uplink-2”.
    • Make a note of the policy names used, as we will use them in the next section. In this example they are “PIN-TO-TOR-LEFT” and “PIN-TO-TOR-RIGHT”.

    How to design NSX-T Edge inside Cisco UCS?

    Cisco Fabric Interconnects use port channels, and you need high bandwidth for the NSX-T Edge load.

    A C220 M5 could solve it.

    The edge node physical NIC definition includes the following (a quick vmnic mapping check follows the topology notes below):

    • VMNIC0 and VMNIC1: Cisco VIC 1457
    • VMNIC2 and VMNIC3: Intel XXV710 adapter 1 (TEP and Overlay)
    • VMNIC4 and VMNIC5: Intel XXV710 adapter 2 (N/S BGP Peering)
    NSX-T transport nodes with Cisco UCS C220 M5
    Logical topology of the physical edge host
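    Before assigning uplinks in NSX-T, it is worth confirming that the VIC and Intel ports enumerate as the expected vmnics on the edge host; the mapping above is an example and depends on PCIe slot placement. A quick check from the ESXi shell:

    esxcli network nic list
    # Match the VIC 1457 and XXV710 ports to vmnic0-5 by description and MAC before building the uplink profiles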

    Or for PoC or Lab – Uplink Eth Interfaces

    For a PoC or HomeLAB, we could use uplink Ethernet interfaces and create vNIC templates linked to these uplinks.

    Links:

    Driver for Cisco nenic 1.0.35.0 – Enabled Geneve Offload support

    Cisco Virtual Interface Card Native Ethernet Driver (nenic) 1.0.35.0 enables Geneve Offload support for VIC 14xx adapters.

    Bugs Fixed (since nenic 1.0.29.0):

    • CSCvw37990: Set Tx queue count to 1 in all cases except Netq
    • CSCvw39021: Add tx_budget mod param to nenic
    • CSCvo36323: Fix the issue of spurious drop counts with VIC 13XX in standalone rack servers.
    • CSCvq26550: Fix added in the driver to notify the VIC Firmware about any WQ/RQ errors.

    Dependencies:
    Cisco UCS Virtual Interface Card 1280 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1240 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1225 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1225T firmware version: 4.0
    Cisco UCS Virtual Interface Card 1285 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1380 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1385 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1387 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1340 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1227 firmware version: 4.0
    Cisco UCS Virtual Interface Card 1440 firmware version: 5.x
    Cisco UCS Virtual Interface Card 1455 firmware version: 5.x
    Cisco UCS Virtual Interface Card 1457 firmware version: 5.x
    Cisco UCS Virtual Interface Card 1480 firmware version: 5.x
    Cisco UCS Virtual Interface Card 1495 firmware version: 5.x
    Cisco UCS Virtual Interface Card 1497 firmware version: 5.x

    New Features:
    Enabled Geneve Offload support for VIC 14xx adapters

    More info:

    https://my.vmware.com/en/group/vmware/downloads/details?downloadGroup=DT-ESXI67-CISCO-NENIC-10350&productId=742
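    A minimal sketch for checking and updating the driver on an ESXi 6.7 host; the offline-bundle path and file name below are placeholders, so use the bundle downloaded from the link above:

    esxcli software vib list | grep nenic                 # currently installed nenic version
    esxcli network nic get -n vmnic0 | grep -i version    # driver version actually loaded for a VIC uplink
    esxcli software vib update -d /vmfs/volumes/datastore1/Cisco-nenic-1.0.35.0-offline_bundle.zip
    # reboot the host afterwards so the new driver is loaded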

    Cisco UCS M5 Boot Time Enhancements

    How do you speed up boot time on Cisco UCS M5 servers?

    Adaptive Memory Training drop-down list

    When this token is enabled, the BIOS saves the memory training results (optimized timing/voltage values) along with CPU/memory configuration information and reuses them on subsequent reboots to save boot time. The saved memory training results are used only if the reboot happens within 24 hours of the last save operation. This can be one of the following:

    • Disabled—Adaptive Memory Training is disabled.
    • Enabled—Adaptive Memory Training is enabled.
    • Platform Default—The BIOS uses the value for this attribute contained in the BIOS defaults for the server type and vendor.

    BIOS Techlog Level

    Enabling this token allows the BIOS Tech log output to be controlled at a more granular level, reducing the number of BIOS Tech log messages that are redundant or of little use. The option denotes the type of messages written to the BIOS tech log file and can be one of the following:

    • Minimum – Critical messages will be displayed in the log file.
    • Normal – Warning and loading messages will be displayed in the log file.
    • Maximum – Normal and information related messages will be displayed in the log file.

    Note: This option is mainly for internal debugging purposes.

    Note: To disable the Fast Boot option, the end user must set the following tokens as mentioned below:

    OptionROM Launch Optimization

    The Option ROM launch is controlled at the PCI Slot level, and is enabled by default. In configurations that consist of a large number of network controllers and storage HBAs having Option ROMs, all the Option ROMs may get launched if the PCI Slot Option ROM Control is enabled for all. However, only a subset of controllers may be used in the boot process. When this token is enabled, Option ROMs are launched only for those controllers that are present in boot policy. This can be one of the following:

    • Disabled—OptionROM Launch Optimization is disabled.
    • Enabled—OptionROM Launch Optimization is enabled.
    • Platform Default—The BIOS uses the value for this attribute contained in the BIOS defaults for the server type and vendor.

    Results

    The first boot after applying the new settings takes about 1-2 minutes longer.

    From the second boot onward, we save about 2 minutes per boot on a B480 M5 with 3 TB RAM:

    Multiple-NIC vMotion tuning with 2x 40 Gbps

    For monster SAP HANA VMs (1-3 TB RAM), I tuned several advanced system settings (AdvSystemSettings).

    In the end I was able to speed up vMotion 4x and utilize two 40 Gbps flows – VIC 1340 with Port Expander (PE).

    Inspiration was:

    It has been in production since 04/2018. My final tuned settings are:

    • Migrate.VMotionStreamHelpers (default 0, tuned 8): Number of helpers to allocate for VMotion streams
    • Net.NetNetqTxPackKpps (default 300, tuned 600): Max TX queue load (in thousand packets per second) to allow packing on the corresponding RX queue
    • Net.NetNetqTxUnpackKpps (default 600, tuned 1200): Threshold (in thousand packets per second) for TX queue load to trigger unpacking of the corresponding RX queue
    • Net.MaxNetifTxQueueLen (default 2000, tuned 10000): Maximum length of the Tx queue for the physical NICs (this alone is enough to speed up VM communication)
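    A minimal sketch for applying the tuned values with esxcli; the option paths mirror the advanced setting names above (verify they exist on your build with "esxcli system settings advanced list" first):

    esxcli system settings advanced set -o /Migrate/VMotionStreamHelpers -i 8
    esxcli system settings advanced set -o /Net/NetNetqTxPackKpps -i 600
    esxcli system settings advanced set -o /Net/NetNetqTxUnpackKpps -i 1200
    esxcli system settings advanced set -o /Net/MaxNetifTxQueueLen -i 10000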