UCS Archives

Security Enhancements in Cisco UCS Release 4.3(5a): Key Vulnerability Fixes

With the Cisco UCS 4.3(5a) release, Cisco addresses multiple critical security vulnerabilities impacting UCS Manager, Fabric Interconnects, and compute nodes. These updates are essential for maintaining secure infrastructure as they address several vulnerabilities within third-party software dependencies. Here’s a rundown of the significant security fixes provided in this release.

1. CSCwb81661: Vulnerabilities in OpenSSL

Cisco UCS Manager now includes critical fixes for three vulnerabilities related to OpenSSL, which, if exploited, could impact cryptographic functions, certificate parsing, and script handling. Here’s a closer look at each vulnerability:

CVE-2021-4160: A carry propagation bug in the OpenSSL squaring procedure on MIPS32 and MIPS64 platforms could potentially allow attacks against Diffie-Hellman (DH) cryptographic operations, mainly in scenarios where DH private keys are shared. Cisco has mitigated this vulnerability by upgrading to a fixed OpenSSL version; the issue is addressed in OpenSSL 1.1.1m, 3.0.1, and later releases.
CVE-2022-0778: The BN_mod_sqrt() function in OpenSSL could enter an infinite loop when parsing certain elliptic curve certificates, leading to a denial of service (DoS). This could be triggered in scenarios involving TLS and certificate handling. Cisco’s update mitigates this risk with fixes provided in OpenSSL 1.1.1n, 3.0.2, and later versions.
CVE-2022-1292: A vulnerability in the c_rehash script could allow attackers to inject commands through improperly sanitized shell metacharacters. This script, now considered obsolete, is replaced by OpenSSL’s rehash command-line tool, which Cisco has now included to remove this potential risk.
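
As a quick first-pass check, the fixed versions listed above can be compared against an OpenSSL version string. The following is only a minimal sketch: the function names are my own, it covers just the two CVEs with fixed versions named above, and it assumes stock OpenSSL version numbering (vendor builds may carry backported fixes without changing the version).

```python
import re

# Fixed releases per OpenSSL branch, as listed in the text above.
FIXED_IN = {
    "CVE-2021-4160": {(1, 1, 1): "1.1.1m", (3, 0): "3.0.1"},
    "CVE-2022-0778": {(1, 1, 1): "1.1.1n", (3, 0): "3.0.2"},
}


def parse_version(text):
    """Turn '1.1.1m' or '3.0.2' into a comparable tuple (major, minor, patch, letter)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized OpenSSL version: {text!r}")
    major, minor, patch, letter = match.groups()
    # Letter suffixes (1.1.1a, 1.1.1b, ...) sort after the bare release.
    letter_rank = ord(letter) - ord("a") + 1 if letter else 0
    return (int(major), int(minor), int(patch), letter_rank)


def has_fix(version, cve):
    """Return True/False for a listed branch, or None if the branch is not listed."""
    parsed = parse_version(version)
    for branch, fixed in FIXED_IN[cve].items():
        if parsed[: len(branch)] == branch:
            return parsed >= parse_version(fixed)
    return None


for version in ("1.1.1k", "1.1.1n", "3.0.1", "1.0.2u"):
    print(version, {cve: has_fix(version, cve) for cve in FIXED_IN})
```

A result of None means the branch is not covered by the list above, so the sketch makes no claim about it.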

2. CSCwk62264: OpenSSH Security Regression (CVE-2024-6387)

Cisco UCS 6400 and 6500 Series Fabric Interconnects, when operated in UCS Manager mode, were vulnerable to a security regression in the OpenSSH server (sshd). This flaw, a regression of an older race condition vulnerability (CVE-2006-5051), can lead sshd to handle some signals in an unsafe manner. An unauthenticated, remote attacker may be able to trigger it by failing to authenticate within a set time period.

Cisco’s Solution: Cisco has patched OpenSSH within the UCS software stack, reinforcing the security of SSH operations in Fabric Interconnects to eliminate this race condition vulnerability.
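
To illustrate what "regression" means here, the sketch below checks whether an SSH banner falls into the upstream version range that reintroduced the race condition. The range (8.5p1 up to, but not including, 9.8p1) comes from the upstream OpenSSH advisory, not from the Cisco release notes, and the function names are my own. It does not cover very old releases (before 4.4p1), and a banner check cannot see vendor backports, so treat it as a rough triage aid only.

```python
import re

# Upstream range affected by CVE-2024-6387 (assumption: stock OpenSSH
# version numbering; vendor builds may carry backported fixes).
FIRST_AFFECTED = (8, 5)
FIRST_FIXED = (9, 8)


def banner_version(banner):
    """Extract (major, minor) from a banner such as 'SSH-2.0-OpenSSH_8.9p1'."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def in_regression_range(banner):
    """True if the banner version lies in the regression range, None if not OpenSSH."""
    version = banner_version(banner)
    if version is None:
        return None
    return FIRST_AFFECTED <= version < FIRST_FIXED


for banner in ("SSH-2.0-OpenSSH_8.9p1", "SSH-2.0-OpenSSH_9.8", "SSH-2.0-OpenSSH_7.4"):
    print(banner, in_regression_range(banner))
```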

3. CSCwk62723: OpenSSH Vulnerability in Serial over LAN (SOL) on Blade Servers and Compute Nodes

The UCS B-Series Blade Servers and X-Series Compute Nodes were also affected by the same OpenSSH race condition issue. This vulnerability could impact serial-over-LAN connections, which are essential for remote console access.

Cisco’s Solution: The update fixes this vulnerability by integrating the latest OpenSSH patch, safeguarding remote management sessions on UCS blade servers and compute nodes.

4. CSCwk75392: libexpat Denial of Service Vulnerability (CVE-2023-52425)

This vulnerability in libexpat could lead to a denial of service (DoS) attack due to resource-intensive parsing of large tokens. Cisco UCS Manager, which relies on libexpat for XML parsing, could be affected in scenarios that involve extensive parsing requirements.

Cisco’s Solution: The upgraded libexpat version now included in UCS Manager resolves this DoS vulnerability, ensuring that XML parsing operations are resilient against such resource exhaustion attacks.

Links: Release Notes for Cisco UCS Manager, Release 4.3

Cisco UCS Release 4.3(4a) Update: PSU view displays Power: Error BUG CSCwj01478

Cisco’s Unified Computing System (UCS) continues to innovate in providing a powerful platform for data center infrastructure, but even with frequent updates, occasional bugs arise. In the recent Cisco UCS release 4.3(4a), a notable issue surfaced with the deployment of the Cisco UCSX-9508 chassis, leading to a power status error in the Cisco UCS Manager (UCSM) interface.

This issue, identified as bug CSCwj01478, causes the PSU view in UCSM to display “Power: Error” and “Input Source: Unknown,” which does not reflect the actual physical status of the hardware.

Symptom Overview

Users deploying the Cisco UCSX-9508 chassis, integrated with Cisco UCS 6454 Fabric Interconnects (FI) and the Cisco UCS 9108 25G Intelligent Fabric Module (IFM), may encounter the following symptoms:

Power Display Issue: In UCSM, the chassis power status shows as “Power: Error” and “Input Source: Unknown,” even though the physical server PSU LEDs are green, indicating normal operation.
Persistent Faults: Major faults associated with the power status do not resolve within UCSM, even after attempts to decommission and re-acknowledge the chassis.

While these symptoms do not affect the physical functionality of the power supply, the discrepancy in UCSM’s display can complicate administrative monitoring and fault management workflows.

Summary: As a personal workaround, I set the FAN policy to MAX, which temporarily resolved the issue. However, the release of 4.3(4a) has now provided an official solution.

Links: Release Notes for Cisco UCS Manager, Release 4.3

Cisco UCS Manager Release 4.3(4a): New Optimized Adapter Policies for VIC Series Adapters

Starting with Cisco UCS Manager release 4.3(4a), Cisco has introduced optimized adapter policies for Windows, Linux, and VMware operating systems, including a new policy for VMware environments called “VMware-v2.” This update affects the Cisco UCS VIC 1400, 14000, and 15000 series adapters, promising improved performance and flexibility.

This release is particularly interesting for those managing VMware infrastructures, as many organizations—including ours—have been using similar settings for years. However, one notable difference is that the default configuration in the new policy sets Interrupts to 11, while in our environment, we’ve historically set it to 12.

Key Enhancements in UCS 4.3(4a)

Optimized Adapter Policies: The new “VMware-v2” policy is tailored to enhance performance in VMware environments, specifically for the Cisco UCS VIC 1400, 14000, and 15000 adapters. It adjusts parameters such as the number of interrupts, queue depths, and receive/transmit buffers to achieve better traffic handling and lower latency.
Receive Side Scaling (RSS): A significant feature available on the Cisco UCS VIC series is Receive Side Scaling (RSS). RSS is crucial for servers handling large volumes of network traffic as it allows the incoming network packets to be distributed across multiple CPU cores, enabling parallel processing. This distribution improves the overall throughput and reduces bottlenecks caused by traffic being handled by a single core. In high-performance environments like VMware, this can lead to a noticeable improvement in network performance. RSS is enabled on a per-vNIC basis, meaning administrators have granular control over which virtual network interfaces benefit from the feature. Given the nature of modern server workloads, enabling RSS on vNICs handling critical traffic can substantially improve performance, particularly in environments with multiple virtual machines.
Maximizing Ring Size: Another important recommendation for administrators using the VIC 1400 adapters is to set the ring size to the maximum, which for these adapters is 4096. The ring size determines how much data can be queued for processing by the NIC (Network Interface Card) before being handled by the CPU. A larger ring size allows for better performance, especially when dealing with bursts of high traffic. In environments where high throughput and low latency are critical, setting the ring size to its maximum value ensures that traffic can be handled more efficiently, reducing the risk of packet drops or excessive buffering.
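
To make the RSS idea above more concrete, here is a toy sketch of how flows get spread across receive queues. It is not how the VIC hardware computes its hash: CRC32 is used only as a stand-in, and the flow list, queue count, and function name are made up for illustration. The point is simply that every packet of one flow lands on the same queue, while different flows are distributed across queues (and therefore CPU cores).

```python
import zlib
from collections import Counter


def rss_queue(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Pick a receive queue for a flow. CRC32 stands in for the NIC's hash."""
    flow = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(flow) % num_queues


# Hypothetical traffic: 1000 client flows towards one HTTPS service.
flows = [(f"10.0.0.{i % 200 + 1}", 40000 + i, "10.0.1.10", 443) for i in range(1000)]
per_queue = Counter(rss_queue(*flow, num_queues=8) for flow in flows)

for queue in sorted(per_queue):
    print(f"queue {queue}: {per_queue[queue]} flows")
```

With a single queue (RSS effectively off), every flow maps to queue 0, which is the single-core bottleneck described above.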


Exciting Update: Cisco Unveils UCS Manager VMware vSphere 8U2 HTML Client Plugin Version 4.0(0)

I am thrilled to share my experience with the latest UCSM-plugin 4.0 for VMware vSphere 8U2, a remarkable tool that has significantly enhanced our virtualization management capabilities. I have tested its functionality across an extensive network of approximately 13 UCSM domains and 411 ESXi 8U2 hosts. A notable instance of its efficacy was observed with Alert F1236, where the Proactive HA feature seamlessly transitioned the Blade into Quarantine mode, showcasing the plugin’s advanced automation capabilities.

However, I did encounter a challenge with the configuration of Custom Alerts, particularly Alert F1705. Despite my efforts, Proactive HA failed to activate, suggesting a potential misconfiguration on my part. To streamline this process, I propose the integration of Alert F1705 into the default alert settings, thereby simplifying the setup and ensuring more efficient system monitoring.

The release of Cisco’s 4.0(0) version of the UCS Manager VMware vSphere 8U2 HTML remote client plugin marks a significant advancement in the field of virtualization administration. This plugin not only offers a comprehensive physical view of the UCS hardware inventory through the HTML client but also enhances the overall management and monitoring of the Cisco UCS physical infrastructure.

Key functionalities provided by this plugin include:

Detailed Physical Hierarchy View: Gain a clear understanding of the Cisco UCS physical structure.
Comprehensive Inventory Insights: Access detailed information on inventory, installed firmware, faults, and power and temperature statistics.
Physical Server to ESXi Host Mapping: Easily correlate your ESXi hosts with their corresponding physical servers.
Firmware Management: Efficiently manage firmware for both B and C series servers.
Direct Access to Cisco UCS Manager GUI: Launch the Cisco UCS Manager GUI directly from the plugin.
KVM Console Integration: Instantly launch KVM consoles of UCS servers for immediate access and control.
Locator LED Control: Switch the state of the locator LEDs as needed for enhanced hardware identification.
Proactive HA Fault Configuration: Customize and configure faults used in Proactive HA for improved system resilience.

Links

Detailed Release Notes

Software download link

Please see the User Guide for specific information on installing and using the plugin with the vSphere HTML client.

Add F1705 Alert to Cisco UCS Manager Plugin 4.0(0)

New Cisco UCS firmware makes it possible to receive notifications about F1705 alerts (Rank VLS).

In the latest version of the Cisco UCS Manager Plugin for VMware vSphere HTML Client (Version 4.0(0)), we can add a custom fault for Proactive HA monitoring. How do you do it?

Cisco UCS / Proactive HA Registration / Registered Fault / Add / ADDDC_Memory_Rank_VLS

If you can’t add it, you need to unregister the Cisco UCS Manager plugin first.

Cisco UCS / Proactive HA Registration / vCenter server credentials / Register

Cisco UCS / Proactive HA Registration / Register

How can I check it? Edit Proactive HA / Providers

It is better to use the name “ADDDC_Memory_Rank_VLS” without spaces. In my screenshot I used “My F1705 Alerts”.

Adding a custom alert is only possible while the Cisco UCS provider is unregistered, so it is best to do it immediately after installing the Cisco UCS Manager plugin.

Now I can decide whether to block F1705 or not. I personally prefer to have the F1705 alert under Proactive HA. Then I only restart blades with F1705. During reboot, Hard-PPR permanently remaps accesses from a designated faulty row to a designated spare row.


Downfall CVE-2022-40982 – Intel’s Downfall Mitigation

Security Fixes in Release 4.3(2b)

Defect ID – CSCwf30468

Cisco UCS M5 C-series servers are affected by vulnerabilities identified by the following Common Vulnerability and Exposures (CVE) IDs:

CVE-2022-40982—Information exposure through microarchitectural state after transient execution in certain vector execution units for some Intel® Processors may allow an authenticated user to potentially enable information disclosure through local access

CVE-2022-43505—Insufficient control flow management in the BIOS firmware for some Intel® Processors may allow a privileged user to potentially enable denial of service through local access

https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/gather-data-sampling.html

Workaround: EVC Intel “Broadwell” Generation or gather_data_sampling

wget https://raw.githubusercontent.com/speed47/spectre-meltdown-checker/master/spectre-meltdown-checker.sh
sh spectre-meltdown-checker.sh --variant downfall --explain

EVC Intel “Skylake” Generation

CVE-2022-40982 aka 'Downfall, gather data sampling (GDS)'
> STATUS:  VULNERABLE  (Your microcode doesn't mitigate the vulnerability, and your kernel doesn't support mitigation)
> SUMMARY: CVE-2022-40982:KO

EVC Intel “Broadwell” Generation

CVE-2022-40982 aka 'Downfall, gather data sampling (GDS)'
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)
> SUMMARY: CVE-2022-40982:OK
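
If you run the checker on many VMs, the SUMMARY line is the easiest part to process automatically. Below is a minimal sketch that turns it into a dictionary. It assumes plain-text output without color codes and the "CVE:OK" / "CVE:KO" format shown above; the function name is my own.

```python
import re

SUMMARY_RE = re.compile(r"^>\s*SUMMARY:\s*(.+)$", re.MULTILINE)


def parse_summary(output):
    """Map each CVE in the SUMMARY line(s) to True (OK) or False (KO)."""
    results = {}
    for line in SUMMARY_RE.findall(output):
        for item in line.split():
            cve, _, verdict = item.rpartition(":")
            results[cve] = verdict == "OK"
    return results


# Sample outputs copied from the two EVC runs above.
skylake = """CVE-2022-40982 aka 'Downfall, gather data sampling (GDS)'
> STATUS:  VULNERABLE  (Your microcode doesn't mitigate the vulnerability, and your kernel doesn't support mitigation)
> SUMMARY: CVE-2022-40982:KO
"""
broadwell = """CVE-2022-40982 aka 'Downfall, gather data sampling (GDS)'
> STATUS:  NOT VULNERABLE  (your CPU vendor reported your CPU model as not affected)
> SUMMARY: CVE-2022-40982:OK
"""

print("Skylake EVC:  ", parse_summary(skylake))
print("Broadwell EVC:", parse_summary(broadwell))
```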

Mitigation with an updated kernel

When a microcode update is not available via a firmware update package, you may update the kernel to a version that implements a way to shut off AVX instruction set support. This can be achieved by adding the following kernel command line parameter:

gather_data_sampling=force
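
After rebooting with this parameter, a recent Linux kernel reports the resulting state in sysfs. The sketch below reads that file and classifies the result. The sysfs path and the status prefixes ("Mitigation", "Vulnerable", "Not affected", "Unknown") are as I understand them from the Linux kernel documentation, not from the Cisco or Intel notes, and the function names are my own.

```python
from pathlib import Path

GDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def gds_status(path=GDS_SYSFS):
    """Return the kernel's GDS status string, or None if the file is absent."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def is_mitigated(status):
    """True for 'Mitigation: ...' or 'Not affected', False if vulnerable, None if unknown."""
    if status is None or status.startswith("Unknown"):
        return None
    return status.startswith("Mitigation") or status == "Not affected"


status = gds_status()
print("kernel reports:", status, "-> mitigated:", is_mitigated(status))
```

On kernels that predate GDS support the file does not exist, and the sketch returns None rather than guessing.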

Mitigation Options

When the mitigation is enabled, there is additional latency before results of the gather load can be consumed. Although the performance impact to most workloads is minimal, specific workloads may show performance impacts of up to 50%. Depending on their threat model, customers can decide to opt-out of the mitigation.

Intel® Software Guard Extensions (Intel® SGX)

There will be an Intel SGX TCB Recovery for those Intel SGX-capable affected processors. This TCB Recovery will only attest as up-to-date when the patch has been FIT-loaded (for example, with an updated BIOS), Intel SGX has been enabled by BIOS, and hyperthreading is disabled. In this configuration, the mitigation will be locked to the enabled state. If Intel SGX is not enabled or if hyperthreading is enabled, the mitigation will not be locked, and system software can choose to enable or disable the GDS mitigation.

How to run Secure Boot Validation Script on an ESXi Host

Help for validation script:

/usr/lib/vmware/secureboot/bin/secureBoot.py -h
usage: secureBoot.py [-h] [-a | -c | -s]

optional arguments:
  -h, --help            show this help message and exit
  -a, --acceptance-level-check
                        Validate acceptance levels for installed vibs
  -c, --check-capability
                        Check if the host is ready to enable secure boot
  -s, --check-status    Check if UEFI secure boot is enabled

Check if the host is ready to enable secure boot

/usr/lib/vmware/secureboot/bin/secureBoot.py -c
Secure boot can be enabled: All vib signatures verified. All tardisks validated. All acceptance levels validated

Check if UEFI secure boot is disabled

/usr/lib/vmware/secureboot/bin/secureBoot.py -s
Disabled

Create Cisco UCS Boot Policy

Check if UEFI secure boot is enabled and working

/usr/lib/vmware/secureboot/bin/secureBoot.py -s
Enabled
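
For checking many hosts, the status call above can be wrapped in a few lines of Python. This is only a sketch: it assumes the script prints exactly "Enabled" or "Disabled" as in the runs above, that a Python 3.7+ interpreter is available where it runs, and the function names are my own. Only the parsing part is exercised here; secure_boot_enabled() is meant to be called on the ESXi host itself.

```python
import subprocess

SECUREBOOT_SCRIPT = "/usr/lib/vmware/secureboot/bin/secureBoot.py"


def interpret_status(output):
    """Map the script's -s output ('Enabled' / 'Disabled') to a boolean."""
    text = output.strip()
    if text == "Enabled":
        return True
    if text == "Disabled":
        return False
    raise ValueError(f"unexpected secureBoot.py output: {text!r}")


def secure_boot_enabled():
    """Run secureBoot.py -s on the ESXi host and return True if enabled."""
    result = subprocess.run(
        [SECUREBOOT_SCRIPT, "-s"], capture_output=True, text=True, check=True
    )
    return interpret_status(result.stdout)


# Sample outputs copied from the runs above.
print(interpret_status("Disabled\n"))
print(interpret_status("Enabled\n"))
```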

vSphere Secure Boot

Field Notice: FN – 72368 – Some DIMMs Might Fail Prematurely Due to a Manufacturing Deviation – Hardware Upgrade Available

Cisco announced Field Notice: FN – 72368 – Some DIMMs Might Fail Prematurely Due to a Manufacturing Deviation – Hardware Upgrade Available

My personal recommendation: please use ADDDC and PPR, which could prevent hardware failures … UCS-ML-128G4RT-H is in its 2nd revision as of 28-Oct-22.

Problem Description

A limited number of DIMMs shipped from Cisco are impacted by a known deviation in the memory supplier’s manufacturing process. This deviation might result in a higher rate of failure.

Background

DIMM manufacturers compose their DIMMs of multiple memory modules to reach the desired capacity. A 16GB DIMM might be composed of the same modules that a 32GB DIMM is composed of. In this case, a manufacturing deviation in specific modules impacts 16GB, 32GB, 64GB, and 128GB DIMMs. This deviation was contained to a specific date range, and the DIMMs which use these chips were manufactured during the middle to end of 2020. Since the discovery of this deviation, additional limits have been imposed on the manufacturing process to ensure that future DIMMs are not exposed to this process variation.

Problem Symptom

Most DIMMs with this manufacturing deviation will exhibit persistent correctable memory errors. If left untreated, the DIMMs might eventually encounter an uncorrectable memory event. If encountered during runtime, uncorrectable errors will cause a sudden unexpected server reset. If encountered during Power-On Self-Test (POST), the DIMM will be mapped out and the total available memory reduced. In some cases a boot error might be seen.

Various DIMM Reliability, Availability, and Serviceability (RAS) features or even operating system features might mask the extent of these correctable errors. It is recommended to check your DIMMs for exposure using the Serial Number Validation Tool described in the Serial Number Validation section of this field notice. Only specific DIMMs are impacted by this issue, so do not rely solely on the DIMM error count to judge exposure.

Workaround/Solution

This is a hardware failure. A replacement is strongly recommended in order to avoid potential for unexpected server failure.

How to Boot ESXi 7.0 on UCS-M2-HWRAID Boot-Optimized M.2 RAID Controller

VMware strongly advises that you move away completely from using SD card/USB as a boot device option on any future server hardware.

Removal of SD card/USB as a standalone boot device option (85685)

SD cards can continue to be used for the bootbank partition provided that a separate persistent local device to store the OSDATA partition (32GB min., 128GB recommended) is available in the host.
Preferably, the SD cards should be replaced with an M.2 or another local persistent device as the standalone boot option.

vSphere 7 – ESXi System Storage Changes

Please refer to the following blog:
https://core.vmware.com/resource/esxi-system-storage-changes

How to set up ESXi boot on UCS-M2-HWRAID?

Create Disk Group Policies – Storage / Storage Policies / root / Disk Group Policies / M.2-RAID1

Create Storage Profile – Storage / Storage Profiles / root / Storage Profile M.2-RAID1

Create Local LUNs – Storage / Storage Profiles / root / Storage Profile M.2-RAID1

Modify Storage Profile inside Service Profile

Change Boot Order to Local Disk

Add F1705 Alert to Cisco UCS Manager Plugin

New Cisco UCS firmware makes it possible to receive notifications about F1705 alerts (Rank VLS).

In the latest version of the Cisco UCS Manager Plugin for VMware vSphere HTML Client (Version 3.0(6)), we can add a custom fault for Proactive HA monitoring. How do you do it?

Cisco UCS / Proactive HA Registration / Fault monitoring details / Add / ADDDC_Memory_Rank_VLS

Cisco UCS / Proactive HA Registration / vCenter server credentials / Register

How can I check it? Edit Proactive HA / Providers

Adding a custom alert is only possible while the Cisco UCS provider is unregistered, so it is best to do it immediately after installing the Cisco UCS Manager plugin.
