Field Notice: FN – 72368 – Some DIMMs Might Fail Prematurely Due to a Manufacturing Deviation – Hardware Upgrade Available

Cisco has announced Field Notice FN – 72368: Some DIMMs Might Fail Prematurely Due to a Manufacturing Deviation – Hardware Upgrade Available.

My personal recommendation: please use ADDDC (Adaptive Double Device Data Correction) and PPR (Post Package Repair). They could prevent hardware failures … UCS-ML-128G4RT-H is in the 2nd revision from 28-Oct-22.

Problem Description

A limited number of DIMMs shipped from Cisco are impacted by a known deviation in the memory supplier’s manufacturing process. This deviation might result in a higher rate of failure.

Background

DIMM manufacturers compose their DIMMs of multiple memory modules to reach the desired capacity. A 16GB DIMM might be composed of the same modules that a 32GB DIMM is composed of. In this case, a manufacturing deviation in specific modules impacts 16GB, 32GB, 64GB, and 128GB DIMMs. This deviation was contained to a specific date range, and the DIMMs which use these chips were manufactured during the middle to end of 2020. Since the discovery of this deviation, additional limits have been imposed on the manufacturing process to ensure that future DIMMs are not exposed to this process variation.

Problem Symptom

Most DIMMs with this manufacturing deviation will exhibit persistent correctable memory errors. If left untreated, the DIMMs might eventually encounter an uncorrectable memory event. If encountered during runtime, uncorrectable errors will cause a sudden unexpected server reset. If encountered during Power-On Self-Test (POST), the DIMM will be mapped out and the total available memory reduced. In some cases a boot error might be seen.

Various DIMM Reliability, Availability, and Serviceability (RAS) features or even operating system features might mask the extent of these correctable errors. It is recommended to check your DIMMs for exposure using the Serial Number Validation Tool described in the Serial Number Validation section of this field notice. Only specific DIMMs are impacted by this issue, so do not rely solely on the DIMM error count to judge exposure.
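As a rough illustration of why the error count alone is not a reliable signal, the following Python sketch cross-checks a DIMM inventory against a list of affected serial numbers. Everything here is an assumption made for illustration: the `Dimm` record, the `find_exposed` helper, and the placeholder serial numbers are hypothetical, and the affected list is assumed to have been obtained separately from the Serial Number Validation Tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimm:
    """Hypothetical inventory record; field names are illustrative only."""
    server: str
    slot: str
    serial: str
    correctable_errors: int


def find_exposed(inventory, affected_serials):
    """Return DIMMs whose serial is in the affected set.

    The correctable error count is deliberately ignored: RAS or OS
    features might mask errors, so a count of zero does not rule out exposure.
    """
    affected = {s.strip().upper() for s in affected_serials}
    return [d for d in inventory if d.serial.strip().upper() in affected]


# Placeholder data, not real serial numbers.
inventory = [
    Dimm("server-a", "DIMM_A1", "PLACEHOLDER0001", 0),
    Dimm("server-a", "DIMM_B1", "PLACEHOLDER0002", 57),
    Dimm("server-b", "DIMM_A1", "placeholder0003", 0),
]
affected_serials = ["PLACEHOLDER0001", "PLACEHOLDER0003"]

exposed = find_exposed(inventory, affected_serials)
for d in exposed:
    print(f"{d.server} {d.slot} {d.serial} (correctable errors: {d.correctable_errors})")
```

In this made-up inventory, both flagged DIMMs report zero correctable errors, while the DIMM with the highest error count is not on the affected list.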

Workaround/Solution

This is a hardware failure. A replacement is strongly recommended in order to avoid the potential for an unexpected server failure.

Quick Tip – Automating ESXi 8.0 install using…


For those looking to install ESXi 8.0 on a system with an unsupported CPU, the kernel boot option allowLegacyCPU=true can be added to bypass the installer pre-check, as shown in the screenshot below. When the ESXi installer bypass happens, instead of an error which forces you to reboot,[…]
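For an automated install, one possible way to apply this option is to append it to the kernelopt= line of the installer's boot.cfg. The Python sketch below is only an illustration under that assumption: the `add_kernel_option` helper is hypothetical, and the sample boot.cfg content is a trimmed, illustrative example rather than a complete file.

```python
def add_kernel_option(boot_cfg_text, option):
    """Append a kernel option to the kernelopt= line of boot.cfg text.

    If the option is already present, the text is returned unchanged.
    If no kernelopt= line exists, one is added at the end.
    """
    prefix = "kernelopt="
    out = []
    found = False
    for line in boot_cfg_text.splitlines():
        if line.startswith(prefix):
            found = True
            current = line[len(prefix):].split()
            if option not in current:
                current.append(option)
            line = prefix + " ".join(current)
        out.append(line)
    if not found:
        out.append(prefix + option)
    return "\n".join(out) + "\n"


# Trimmed, illustrative boot.cfg content (assumption, not a full file).
sample = (
    "bootstate=0\n"
    "title=Loading ESXi installer\n"
    "kernel=/b.b00\n"
    "kernelopt=runweasel\n"
    "modules=/jumpstrt.gz --- /useropts.gz\n"
)

patched = add_kernel_option(sample, "allowLegacyCPU=true")
print(patched)
```

Running the helper a second time on the patched text leaves it unchanged, which makes it safe to use in a repeatable build step.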


VMware Social Media Advocacy

Quick Tip – Accessing new custom theme editor…


While looking over the vSphere 8 release notes, I noticed there were also release notes for a new version of the ESXi Embedded Host Client, which is an HTML5 UI for accessing a standalone ESXi host. The really interesting feature that stood out to me was the following: Ability to change the […]



vSphere with Tanzu – new TKG 2.0 ClusterClass…

vSphere with Tanzu – new TKG 2.0 ClusterClass Preview – CormacHogan.com


One of the key features of the TKG 2.0 on vSphere 8 announcement at VMware Explore 2022 is the consolidation of our Tanzu Kubernetes offerings into a single unified Kubernetes runtime. This can be considered the second edition of VMware Tanzu Kubernetes Grid. It will still come in two flavors.

