SU554: [Impact: Critical] HUH721212AL5204 firmware to prevent data disruption or unavailability

Last Updated:
2024/5/7 19:52:17


Summary

[Impact: Critical = cluster data outage.]

NetApp® has identified that the drive model in the table below fails at a much higher rate than other drives shipped by NetApp. NetApp has released a drive firmware update that mitigates the issue. The firmware was released on January 8, 2024, and is available from the E-Series Disk Firmware page on the NetApp Support site. NetApp strongly recommends performing this upgrade as soon as possible to avoid potential disruption.

Part Number   Drive Identifier   Capacity   Firmware
E-X4131A      HUH721212AL5204    12TB       NE02
E-X4132A      HUH721212AL5204    12TB       NE02
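
As a rough illustration of how the table can be applied to a drive inventory, the sketch below flags drives of the affected model that run firmware older than NE02. The inventory entries, the "NE01" revision, and the function name are hypothetical, and the sketch assumes that firmware revisions for this model sort lexicographically; verify that assumption against the actual revisions on the system.

```python
# Affected drive model and fixed firmware level, taken from the table above.
AFFECTED_MODEL = "HUH721212AL5204"
FIXED_FIRMWARE = "NE02"


def needs_upgrade(model: str, firmware: str) -> bool:
    """Return True for an affected-model drive running firmware older than the fix.

    Assumption: firmware revisions of this model sort lexicographically
    (for example "NE01" < "NE02"). This is not confirmed by the bulletin.
    """
    if model.strip().upper() != AFFECTED_MODEL:
        return False
    return firmware.strip().upper() < FIXED_FIRMWARE


# Hypothetical inventory rows: (location, drive identifier, firmware revision).
inventory = [
    ("Shelf 0, Drawer 4, Bay 10", "HUH721212AL5204", "NE01"),
    ("Shelf 0, Drawer 4, Bay 11", "HUH721212AL5204", "NE02"),
    ("Shelf 0, Drawer 1, Bay 0", "OTHERMODEL", "AB01"),
]

at_risk = [loc for loc, model, fw in inventory if needs_upgrade(model, fw)]
print(at_risk)
```

The model check comes first so that drives of other models are never flagged, whatever their firmware string looks like.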

Issue Description

Drives running firmware versions earlier than those listed in the table above are at risk of a higher-than-expected failure rate. Failures can present as media errors or write failures. If multiple drives fail simultaneously, some failures can lead to data loss, disruption, or unavailability:

  • In a multiple drive failure scenario, RAID limits may be exceeded, in which case a Volume Group would go Offline (or fail), and data in cache could be lost.
  • In a single drive failure scenario, a drive will likely be failed for degraded performance. This would result in a degraded volume group.
  • Note: In any event where more drives are impacted than the RAID level can tolerate, immediate engagement with Technical Support is strongly recommended.
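
The relationship between failed drives and RAID tolerance described above can be sketched as follows. This is a simplified model, not SANtricity behavior: it covers only RAID 0, RAID 5, and RAID 6 with their standard fault tolerances, leaves out mirrored levels and pools, and the state names returned are illustrative labels.

```python
# Simultaneous drive failures a volume group survives, by RAID level.
# Standard RAID properties; mirrored levels and pools are intentionally omitted.
RAID_TOLERANCE = {"RAID0": 0, "RAID5": 1, "RAID6": 2}


def volume_group_state(raid_level: str, failed_drives: int) -> str:
    """Classify a volume group by comparing failed drives with RAID tolerance."""
    tolerance = RAID_TOLERANCE[raid_level]
    if failed_drives == 0:
        return "optimal"
    if failed_drives <= tolerance:
        return "degraded"
    # More failures than the RAID level tolerates: the group goes offline.
    return "failed"


print(volume_group_state("RAID6", 1))
print(volume_group_state("RAID5", 2))
```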

Symptom

  • The events below can be found in the major-event-log.txt file or in SANtricity System Manager under Support > Events:

B:12/18/23, 10:27:29 AM (10:27:29) 27591 100a Drive returned CHECK CONDITION - Shelf 0, Drawer 4, Bay 10
----> Sense 1/b/97 = Recovered Error - Error Recovery Procedure Warning Low - CDB: 0x7f(0xb) = Write(32) - LBA: ~0x31cd4f400

B:12/18/23, 10:27:29 AM (10:27:29) 27588 1016 Drive returned unrecoverable media error - Shelf 0, Drawer 4, Bay 10
----> Sense 3/11/0 = Medium Error - Unrecovered read error - CDB: 0x7f(0x9) = Read(32) - LBA: ~0x5538bb000

  • Some drives are marked as being in an impending failure state and then copied to a hot spare or to preservation capacity.
  • Others may fail due to write failures.
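
To look for the events shown above in an exported major-event-log.txt, a small script along these lines may help. It is a minimal sketch that assumes event lines follow exactly the "Drive returned ... - Shelf N, Drawer N, Bay N" layout of the two samples; systems without drawers or with a different export format would need a different pattern. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches the event header lines shown in the samples above.
# Assumption: every relevant line carries Shelf, Drawer, and Bay in this order.
EVENT_RE = re.compile(
    r"Drive returned (?P<kind>.+?) - "
    r"Shelf (?P<shelf>\d+), Drawer (?P<drawer>\d+), Bay (?P<bay>\d+)"
)


def count_drive_events(log_text: str) -> Counter:
    """Count 'Drive returned ...' events per (shelf, drawer, bay) and event kind."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            location = (
                int(match["shelf"]),
                int(match["drawer"]),
                int(match["bay"]),
            )
            counts[(location, match["kind"])] += 1
    return counts


# The two sample events from this bulletin.
sample = """\
B:12/18/23, 10:27:29 AM (10:27:29) 27591 100a Drive returned CHECK CONDITION - Shelf 0, Drawer 4, Bay 10
----> Sense 1/b/97 = Recovered Error - Error Recovery Procedure Warning Low - CDB: 0x7f(0xb) = Write(32) - LBA: ~0x31cd4f400
B:12/18/23, 10:27:29 AM (10:27:29) 27588 1016 Drive returned unrecoverable media error - Shelf 0, Drawer 4, Bay 10
----> Sense 3/11/0 = Medium Error - Unrecovered read error - CDB: 0x7f(0x9) = Read(32) - LBA: ~0x5538bb000
"""

counts = count_drive_events(sample)
for (location, kind), n in counts.items():
    print(location, kind, n)
```

Grouping by drive location makes it easier to see whether repeated errors come from one drive or from several, which matters when judging how close a volume group is to its RAID tolerance.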

Solution

Upgrade the drive firmware as described in the Summary. Note that after the firmware upgrade, some media errors and failures are still possible until the controller completes a firmware media scan cycle, which may take up to 30 days. For more details on the E-Series media scan, see this article: What is Media Scan on E-Series storage systems?

Additional Information

See Bug # 1583382

In accordance with the Support Services terms, always update NetApp products with the latest version of firmware and software to provide the best reliability, availability, and serviceability:

Hot spare drives: To best maintain the continuous presence of hot spare drives available in the system, adhere to Hot Spares Best Practices and follow the standard drive replacement process if a drive fails.

Active IQ System Risk Detection:

For customers who have enabled AutoSupport on their storage systems, the Active IQ Portal provides detailed System Risk reports at the customer, site, and system levels. The reports show systems that have specific risks, along with severity levels and mitigation action plans. Drives that are not running the latest firmware are one example of such a risk. Not upgrading to the most current drive firmware could leave the storage appliance vulnerable to undesirable behavior.

Important: The purpose of this communication is for NetApp to notify its installed base end users about urgent and important product information that may affect product performance or reliability. The information contained herein and the distribution lists are NetApp confidential materials that are subject to restrictions on redistribution and that cannot be shared outside of this e-mail distribution list.

***************************************************
*** NETAPP CONFIDENTIAL – FOR LIMITED USE ONLY ***
***************************************************