SU555: [Impact: Critical] BMC firmware 15.11 increases stability

Last Updated:
2024/4/30 23:55:14

Summary

[Impact: Critical]

NetApp® has released new BMC firmware for several platforms to mitigate several conditions related to BMC communication. The updated firmware is available from the System Firmware & Diagnostics Download on the NetApp Support site. BMC firmware 15.11 includes a fix for critical issue BUG ID 1603545: Excessive writes to SPI Flash can reduce its service life on BMC 15.x. NetApp strongly recommends performing this upgrade as soon as possible to improve stability and to avoid potential disruption.

Affected platforms: AFF A250, AFF C250, ASA A250, ASA C250, and FAS500f.

Issue Description

Platforms running BMC firmware versions earlier than 15.11 are at risk of system disruptions caused by communication issues with the BMC. The following bugs are addressed in BMC firmware version 15.11:

  • BUG ID 1577174: With BMC firmware version 15.10, watchdog NMI triggers because the IPMI interface is congested
  • BUG ID 1603545: Excessive writes to SPI Flash can reduce its service life on BMC 15.x
  • BUG ID 1604141: Potential BMC SoC hang during u-boot procedure prevents automatic recovery via BMC reset
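
As an illustration of the exposure rule described above (an affected platform running BMC firmware earlier than 15.11), the following is a minimal Python sketch. The function names and the way the platform and version strings are obtained are assumptions made for this example; they are not part of any NetApp tool or API.

```python
# Hypothetical helper: flags systems that match the exposure rule in this bulletin.
# The platform list and the fixed version come from the bulletin; everything else
# (names, input format) is an assumption for illustration only.

AFFECTED_PLATFORMS = {"AFF A250", "AFF C250", "ASA A250", "ASA C250", "FAS500f"}


def parse_version(text):
    """Split a dotted version such as '15.10' into a tuple of ints: (15, 10)."""
    return tuple(int(part) for part in text.strip().split("."))


FIXED_VERSION = parse_version("15.11")


def needs_bmc_upgrade(platform, bmc_version):
    """Return True if the platform is affected and its BMC firmware predates 15.11."""
    return (
        platform in AFFECTED_PLATFORMS
        and parse_version(bmc_version) < FIXED_VERSION
    )
```

Versions are compared as integer tuples rather than as strings, because a plain string comparison would order a hypothetical "15.9" after "15.11".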

Symptom

  • BUG ID 1577174
    • Due to IPMI interface congestion, the BT interface of the BMC running firmware version 15.10 might not properly handle watchdog reset commands from ONTAP. As a result, ONTAP triggers a watchdog non-maskable interrupt (NMI). The following is an example of a panic string associated with this issue:

Panic String: watchdog nmi because IPMI interface congested. in process idle: cpu5 on release 9.12.1P7 (C)

  • BUG ID 1603545
    • Watchdog timer handling by the BMC causes excessive write operations to SPI flash, which significantly reduces its service life. If the SPI flash memory wears out, BMC operation is disrupted due to write failures.
  • BUG ID 1604141
    • Vendor errata indicate that a potential hang in the BMC's u-boot stage can disrupt BMC operation when the BMC reboots.
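
To check collected logs for the BUG ID 1577174 symptom, the following Python sketch scans text lines for the panic string quoted above. The pattern is derived only from the single example in this bulletin, and the function name and input format are hypothetical, so real panic messages might differ in wording.

```python
import re

# Pattern taken from the one example panic string in this bulletin.
# Case-insensitive matching is an assumption, in case capitalization varies.
PANIC_PATTERN = re.compile(
    r"watchdog nmi because IPMI interface congested", re.IGNORECASE
)


def find_ipmi_watchdog_panics(lines):
    """Return the lines that contain the IPMI-congestion watchdog panic string."""
    return [line.rstrip("\n") for line in lines if PANIC_PATTERN.search(line)]
```

The function accepts any iterable of strings, so it can be given an open log file or a list of lines already held in memory.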

Solution

Upgrade the BMC firmware to version 15.11, available from the System Firmware & Diagnostics Download on the NetApp Support site.

Additional Information

Active IQ System Risk Detection:

For customers who have enabled AutoSupport on their storage systems, the Active IQ Portal provides detailed System Risk reports at the customer, site, and system levels. The reports show systems that have specific risks, along with severity levels and mitigation action plans. Drives that are not running the latest firmware are one example of such a risk: not upgrading to the most current drive firmware could leave the storage appliance vulnerable to undesirable behavior.

Important: The purpose of this communication is for NetApp to notify its installed base end users about urgent and important product information that may affect product performance or reliability. The information contained herein and the distribution lists are NetApp confidential materials that are subject to restrictions on redistribution and that cannot be shared outside of this e-mail distribution list.

***************************************************
*** NETAPP CONFIDENTIAL – FOR LIMITED USE ONLY ***
***************************************************