SU521: [Impact High] Stability improvements for MetroCluster over IP on FAS2750, AFF-A220 and AFF-A250 platforms

Last Updated: 12/2/2022, 12:44:59 AM

Summary

[Impact High: Possible node disruption]

Several disruptive issues have been identified which may impact the stability of the MetroCluster over IP solutions on FAS2750, AFF-A220 and AFF-A250 platforms running ONTAP 9.8 or above in the event of certain network error conditions.

NetApp has worked to identify and resolve these issues and has recently delivered fixes which address them.

This bulletin contains details of these issues and the releases in which they are fixed. NetApp strongly recommends that customers with MetroCluster over IP solutions running ONTAP 9.8 or above consider upgrading to one of the releases documented in the Solution section of this bulletin.

Issue Description

Network error conditions that result in "link down" or other interruptions in network availability for the MetroCluster IP interfaces can result in unexpected node reboots. In extreme conditions, this may result in an interruption of service.

Situations known to trigger these disruptions include:

  • "Link down" events during switch reboot
  • Network re-cabling
  • Switch updates
  • RCF updates
  • Other network errors that result in "link down"

Symptom

Depending on the actual issue experienced, a panic string similar to one of the following may be reported:

Panic string: page fault (supervisor write data,page not present) on VA 0x18 cs:rip 0x20:0xffffffff80328245 rflags 0x10246

Panic string: BUG ON wqe == ((void *)0) failed at src/siw_qp_tx.c:1582 in process siw_irq_thread

Panic string: vm_fault: fault on nofault entry, addr: 0xfffffe0002058000 pc: 0xffffffff80a095d9 in process siw_sq2
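
As a rough aid for triage, the three example panic strings above can be turned into loose patterns and checked against a panic string taken from a node. The following is a minimal sketch only: the signature names and the function `match_panic_signature` are hypothetical, the patterns are derived solely from the examples in this bulletin, and a match is an indication rather than a diagnosis (a page fault panic in particular can have unrelated causes).

```python
import re

# Loose patterns built from the three example panic strings in this bulletin.
# Addresses and source line numbers are not pinned down, because the bulletin
# says real panic strings are only "similar to" the examples.
# The dictionary keys are hypothetical labels, not NetApp identifiers.
SIGNATURES = {
    "page_fault": re.compile(
        r"page fault \(supervisor write data,\s*page not present\)"
    ),
    "siw_bug_on": re.compile(
        r"BUG ON wqe == \(\(void \*\)0\) failed at src/siw_qp_tx\.c:\d+"
    ),
    "vm_fault_nofault": re.compile(
        r"vm_fault: fault on nofault entry.* in process siw_\w+"
    ),
}


def match_panic_signature(panic_string):
    """Return the label of the first matching signature, or None."""
    for name, pattern in SIGNATURES.items():
        if pattern.search(panic_string):
            return name
    return None
```

A string that returns `None` here may still be related to these issues; the Bug IDs and NetApp Support remain the authoritative reference.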

Solution

The issues described in this bulletin are (or will be) addressed in the following ONTAP releases:

  • For ONTAP 9.8, the fixes are available as of the 9.8P15 release, available here
  • For ONTAP 9.9.1, the fixes are available as of the 9.9.1P12 release, available here
  • For ONTAP 9.10.1, the fixes are available as of the 9.10.1P8 release, available here
  • For ONTAP 9.11.1, the fixes are available as of the 9.11.1P2 release; however, due to other unrelated issues, 9.11.1P4 (or later) is recommended. 9.11.1P4 is available here

Subsequent service updates and major ONTAP releases will also carry these fixes.
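
The release list above amounts to a minimum patch level per ONTAP release train, which can be checked mechanically. The sketch below assumes version strings of the form used in this bulletin (for example `9.8P15` or `9.11.1P4`); the names `MIN_FIXED_PATCH`, `parse_version` and `has_fixes` are hypothetical, other version formats (such as RC builds) are not handled, and for 9.11.1 the recommended P4 is used as the threshold rather than P2. Trains later than 9.11.1 are treated as fixed on the assumption that they fall under the "subsequent major ONTAP releases" mentioned above.

```python
import re

# Minimum patch level per release train, taken from this bulletin.
# 9.11.1 uses the recommended P4, not P2 (the first release with the fixes).
MIN_FIXED_PATCH = {
    (9, 8): 15,
    (9, 9, 1): 12,
    (9, 10, 1): 8,
    (9, 11, 1): 4,
}

# Matches strings such as "9.8P15", "9.11.1P4", or "9.12.1" (no patch suffix).
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)+)(?:P(\d+))?$")


def parse_version(version):
    """Split a version string into (train tuple, patch number)."""
    match = _VERSION_RE.match(version.strip())
    if not match:
        raise ValueError(f"unrecognized version format: {version!r}")
    train = tuple(int(part) for part in match.group(1).split("."))
    patch = int(match.group(2)) if match.group(2) else 0
    return train, patch


def has_fixes(version):
    """Return True/False for covered trains, None if the bulletin does not apply."""
    train, patch = parse_version(version)
    if train < (9, 8):
        return None  # the bulletin only covers ONTAP 9.8 or above
    if train in MIN_FIXED_PATCH:
        return patch >= MIN_FIXED_PATCH[train]
    if train > (9, 11, 1):
        return True  # assumption: later major releases carry the fixes
    return None  # a train not listed in this bulletin
```

For instance, `has_fixes("9.10.1P7")` reports that the release is below the fixed level, while `has_fixes("9.7P20")` returns `None` because that train is outside the scope of this bulletin. The result is a planning aid only and does not replace Upgrade Advisor.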

Customers are advised to use the Active IQ Upgrade Advisor tool when planning an upgrade to a fixed release.

Additional Information

The following Bug IDs give additional information regarding the issues described in this bulletin: