Sun Fire T1000/T2000 With Certain QGE and Dual-10GB PCI Cards Installed May Experience System Hangs due to Platform Specific Issue
| Category : | Availability |
| Release Phase : | Resolved |
| Product : | Sun Fire T2000 Server, Sun Fire T1000 Server |
| Bug Id : | 6471063, 6362138 |
| Date of Resolved Release : | 26-JAN-2007 |
Impact
Sun Fire T1000/T2000 systems with QGE (X4447A-Z, X4446A-Z) and Dual-10GB (X1027A-Z) PCI cards installed may experience system hangs due to a platform-specific issue. A system deadlock can occur because of an architectural limitation in Fire 2.0, where stalled PIO traffic can block DMA read returns. The stalled PIO traffic ultimately times out and causes a hang. Misdiagnosis of this issue may prompt unnecessary replacement of the cards listed above.
Contributing Factors
This issue can occur on the following platforms:
- Sun Fire T1000/T1000+ systems with firmware prior to 6.1.12 (without patch 122431-05)
- Sun Fire T1000/T1000+ systems with firmware prior to 6.3.0 (without patch 124751-01)
- Sun Fire T2000/T2000+ systems with firmware prior to 6.1.12 (without patch 122430-05)
- Sun Fire T2000/T2000+ systems with firmware prior to 6.3.0 (without patch 124750-01)
This issue, described by Bug ID 6362138 (Fire deadlock), may occur on T1000/T2000 platforms with the following PCI cards installed:
- QGE: X4447A-Z, X4446A-Z
- Dual 10GB: X1027A-Z
To determine the firmware revision on the system, the following command can be run from the System Controller:
sc> showhost
Sun-Fire-T2000 System Firmware 6.3.0 2006/11/10 06:46
Host flash versions:
Hypervisor 1.3.0 2006/11/10 06:35
OBP 4.25.0 2006/11/07 23:24
POST 4.25.0 2006/11/08 00:08
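The firmware check above can also be scripted once the `showhost` output has been captured as text. The following is a minimal sketch, not a Sun-supplied tool: the function names `parse_system_firmware` and `is_prior_to` are hypothetical, and it assumes the output has the same layout as the sample shown above. Versions are compared as tuples of integers, so 6.1.9 sorts before 6.1.12, which a plain string comparison would get wrong.

```python
import re

# Sample text copied from the showhost output shown above.
SAMPLE_SHOWHOST = """Sun-Fire-T2000 System Firmware 6.3.0 2006/11/10 06:46
Host flash versions:
Hypervisor 1.3.0 2006/11/10 06:35
OBP 4.25.0 2006/11/07 23:24
POST 4.25.0 2006/11/08 00:08
"""


def parse_system_firmware(showhost_output):
    """Return the System Firmware version as a tuple of ints, or None.

    Hypothetical helper: assumes the version follows the literal
    text "System Firmware", as in the sample output.
    """
    match = re.search(r"System Firmware (\d+(?:\.\d+)*)", showhost_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_prior_to(version, threshold):
    """True if version is strictly older than threshold (tuple comparison)."""
    return version < threshold


if __name__ == "__main__":
    version = parse_system_firmware(SAMPLE_SHOWHOST)
    print(version)                          # (6, 3, 0)
    print(is_prior_to(version, (6, 3, 0)))  # False
```

Which threshold applies (6.1.12 or 6.3.0) depends on the firmware line installed on the system, so the caller supplies it from the list in this section.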
Symptoms
The deadlock typically manifests as a system hang rather than a system panic. If the FMA error logs are examined afterwards (either with "fmdump -e" from Solaris after reboot or with "showfmerptlog1" from the SC prompt), completion timeout errors are seen, as in the following example:
# fmdump -e
TIME CLASS
Aug 04 2006 16:59:27.463403200 ereport.io.fire.pec.cto
Aug 04 2006 16:59:27.598156480 ereport.io.fire.pec.cto
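When many error reports are logged, it can help to filter the captured "fmdump -e" output for the completion timeout class. The sketch below is illustrative only: the function name `find_cto_events` is hypothetical, and it assumes each report line ends with the class name, as in the example above.

```python
# Error class seen in the example output above.
CTO_CLASS = "ereport.io.fire.pec.cto"

# Sample text copied from the fmdump -e example above.
SAMPLE_FMDUMP = """TIME CLASS
Aug 04 2006 16:59:27.463403200 ereport.io.fire.pec.cto
Aug 04 2006 16:59:27.598156480 ereport.io.fire.pec.cto
"""


def find_cto_events(fmdump_output):
    """Return the timestamps of all completion timeout reports.

    Hypothetical helper: treats the last whitespace-separated field
    of each line as the class and the rest as the timestamp.
    """
    timestamps = []
    for line in fmdump_output.splitlines():
        fields = line.split()
        if fields and fields[-1] == CTO_CLASS:
            timestamps.append(" ".join(fields[:-1]))
    return timestamps


if __name__ == "__main__":
    for stamp in find_cto_events(SAMPLE_FMDUMP):
        print(stamp)
```

An empty result does not rule out the issue, since the hang can prevent reports from being recorded; the firmware and patch levels remain the primary check.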
Workaround
There is no workaround for this issue. Please see the Resolution section below.
Resolution
This issue is addressed on the following platforms:
- Sun Fire T1000/T1000+ systems with patch 122431-05 or later
- Sun Fire T1000/T1000+ systems with patch 124751-01 or later
- Sun Fire T2000/T2000+ systems with patch 122430-05 or later
- Sun Fire T2000/T2000+ systems with patch 124750-01 or later
Note: The above patches should be applied to each T1000/T2000 system that uses the affected PCI cards (described in the Contributing Factors section).
Attachments
This solution has no attachment