On Sun Fire Servers, Using Dynamic Reconfiguration (DR) on System Boards With Permanent Memory May Fail Due to a Non-Responsive QLC Driver

Category: Availability
Release Phase: Resolved
Product: Sun Fire 12K Server, Sun Fire 3800 Server, Sun Fire 4800 Server, Sun Fire 4810 Server, Sun Fire 6800 Server, Sun Fire 15K Server
Bug Id: 4925218
Date of Workaround Release: 27-FEB-2004
Date of Resolved Release: 18-MAR-2004
Impact
On Sun Fire servers, a Dynamic Reconfiguration (DR) operation on a system board with permanent memory may fail to complete because of a non-responsive QLC driver.
Contributing Factors
This issue can occur in the following releases:
SPARC Platform
on the following platforms:
- Sun Fire 3800/4800/4810/6800 Servers
- Sun Fire 12K/15K
The described issue only occurs under the following conditions:
1. On systems with Host Bus Adaptors that use the QLC driver.
2. During DR operations (detaching a system board from the domain using "cfgadm -c unconfigure") on system boards with permanent memory.
To determine whether the QLC driver is loaded on a system, run the following command:
$ modinfo | grep qlc
46 10291f33 25ff0 153 1 qlc (Qlogic FCA Driver v0.xxx)
To determine which system board hosts permanent memory, run the following command as root:
# cfgadm -val | grep permanent
N0.SB0::memory connected configured ok
base address 0x0, 4194304 KBytes total, 720472 KBytes permanent
In the example above, SB0 is the system board hosting permanent memory.
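The check above can be scripted when several boards need to be inspected. The following is a minimal sketch, not part of the original alert: the function name `perm_mem_boards` is hypothetical, and it assumes the `cfgadm -val` output has the layout shown in the example (the memory attachment point followed by a "KBytes permanent" figure, on the same line or on the next one). It uses `sub()`, so on older Solaris releases `nawk` may be needed in place of `awk`.

```shell
#!/bin/sh
# Hypothetical helper: reads `cfgadm -val` style output on stdin and prints
# the name of each system board that reports a non-zero amount of
# permanent memory.
perm_mem_boards() {
    awk '
        # Remember the most recent memory attachment point, e.g. N0.SB0::memory
        $1 ~ /::memory$/ { ap = $1 }
        # The detail text carries "<n> KBytes permanent"
        /KBytes permanent/ {
            for (i = 3; i <= NF; i++)
                if ($i == "permanent" && $(i - 2) + 0 > 0) {
                    board = ap
                    sub(/::memory$/, "", board)   # N0.SB0::memory -> N0.SB0
                    sub(/^.*\./, "", board)       # N0.SB0 -> SB0
                    print board
                }
        }
    '
}
```

On a live domain this would be run as root as `cfgadm -val | perm_mem_boards`.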
Symptoms
Should the described issue occur, messages similar to the following are logged to the "/var/adm/messages" file:
<date> <system> qlc: [ID 499440 kern.warning] WARNING: qlc(1) Fail suspend busy 1 flags 1184c
<date> <system> sbdp: [ID 868600 kern.warning] WARNING: sbdp: failed to quiesce OS for copy-rename
<date> <system> sbd: [ID 761800 kern.warning] WARNING: sbd_detach_memory: failed to move memory from board 0 to board
Note: A "non-zero" value will be seen when this issue occurs.
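The qlc warning can also be searched for with a short script. This is a sketch under one stated assumption: that the "non-zero" value in the note refers to the number following "busy" in the qlc message, which the source does not say explicitly. The function name `qlc_busy_failures` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog-style lines on stdin and prints
# "<qlc instance> <busy value>" for each "Fail suspend busy" warning
# whose busy value is non-zero (assumed meaning of the note above).
qlc_busy_failures() {
    sed -n 's/.*qlc(\([0-9][0-9]*\)) Fail suspend busy \([0-9][0-9]*\).*/\1 \2/p' |
        awk '$2 + 0 != 0'
}
```

For example, `qlc_busy_failures < /var/adm/messages` would list each affected instance together with its busy value.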
Workaround
To work around the described issue, do the following:
1. Find the qlc instance which is exhibiting the failure:
# egrep "qlc.*Fail suspend busy" /var/adm/messages|tail -1
Oct 2 16:54:07 v4u-4800d-doma qlc: [ID 499440 kern.warning] WARNING: qlc(1) Fail suspend busy 1 flags 11844
In the above example, the qlc instance exhibiting the failure is "1".
2. Using the failing instance number, run grep "<instance> \"qlc\"" /etc/path_to_inst:
# grep "1 \"qlc\"" /etc/path_to_inst
"/ssm@0,0/pci@1d,600000/SUNW,qlc@1" 1 "qlc"
3. Using the pci nexus (the text corresponding to "pci@1d,600000" in step 2), run cfgadm -val|grep <pci nexus>:
# cfgadm -val|grep pci@1d,600000
N0.IB8::pci1 connected configured ok device /ssm@0,0/pci@1d,600000
4. Take the attachment point ID (in this example "N0.IB8::pci1") and run:
# cfgadm -c unconfigure N0.IB8::pci1
# cfgadm -c configure N0.IB8::pci1
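The lookups in steps 1 to 3 can be chained so that the attachment point is derived directly from the log. The following is a sketch only: the three function names are hypothetical, the file formats are assumed to match the examples above, and `awk -v` may require `nawk` on older Solaris releases. The final `cfgadm -c` commands are printed for review and are not executed.

```shell
#!/bin/sh
# Step 1 (hypothetical helper): most recent failing qlc instance.
# $1 = messages file, e.g. /var/adm/messages
last_failing_qlc() {
    sed -n 's/.*qlc(\([0-9][0-9]*\)) Fail suspend busy.*/\1/p' "$1" | tail -1
}

# Step 2 (hypothetical helper): pci nexus of a qlc instance.
# $1 = instance number, $2 = path_to_inst file, e.g. /etc/path_to_inst
qlc_pci_nexus() {
    awk -v inst="$1" '$2 == inst && $3 == "\"qlc\"" {
        n = split($1, part, "/")
        # Print the first path component that starts with pci@
        for (i = 1; i <= n; i++)
            if (part[i] ~ /^pci@/) { print part[i]; exit }
    }' "$2"
}

# Step 3 (hypothetical helper): attachment point ID for a pci nexus.
# $1 = pci nexus; reads `cfgadm -val` output on stdin
pci_attachment_point() {
    awk -v nexus="$1" 'index($0, nexus) { print $1; exit }'
}

# Possible usage as root (prints the step 4 commands instead of running them):
#   inst=$(last_failing_qlc /var/adm/messages)
#   nexus=$(qlc_pci_nexus "$inst" /etc/path_to_inst)
#   ap=$(cfgadm -val | pci_attachment_point "$nexus")
#   echo "cfgadm -c unconfigure $ap"
#   echo "cfgadm -c configure $ap"
```

Printing the commands rather than running them keeps the operator in control of the actual unconfigure and configure operations.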
Resolution
This issue is addressed in the following releases:
SPARC Platform
Modification History
Date: 18-MAR-2004
- State Resolved
- Updated Contributing Factors and Resolution sections
Attachments
This solution has no attachment