Certain Solaris SPARC Systems may Experience 95 Year Time Jumps
Category: Availability, Data Loss
Release Phase: Resolved
Bug Id: 6724580
Product: Solaris 8 Operating System, Solaris 9 Operating System, Solaris 10 Operating System, OpenSolaris
Date of Workaround Release: 27-Feb-2009
Date of Resolved Release: 22-Apr-2009
Certain Solaris SPARC systems may experience 95 year time jumps:
1. Impact
Due to an issue in the todm5819p_rmc driver, in very rare circumstances, certain SPARC systems (listed below) with a
particular motherboard Real Time Clock may be exposed to a system time
jump backwards of 2,985,657,855 seconds (approximately 95 years). This
may subsequently be followed by a jump forward to 00:00 January 1st
1970 UTC (the Unix epoch).
This may lead to file and database records having incorrect timestamps.
Datestamps that pre-date the Unix epoch may disrupt the affected system, including, but not limited to:
- inability to open files
- inability to run executables
- inability to provide certain network services
- inability to reboot system normally
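As a quick check of the figures above, the following sketch converts the 2,985,657,855 second offset into whole days and whole years. It assumes a shell with 64-bit integer arithmetic (the value exceeds 2^31), and the function names `offset_days` and `offset_years` are illustrative only.

```shell
#!/bin/sh
# Offset quoted in the Impact section, in seconds.
JUMP_SECONDS=2985657855

# Whole days contained in a number of seconds.
offset_days() {
    echo $(( $1 / 86400 ))
}

# Whole Gregorian years (365.2425 days = 31556952 seconds) in a number of seconds.
offset_years() {
    echo $(( $1 / 31556952 ))
}

echo "days:  $(offset_days "$JUMP_SECONDS")"
echo "years: $(offset_years "$JUMP_SECONDS")"
```

The offset works out to 34556 whole days, or about 94.6 years, which the alert rounds to 95. A system running in early 2009 would therefore land in mid 1914, in line with the dates shown in the Symptoms section.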
2. Contributing Factors
This issue can occur in the following releases:
SPARC Platform
- Solaris 8 without patch 117350-62
- Solaris 9 without patch 139384-01
- Solaris 10 without patch 139514-01
- OpenSolaris based upon builds snv_01 through snv_97
On the following systems:
- Sun Fire V125 Server
- Sun Fire V210 Server
- Sun Fire V240 Server
- Sun Fire V250 Server
- Sun Fire V440 Server
- Netra 210 Server
- Netra 240 Server
- Netra 440 Server
Note 1: OpenSolaris distributions may include additional bug fixes
above and beyond the build from which they were derived. To determine
the base build of OpenSolaris, use the following command:
$ uname -v
snv_86
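For scripted checks, the build string can be compared against the affected range. This is a minimal sketch, assuming a POSIX shell (for the `${var#prefix}` expansion) and a `uname -v` value of the form `snv_<number>`; the function name `snv_build_affected` is hypothetical. Because a distribution may carry extra fixes, a positive result only indicates that the base build is in the affected range.

```shell
#!/bin/sh
# Succeeds (exit 0) if the given build string is within snv_01..snv_97.
# Returns 2 for strings that are not of the form snv_<number>.
snv_build_affected() {
    case "$1" in
        snv_*) build=${1#snv_} ;;
        *) return 2 ;;
    esac
    case "$build" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$build" -ge 1 ] && [ "$build" -le 97 ]
}

# Usage on a live system (not run here):
#   snv_build_affected "$(uname -v)" && echo "base build is in the affected range"
```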
Note 2: To determine whether a system is one of the types listed above that uses the affected "todm5819p_rmc" Real Time Clock driver, run the following command:
$ modinfo | grep todm5819p_rmc
25 131f068 940 - 1 todm5819p_rmc (tod module for ALI M5819P)
If no output is shown, then the system is not using the affected todm5819p_rmc driver.
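The same check can be wrapped for use in a script. The sketch below reads `modinfo` output on standard input and reports only whether the driver name appears; the function name `uses_affected_rtc` is illustrative, and output is discarded with a redirect rather than a grep option so that it does not depend on a particular grep implementation.

```shell
#!/bin/sh
# Reads `modinfo` output on standard input and succeeds (exit 0) if the
# todm5819p_rmc module appears in it.
uses_affected_rtc() {
    grep 'todm5819p_rmc' >/dev/null
}

# Usage on a live system (not run here):
#   if modinfo | uses_affected_rtc; then echo "affected RTC driver loaded"; fi
```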
Note 3: Systems are only impacted by this issue if both of the
following conditions are met:
1. The system is configured as an NTP client, or software other than
NTP is making frequent adjustments to the system time. To determine if
a system is configured as an NTP client, use the "ntpq -c peers"
command (see ntpq(1M)) to list the peers.
For example:
$ ntpq -c peers
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
+mf-usan-01      nm-ubur-01.East  4 -   31   64  377     0.41   13.640   11.63
*mf-usan-02      nm-ubur-01.East  4 m   48   64  376     0.44   -7.088    2.85
If no peers are shown, the system is not configured as an NTP client.
2. The Real Time Clock is being trusted as a clock source by Solaris.
This is the default configuration. To determine if Solaris is using the
Real Time Clock, execute the following command:
$ egrep 'dosynctodr|tod_broken' /etc/system
set dosynctodr=0
set tod_broken=1
If the output includes these two lines, then the workaround for this issue has been applied. As such, Solaris is NOT using the Real Time Clock as a trusted clock source (see the workaround in section 4 below).
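The two conditions in Note 3 can be combined into one scripted check. This is a sketch under stated assumptions: `ntp_peer_count` and `workaround_configured` are hypothetical helper names, the peer count treats the first two lines of `ntpq -c peers` output as header lines (as in the example above), and the grep patterns assume a POSIX grep that understands `[[:space:]]`. It does not detect software other than NTP that adjusts the clock, and it inspects only the boot-time configuration in "/etc/system".

```shell
#!/bin/sh
# Counts peer lines in `ntpq -c peers` output read from standard input.
# The first two lines (column header and ===== rule) are skipped.
ntp_peer_count() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Succeeds if the given file sets both tod_broken=1 and dosynctodr=0,
# with or without spaces around "=". Lines starting with "*" are comments
# in /etc/system and do not match because the patterns are anchored.
workaround_configured() {
    grep '^[[:space:]]*set[[:space:]][[:space:]]*tod_broken[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1" >/dev/null &&
    grep '^[[:space:]]*set[[:space:]][[:space:]]*dosynctodr[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$1" >/dev/null
}

# Usage on a live system (not run here):
#   peers=$(ntpq -c peers 2>/dev/null | ntp_peer_count)
#   if [ "$peers" -gt 0 ] && ! workaround_configured /etc/system; then
#       echo "both conditions in Note 3 appear to be met"
#   fi
```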
3. Symptoms
If the described issue occurs, systems will initially show a time jump
backwards by 2,985,657,855 seconds relative to the present and the
date(1) command will show a year in the early 1900s:
$ date
Fri Jul 10 08:24:07 BST 1914
This jump will also be reflected in system and application logfiles and
database records. For example:
$ tail -3 /var/cron/log
CMD: [ -x /usr/lib/gss/gsscred_clean ] && /usr/lib/gss/gsscred_clean
root 10996 c Tue Apr 28 03:30:00 1914
root 10996 c Tue Apr 28 03:30:00 1914
This may also lead to various messages when utilities cannot cope with
the date:
"Value too large for defined data type"
ps(1) and mdb(1) will fail to run, giving the error:
"getexecname() failed"
Console error messages relating to network services may be seen:
May 17 02:34:23 v4u-v240a-gmp02 nis_cachemgr: [ID 105279 daemon.error] nis_cast: t_open: /dev/udp:Not enough space
On Solaris 10, running the SMF commands svcs(1) and svcadm(1M) while
the system time is prior to 00:00 January 1st 1970 UTC will cause the
SMF database to become inaccessible and produce the following errors:
svc_nonpersist.db: Value too large for defined data type
svcs: Could not bind to repository server: repository server unavailable. Exiting.
svc.configd exited with status 102 (database initialization failure)
svc.configd: Fatal error: /etc/svc/volatile/svc_nonpersist.db: integrity check failed. Details in /etc/svc/volatile/db_errors
Requesting System Maintenance Mode
See /lib/svc/share/README for more information.
A system reboot may generate the following message on the console:
"WARNING: Time-of-day chip unresponsive; dead batteries?"
This message is spurious and should be ignored; the batteries are not
dead.
This issue may also cause a system to jump to 00:00 January 1st 1970
UTC. When this occurs, the system utilities will function, but
timestamps will be incorrectly set in files, logs, databases, etc.
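Both symptoms can be told apart from the year alone. The following sketch classifies a four-digit year such as the one printed by `date -u +%Y`; the function name `classify_clock_year` and its three labels are illustrative, not part of any Solaris tool.

```shell
#!/bin/sh
# Classifies a four-digit year as reported by `date -u +%Y`.
classify_clock_year() {
    if [ "$1" -lt 1970 ]; then
        echo "pre-epoch"      # consistent with the ~95 year backward jump
    elif [ "$1" -eq 1970 ]; then
        echo "epoch"          # consistent with the jump to 1 January 1970
    else
        echo "post-epoch"     # the year alone does not indicate a jump
    fi
}

# Usage on a live system (not run here):
#   classify_clock_year "$(date -u +%Y)"
```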
4. Workaround
If this issue has occurred, the
date can be corrected using the date(1) command as shown in the example
below:
# date 021816122009
Wed Feb 18 16:12:00 GMT 2009
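The argument in the example above is in mmddHHMMccyy form (month, day, hour, minute, four-digit year). As a sketch, the helper below assembles that string from separate numeric fields; `date_set_arg` is a hypothetical name, and the fields should be passed as plain decimal numbers without leading zeros, since some printf implementations read a leading zero as octal.

```shell
#!/bin/sh
# Builds the mmddHHMMccyy argument for date(1) from numeric fields:
#   $1 month, $2 day, $3 hour, $4 minute, $5 four-digit year.
date_set_arg() {
    printf '%02d%02d%02d%02d%04d\n' "$1" "$2" "$3" "$4" "$5"
}

# 18 February 2009, 16:12, as in the example above.
date_set_arg 2 18 16 12 2009
```

This prints 021816122009, the same string passed to date(1) in the example.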
If production data has been corrupted with early datestamps, then the
affected application/database data must be restored from an unaffected backup.
Until patches can be applied, this issue may be avoided by performing both of the following steps:
A) Provide immediate relief by executing the following commands as root:
# echo 'tod_broken/W 1' | mdb -kw
# echo 'dosynctodr/W 0' | mdb -kw
NOTE: Extreme care must be taken when executing these commands, since any deviation from the required text will cause unpredictable changes which may be catastrophic to the system.
B) To make this change persistent across a reboot, add the following lines to the file "/etc/system":
set tod_broken = 1
set dosynctodr = 0
Note: These two lines can be safely removed once the fix patch has been applied. This is most efficiently carried out by installing the patch and then removing these two lines from "/etc/system" before performing the system reboot associated with the patch installation.
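Step B can be scripted so that the lines are not added twice if it is run again. This is a minimal sketch, not a supported procedure: `add_workaround_lines` is a hypothetical helper, it recognises only lines that match the text above exactly (a differently spaced but equivalent entry would not be detected), and it is advisable to back up the file first.

```shell
#!/bin/sh
# Appends the two settings from step B to the given file, skipping any
# line that is already present verbatim.
add_workaround_lines() {
    for line in 'set tod_broken = 1' 'set dosynctodr = 0'; do
        grep "^${line}\$" "$1" >/dev/null 2>&1 || echo "$line" >> "$1"
    done
}

# Usage as root on a live system (not run here):
#   cp /etc/system /etc/system.orig && add_workaround_lines /etc/system
```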
5. Resolution
This issue is addressed in the following releases:
SPARC Platform
- Solaris 8 with patch 117350-62 or later
- Solaris 9 with patch 139384-01 or later
- Solaris 10 with patch 139514-01 or later
- OpenSolaris based upon builds snv_98 or later
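To check a Solaris 8, 9 or 10 system against the patch levels above, the sketch below maps a `uname -r` value to the listed patch and compares patch revisions. The helper names `required_patch` and `patch_at_least` are hypothetical, a POSIX shell is assumed, and the usage comment assumes that `showrev -p` prints lines beginning with "Patch: <id>". It compares revisions of the same patch ID only, so it will not recognise a differently numbered patch that obsoletes one of those listed.

```shell
#!/bin/sh
# Prints the minimum patch listed in the Resolution section for a given
# `uname -r` value, or nothing if the release is not listed.
required_patch() {
    case "$1" in
        5.8)  echo 117350-62 ;;
        5.9)  echo 139384-01 ;;
        5.10) echo 139514-01 ;;
    esac
}

# Succeeds if installed patch $2 has the same base ID as required patch $1
# and an equal or higher revision.
patch_at_least() {
    [ "${1%-*}" = "${2%-*}" ] && [ "${2#*-}" -ge "${1#*-}" ]
}

# Usage on a live system (not run here):
#   need=$(required_patch "$(uname -r)")
#   for p in $(showrev -p | awk '$1 == "Patch:" { print $2 }'); do
#       patch_at_least "$need" "$p" && echo "fix present: $p"
#   done
```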
This Sun Alert notification is being provided to you on an "AS IS" basis.
This Sun Alert notification may contain information provided by
third parties. The issues described in this Sun Alert notification may
or may not impact your system(s). Sun makes no representations,
warranties, or guarantees as to the information contained herein. ANY
AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU
ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT
OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This
Sun Alert notification contains Sun proprietary and confidential
information. It is being provided to you pursuant to the provisions of
your agreement to purchase services from Sun, or, if you do not have
such an agreement, the Sun.com Terms of Use. This Sun Alert
notification may only be used for the purposes contemplated by these
agreements.
Copyright 2000-2009 Sun Microsystems, Inc., 4150 Network Circle,
Santa Clara, CA 95054 U.S.A. All rights reserved.
Modification History
23-Apr-2009: Updated the Impact section.
02-Apr-2009: Updated the Workaround section.
22-Apr-2009: Updated Contributing Factors and Resolution sections. Resolved.