Monday, July 4, 2011

What every programmer should know about time

SkyHi @ Monday, July 04, 2011
Some notes about time:

UTC: The time at zero degrees longitude (the Prime Meridian) is called Coordinated Universal Time (UTC).
GMT: UTC is the successor to Greenwich Mean Time (GMT), so named because the Prime Meridian was (arbitrarily) chosen to pass through the Royal Observatory in Greenwich.
Other timezones can be written as an offset from UTC. Australian Eastern Standard Time (AEST) is UTC+10:00, e.g. 10:00 UTC is 20:00 AEST on the same day.
Daylight saving does not affect UTC. It's just a polity deciding to change its timezone (offset from UTC). For example, GMT is still used: it's the British national timezone in winter. In summer it becomes BST.
Leap seconds: By international convention, UTC (which is an arbitrary human invention) is kept within 0.9 seconds of physical reality (UT1, which is a measure of solar time) by introducing a "leap second" in the last minute of the UTC year, or in the last minute of June.
Leap seconds don't have to be announced much more than six months before they happen. This is a problem if you need second-accurate planning beyond six months.
Unix time: Measured as the number of seconds since epoch (the beginning of 1970 in UTC). Unix time is not affected by time zones or daylight saving.
According to POSIX.1, Unix time is supposed to handle a leap second by replaying the previous second. e.g.:

59.00 ← replay
00.00 ← increment

This is a trade-off: you can't represent a leap second, and your time is guaranteed to go backwards. On the other hand, every day is exactly 86,400 seconds long, and you don't need a table of all previous and future leap seconds in order to format Unix time as human-preferred hours-minutes-seconds.
ntpd is supposed to make the replay happen after it sees the "leap bits" from upstream timeservers, but I've also seen it do nothing: the system goes one second into the future, then slowly slews back to the correct time.
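
The trade-off above can be checked with a short Python sketch (Python is my choice for illustration, nothing in these notes depends on it). It assumes the leap second that was inserted at the end of 31 December 2008, and it uses the standard-library calendar.timegm(), which does plain POSIX-style arithmetic with no leap-second table.

```python
import calendar

# Last ordinary second of 2008 and first second of 2009, both in UTC.
# A leap second (23:59:60) was inserted between these two instants.
before = calendar.timegm((2008, 12, 31, 23, 59, 59))
after = calendar.timegm((2009, 1, 1, 0, 0, 0))

# Unix time has no slot for 23:59:60, so the two values are adjacent.
gap = after - before

# Every Unix day is exactly 86,400 seconds, so midnight divides evenly.
days_since_epoch, remainder = divmod(after, 86400)

print(before, after, gap, days_since_epoch, remainder)
```

Two real seconds elapsed between those instants, but Unix time only counts one. That is the price of never needing a leap-second table to turn a timestamp into a date.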

What every programmer should know about time:

Timezones are a presentation-layer problem!
Most of your code shouldn't be dealing with timezones or local time; it should be passing Unix time around.
When measuring time, measure Unix time. It's UTC. It's easy to obtain. It doesn't have timezone offsets or daylight saving (or leap seconds).
When storing time, store Unix time. It's a single number.
If you want to store a human-readable time (e.g. logs), consider storing it along with Unix time, not instead of Unix time.
When displaying time, always include the timezone offset. A time format without an offset is useless.
The system clock is inaccurate.
You're on a network? Every other system's clock is differently inaccurate.
The system clock can, and will, jump backwards and forwards in time due to things outside of your control. Your program should be designed to survive this.
The number of [clock] seconds per [real] second is both inaccurate and variable. It mostly varies with temperature.
Don't blindly use gettimeofday(). It reports the system clock, which can jump. If you need a monotonic clock (one that never goes backwards), have a look at clock_gettime() with CLOCK_MONOTONIC.
ntpd can change the system time in two ways:
Step: making the clock jump backwards or forwards to the correct time instantaneously.
Slew: changing the frequency of the clock so that it slowly drifts toward the correct time.
Slew is preferred because it's less disruptive, but it's only useful for correcting small offsets.
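
As a rough illustration of the monotonic-clock advice, here is a minimal Python sketch. The measure() helper is a made-up name for this example. Python's time.monotonic() is documented as a clock that cannot go backwards and is not affected by system clock updates; on Linux it is typically backed by clock_gettime(CLOCK_MONOTONIC), though that detail is platform-dependent.

```python
import time

def measure(work):
    """Call work() and return (result, elapsed_seconds)."""
    # Monotonic clock: a step of the system clock cannot make this
    # difference negative or wildly wrong.
    start = time.monotonic()
    result = work()
    elapsed = time.monotonic() - start
    return result, elapsed

result, elapsed = measure(lambda: sum(range(100_000)))

# Unix time is still the right thing for *timestamps* (when did it
# happen), just not for *durations* (how long did it take).
finished_at = time.time()

print(result, elapsed, finished_at)
```

Note that a monotonic clock only protects against steps. Its rate may still be slewed, so it is no more accurate than the hardware underneath it.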

Special mentions:

Time passes at a rate of one second per second for every observer. The frequency of a remote clock relative to an observer is affected by velocity and gravity. The clocks inside GPS satellites are adjusted for relativistic effects.
MySQL (at least 4.x and 5.x) reads and writes DATETIME columns as a "YYYY-MM-DD HH:MM:SS" string with no timezone offset. I'm not even kidding. If you care at all about storing timestamps, store them as integers and use the UNIX_TIMESTAMP() and FROM_UNIXTIME() functions.
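
The "store an integer, add the timezone only when displaying" idea can be sketched in Python rather than SQL. The format_with_offset() helper and the fixed +10 hour offset are assumptions made for this example; a real application would look up the user's timezone rules, including daylight saving.

```python
from datetime import datetime, timedelta, timezone

def format_with_offset(unix_seconds, offset_hours):
    """Render Unix time as ISO 8601 text with an explicit UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return datetime.fromtimestamp(unix_seconds, tz).isoformat()

# What gets stored: a single integer (2011-07-04 10:00:00 UTC).
stored = 1309773600

# What gets displayed: the same instant, rendered per viewer.
as_utc = format_with_offset(stored, 0)
as_aest = format_with_offset(stored, 10)

print(as_utc)   # 2011-07-04T10:00:00+00:00
print(as_aest)  # 2011-07-04T20:00:00+10:00
```

Both strings name the same instant. Because each carries its offset, either one can be converted back to the stored integer without guessing.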