The 99th Installment
Expressing the Date and Time
by Hisashi Koyama,
Professor, Master Program of Information Systems Architecture
In our daily lives, we constantly deal with the concept of date and time, such as “Tuesday, April 3, 2018, 11:00”. Because we use dates and times every day, we are rarely confused by them, but the concept is actually quite complex.
The Western calendar, which takes the year following the year in which Christ is said to have been born as its first year, is the prevailing chronological system in many countries. The Western calendar is commonly used in Japan, but Japanese public institutions generally use the Japanese calendar, so both are in everyday use. The Japanese calendar is Japan’s unique chronological system, which expresses the year as an era name, such as Heisei, followed by the number of years within that era. Currently, the era changes only when a new emperor succeeds to the throne, but in the past it was changed for various reasons. The first era name was Taika, known from the Taika Reforms, and Heisei is commonly counted as the 247th era name, although there were some overlaps during the Namboku-cho period (1336–1392) and some periods without an era name.
Because both the Western and Japanese calendars currently follow the same calendar method, the Gregorian calendar (the so-called new calendar), converting between them is simple. However, because the lunisolar calendar, the sexagenary cycle, and old calendars such as the Julian calendar are sometimes used or have been used, it is necessary to be aware of date deviations when considering historical events. In addition, a purely lunar calendar is still used as the Islamic calendar.
For both the Western and Japanese calendars, seconds and minutes are counted in base 60, hours in base 12 or 24, and months in base 12. A month has 30 or 31 days, except February, which has 28 days, or 29 in leap years. Incidentally, when I talk about binary, octal, and hexadecimal notation in the basic information course, many people are puzzled because they are so used to the decimal system in their daily lives, but this puzzlement largely subsides if I use the expression of date and time as an example. I also mention the anecdote that the sexagesimal system (base 60) was first used around 3000 BCE by the Sumerians, a people said to have emerged suddenly at the dawn of Mesopotamian civilization.
For time, there is Coordinated Universal Time (UTC), the successor to Greenwich Mean Time, and standard time is set by region. Currently, there are 39 standard time zones, each indicated by its offset from UTC. Japan Standard Time (JST) is UTC+9 and Australian Eastern Standard Time (AEST) is UTC+10, so Japan is one hour behind Sydney. Daylight saving time is observed during the summer months in some regions, and its rules also vary from region to region. Because Australia is in the Southern Hemisphere, Sydney observes daylight saving time (UTC+11) until the first Sunday in April, and during that period Japan is two hours behind.
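The offset arithmetic above can be sketched in Python. This is a minimal illustration that assumes fixed offsets of +9, +10, and +11 hours; the constant names and the helper hours_behind are hypothetical, and real code would normally rely on a time zone database rather than hard-coded offsets.

```python
from datetime import datetime, timedelta, timezone

# Hard-coded offsets from UTC (an assumption for illustration only).
JST = timezone(timedelta(hours=9), "JST")
SYDNEY_STANDARD = timezone(timedelta(hours=10), "AEST")
SYDNEY_DAYLIGHT = timezone(timedelta(hours=11), "UTC+11")

def hours_behind(local, other):
    """Return how many hours `local` is behind `other` for the same instant."""
    return (other.utcoffset() - local.utcoffset()) / timedelta(hours=1)

# An instant after Sydney's daylight saving time has ended.
april = datetime(2018, 4, 3, 2, 0, tzinfo=timezone.utc)
tokyo_april = april.astimezone(JST)
sydney_april = april.astimezone(SYDNEY_STANDARD)

# An instant during the southern summer, when Sydney is on UTC+11.
january = datetime(2018, 1, 15, 2, 0, tzinfo=timezone.utc)
tokyo_january = january.astimezone(JST)
sydney_january = january.astimezone(SYDNEY_DAYLIGHT)

print(tokyo_april.isoformat(), sydney_april.isoformat())
print(hours_behind(tokyo_april, sydney_april))      # 1.0
print(hours_behind(tokyo_january, sydney_january))  # 2.0
```

All four local values describe only two instants; the wall-clock difference comes entirely from the offsets.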
When written out like this, the dates and times that we use in our daily lives also seem complicated in many ways. Because the handling of date and time is indispensable for current information systems, next I would like to talk about date and time in the IT field.
The issue of how to handle the year 2000, called the Y2K problem, rose to prominence 20 years ago in the IT field. In old software, it was common practice to abbreviate a year such as “1998” to the two digits “98” in order to save money, and as a result the year 2000 was mistakenly treated as “00”, “1900”, or “19100”. By the year 2000, IT that supported social infrastructure was already widespread, and although there was a sense of crisis over possible malfunctions of IT equipment, the preventive measures were successful and major problems were avoided. Perl, which was popular for web CGI at that time, was born in 1987 and was still a young programming language. However, its localtime function returns the year as an offset from 1900, a specification that leads to the Year 2000 problem when misused. To draw attention to such misuse, an article entitled “Year 2000 Compliance: Lawyers, Liars, and Perl”, which asserted that Perl itself was Year 2000 compliant, was published on the official Perl website.
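How a value like “19100” can arise is easy to reproduce. The following is a minimal sketch in Python rather than Perl; the function names are hypothetical, and the "years since 1900" value imitates what localtime-style interfaces in C and Perl return.

```python
def format_year_buggy(year):
    """Mimic the classic mistake: prepend '19' to a years-since-1900 value."""
    years_since_1900 = year - 1900  # what a localtime-style API reports
    return "19" + str(years_since_1900)

def format_year_fixed(year):
    """Correct use of the same value: add 1900 arithmetically."""
    years_since_1900 = year - 1900
    return str(1900 + years_since_1900)

def two_digit_year(year):
    """Two-digit abbreviation used by old software."""
    return f"{year % 100:02d}"

print(format_year_buggy(1998))  # 1998 (looks fine until the century changes)
print(format_year_buggy(2000))  # 19100
print(format_year_fixed(2000))  # 2000
# With two digits, the year 2000 sorts before 1998.
print(two_digit_year(2000) < two_digit_year(1998))  # True
```

The buggy version passes every test written before 2000, which is part of why the problem stayed hidden for so long.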
Let’s continue with the topic of how date and time are handled inside software. In Unix-based operating systems and many programming languages, the current time can be obtained as the number of seconds that have elapsed since the epoch. The epoch is January 1, 1970 00:00 UTC, and the elapsed time was 1522720800 seconds as of Tuesday, April 3, 2018 11:00 JST. The data type that stores these elapsed seconds is time_t, which used to be a 32-bit signed integer (signed int), so it could overflow and cause a malfunction 20 years from now, in 2038. This is known as the Year 2038 problem. Although recent operating systems and language implementations define time_t as a 64-bit signed integer, not all of the software currently in use has addressed this problem.
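The figures above can be checked with a short Python sketch. Python integers do not overflow, so the helper wrap_int32 (a hypothetical name) simulates two's complement wraparound, which is the behavior typically observed on 32-bit systems, although it is not guaranteed by the C language.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
JST = timezone(timedelta(hours=9))
INT32_MAX = 2**31 - 1

def from_epoch_seconds(seconds):
    """Convert seconds since the epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)

def wrap_int32(value):
    """Simulate storing `value` in a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31

# The value quoted in the text, shown in Japan Standard Time.
article_time = from_epoch_seconds(1522720800).astimezone(JST)
print(article_time.isoformat())    # 2018-04-03T11:00:00+09:00

# The last second a 32-bit time_t can represent.
last_ok = from_epoch_seconds(INT32_MAX)
print(last_ok.isoformat())         # 2038-01-19T03:14:07+00:00

# One second later, the simulated counter wraps to a large negative number.
after_overflow = from_epoch_seconds(wrap_int32(INT32_MAX + 1))
print(after_overflow.isoformat())  # 1901-12-13T20:45:52+00:00
```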
Expressing the date and time as the number of seconds elapsed since the epoch is extremely simple. To check the elapsed time of something, you only need to subtract one value from another. Displaying such a value as a date and time is not so easy, however, so each programming language provides several methods for it. In Python, the number of seconds since the epoch can be retrieved as a float with the time.time() function, and it is usually handled as a time.struct_time or datetime.datetime, which holds the information in units of year, month, day, hour, minute, second, and so on. In addition, software often converts date and time information into a character string such as “Tuesday, 04/03/2018 11:00” at the input/output stage. There is also the ISO 8601 format, which joins the date and time with a T as follows, but it is not widely used in everyday life. The former is the basic format, and the latter is the extended format. For UTC, a Z is added at the end, and for other time zones, an offset such as +0900 (basic) or +09:00 (extended) is added at the end.
20180403T110000Z
2018-04-03T20:00:00+09:00
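The two strings above denote the same instant, which can be confirmed with a minimal Python sketch. It assumes Python 3.7 or later for datetime.fromisoformat, and it parses the basic format with an explicit pattern because older versions of fromisoformat do not accept it.

```python
from datetime import datetime, timedelta, timezone

basic = "20180403T110000Z"
extended = "2018-04-03T20:00:00+09:00"

# Basic format: match the literal T and Z, then mark the result as UTC.
dt_basic = datetime.strptime(basic, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)

# Extended format with a numeric offset.
dt_extended = datetime.fromisoformat(extended)

# Aware datetimes compare by instant, not by wall-clock digits.
print(dt_basic == dt_extended)  # True

# Converting each value into the other notation.
JST = timezone(timedelta(hours=9))
basic_as_extended = dt_basic.astimezone(JST).isoformat()
extended_as_basic = dt_extended.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
print(basic_as_extended)  # 2018-04-03T20:00:00+09:00
print(extended_as_basic)  # 20180403T110000Z
```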
Although ISO 8601 can be enforced for software-to-software input/output, where users are involved it is necessary to use a format that is familiar from everyday life. A variety of date formats are in use.
2018/04/03
04/03/2018
03/04/2018
As seen above, the order of the date elements varies: year/month/day in ISO and Japan, month/day/year in the US, and day/month/year in the UK.
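The practical consequence is that a string such as 03/04/2018 cannot be interpreted without knowing which convention produced it. A minimal Python sketch, with variable names chosen only for illustration:

```python
from datetime import datetime

text = "03/04/2018"

# The same characters, read under two conventions.
as_us = datetime.strptime(text, "%m/%d/%Y").date()  # month/day/year
as_uk = datetime.strptime(text, "%d/%m/%Y").date()  # day/month/year

print(as_us.isoformat())      # 2018-03-04
print(as_uk.isoformat())      # 2018-04-03
print((as_uk - as_us).days)   # 30
```

Neither parse raises an error, so the mistake goes unnoticed unless the day happens to be greater than 12.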
20180403
2018-04-03
2018/04/03
2018/4/3
2018 April 3
April 3rd, 2018
As shown above, there are many choices in Excel value formatting, such as using various delimiters, adding the day of the week, or adding AM and PM to the time. Of course, there are various functions for producing these formats, and in addition to regional time zones it is necessary to take locales into account, which add another degree of freedom.
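Several of these formats can be produced from a single value in Python. The sketch below is illustrative only: the dictionary keys are hypothetical labels, and the month and weekday names assume the default C locale, so they change if the program switches locale.

```python
from datetime import datetime

dt = datetime(2018, 4, 3, 11, 0)

formatted = {
    "compact": dt.strftime("%Y%m%d"),
    "hyphen": dt.strftime("%Y-%m-%d"),
    "slash": dt.strftime("%Y/%m/%d"),
    # strftime always zero-pads, so build the unpadded form by hand.
    "slash_unpadded": f"{dt.year}/{dt.month}/{dt.day}",
    "month_name": f"{dt.year} {dt:%B} {dt.day}",
    "weekday_and_ampm": dt.strftime("%A %I:%M %p"),
}

for label, text in formatted.items():
    print(f"{label}: {text}")
```

In the default locale the last entry prints as Tuesday 11:00 AM.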
Date and time information may be stored in the following four types of data in a Python script.
• str type that expresses the date in a specific format
• float type for the elapsed time from an epoch
• datetime type unique to Python
• struct_time type unique to Python
Although not all of them are orthogonal, there are functions for mutual conversion, so the issue can be resolved just by understanding the characteristics of these data types and being aware of which type each variable holds. However, there are many scripts that work only by miracle, after much trial and error.
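One possible round trip through all four types is sketched below. It assumes that the input string is in UTC and uses only standard library calls; the variable names are illustrative.

```python
import calendar
import time
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"
text = "2018-04-03 02:00:00"  # assumed to be UTC

# str -> datetime (attach UTC so that the value is unambiguous)
dt = datetime.strptime(text, FORMAT).replace(tzinfo=timezone.utc)

# datetime -> float seconds since the epoch
seconds = dt.timestamp()

# float -> struct_time in UTC
st = time.gmtime(seconds)

# struct_time -> integer seconds since the epoch (inverse of gmtime)
seconds_again = calendar.timegm(st)

# seconds -> datetime -> str
dt_again = datetime.fromtimestamp(seconds_again, tz=timezone.utc)
text_again = dt_again.strftime(FORMAT)

print(seconds)                      # 1522720800.0
print(st.tm_year, st.tm_mon, st.tm_mday, st.tm_hour)
print(text_again == text)           # True
```

Keeping every intermediate value in UTC and converting to local time only for display avoids most of the confusion described above.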
IT engineers, including programmers, need common sense and everyday knowledge in addition to IT-related knowledge and skills such as programming language grammar. In particular, programmers often lack an understanding of date and time expressions and of character expressions such as character codes, even though they are the ones who should understand them best. It should be assumed that current software will be used in multiple regions.