The 120th Installment
Uncertainties
by Hiroshi Hashimoto,
Dean and Professor
In many fields, uncertainties are modeled as random variables. For this reason, probability theory appears across a wide range of disciplines; it is used, for example, in statistics, system control engineering, financial engineering, and bioengineering. There are many episodes in which probability theory has come into the limelight. Let me introduce a few of them.
The first is the Kalman filter [1]. NASA was struggling to improve the accuracy of orbit estimation for its manned lunar spacecraft because of noise superimposed on the observed data; it adopted the Kalman filter to solve this problem, and the filter proved effective.
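To give a feel for the idea in its simplest form, here is a minimal sketch (not NASA's actual implementation) of a one-dimensional Kalman filter that estimates a constant value from noisy measurements; the noise levels and data below are invented for the example.

```python
import numpy as np

# Minimal 1-D Kalman filter: estimate a constant true value from
# noisy measurements by repeatedly blending prediction and observation.
def kalman_1d(measurements, r=1.0, q=1e-4):
    x, p = 0.0, 1.0                 # initial state estimate and its variance
    estimates = []
    for z in measurements:
        p += q                      # predict: variance grows by process noise q
        k = p / (p + r)             # Kalman gain weighs prediction vs. measurement
        x += k * (z - x)            # update the estimate toward the measurement
        p *= (1 - k)                # updated (reduced) estimate variance
        estimates.append(x)
    return estimates

rng = np.random.default_rng(0)
noisy = 5.0 + rng.normal(0.0, 1.0, 50)   # synthetic noisy observations of the value 5
print(kalman_1d(noisy)[-1])              # final estimate converges near 5
```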
The second is financial engineering, which deals with stock prices and exchange rates and applies engineering methods to the management of assets; it established itself on the basis of stochastic differential equations. Stochastic differential equations grew out of the pioneering research of Kiyoshi Ito (Japan, 1915-2008) in “Probability Theory” (1952) and of Joseph L. Doob (USA, 1910-2004) in “Stochastic Processes” (a book so difficult to understand that it was nicknamed the “yellow peril”; it caused me much distress). After Myron S. Scholes (Canada/US, 1941-) published the Black-Scholes equation (with Fischer Black), he and Robert C. Merton (US, 1944-) developed the theory of the equation further and established a new model for pricing derivatives. This achievement earned both of them the 1997 Nobel Prize in Economics. Because the theory rested on Kiyoshi Ito's work, when Scholes met Ito he immediately shook his hand and praised his theory. The ironic postscript is that Long-Term Capital Management (LTCM), a huge hedge fund with Scholes and Merton on its management team, collapsed in 1998 after incurring enormous losses.
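For readers curious what the resulting pricing model looks like in practice, the following is a small illustrative sketch of the standard Black-Scholes closed-form price for a European call option; it is not taken from any of the works cited above, and the input values are arbitrary examples.

```python
from math import log, sqrt, exp
from statistics import NormalDist

# Black-Scholes price of a European call option (no dividends).
# S: spot price, K: strike, T: time to maturity in years,
# r: risk-free interest rate, sigma: volatility of the underlying.
def black_scholes_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf            # standard normal cumulative distribution
    return S * N(d1) - K * exp(-r * T) * N(d2)

# Example with made-up parameters: at-the-money call, one year to maturity.
print(black_scholes_call(S=100, K=100, T=1.0, r=0.02, sigma=0.2))
```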
The third is Bayesian statistics, which has recently been in the headlines through its introduction into big-data processing. It is characterized by the fact that it can be applied effectively to both small and large amounts of data, that it uses conditional probability (or conditional probability distributions) while also allowing subjective prior beliefs to be incorporated, and that it sequentially updates the estimated distribution (rather than parameter values or confidence intervals, as in conventional statistics) with each new observation. Bayesian methods attracted attention when they were used to locate a sunken American submarine [2]. Incidentally, they were also used for manhunts in the US TV drama Numb3rs (Season 1). Bayesian statistics did not see the light of day for some time, but as computing power improved, its computational models evolved and it came to be used in a wide range of fields, including time-series analysis (speech analysis is a prime example), network transaction analysis, and marketing analysis.
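As a concrete, deliberately tiny illustration of that sequential updating, here is a sketch using a Beta prior for the bias of a coin; the observation sequence is made up, and it has no connection to the submarine search in [2].

```python
# Sequential Bayesian updating of a coin's bias with a Beta prior.
# Each observation (1 = heads, 0 = tails) updates the whole posterior
# distribution, not just a point estimate of the parameter.
alpha, beta = 1.0, 1.0              # uniform Beta(1, 1) prior
for obs in [1, 0, 1, 1, 0, 1]:      # hypothetical observations, one at a time
    alpha += obs                    # count of heads so far (plus prior)
    beta += 1 - obs                 # count of tails so far (plus prior)
    mean = alpha / (alpha + beta)   # posterior mean of the bias
    print(f"posterior Beta({alpha:.0f}, {beta:.0f}), mean = {mean:.2f}")
```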
All of the above are examples of treating uncertainty as a random variable. Albert Einstein (1879-1955, the German theoretical physicist famous for his theory of relativity) questioned this approach, saying of quantum mechanics that “God does not play dice with the universe.” Quantum mechanics introduced probability theory in order to treat the microscopic physical phenomena of atoms and electrons as uncertain (today we know that matter can be resolved still further, into elementary particles). What Einstein seems to have been asking is whether it is a proper attitude for a researcher to treat every state or variable that is not yet understood as merely probabilistic. He expressed this by saying that God, who can see into the future, does not play dice on unpredictable outcomes.
I feel that deep learning is doing something similar. Deep learning is not a newly invented method but a learning method based on neural networks. Examples that have made it famous include “writing literary texts [3]” and “generating images that make the Mona Lisa smile [4]”. Some reports [5] now claim that medical diagnostic imaging has surpassed humans. A huge amount of good training data is needed to give deep learning this kind of cleverness; in other words, feeding it bad training data produces a network that misrecognizes. In fact, a great deal of research has gone into finding the weaknesses of deep learning, and in the area known as “adversarial examples” (roughly, “inputs crafted to induce misrecognition”) there are many reports, such as the case in which an image of a panda was classified with high confidence as a gibbon after a small amount of noise was added [6]. One reason for such misrecognition is that deep learning takes a black-box approach: it tries to somehow match the data, uncertainties included, to the training labels without analyzing the data itself.
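To show how such adversarial examples are produced, here is a minimal sketch of the fast gradient sign method described in [6], written against PyTorch; the model, input tensor, and epsilon value are placeholders, not a reproduction of the panda experiment.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon=0.01):
    """Perturb input x in the direction that increases the classification loss,
    changing each pixel by at most epsilon (fast gradient sign method)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)   # loss with respect to the true label
    loss.backward()                           # gradient of the loss w.r.t. the input
    x_adv = x + epsilon * x.grad.sign()       # one signed-gradient step
    return x_adv.clamp(0.0, 1.0).detach()     # keep pixel values in [0, 1]

# Hypothetical usage: `model` is any trained image classifier,
# `image` a (1, 3, H, W) tensor scaled to [0, 1], `label` its true class index.
# adversarial = fgsm_attack(model, image, label, epsilon=0.007)
```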
Einstein's words above suggest the idea that studies neglecting analysis are inadequate, and that even for a single particle, a solid model should be constructed to eliminate uncertainty as much as possible.
So, is it absolutely correct to represent all phenomena with a rigorous model? Pursuing this question leads us to Laplace's demon. My interpretation of that phrase is that if all phenomena could be modeled and simulated down to the atomic level, there would be no uncertain future and the future could be predicted perfectly. Of course, this idea is now scientifically rejected. But then, how rigorous should a “rigorous” model be? I would like to hear the thoughts of readers.