On weekdays, the world looks like a well-managed railroad: things run according to laws that we humans understand and can apply concretely. We can accept the occasional late train as an exception to those laws. But sometimes the world looks more like a multi-car pileup on the highway. The pileup obeys the same physical and social laws as an ordinary day, but there are too many moving parts for us to predict or explain the details of each collision, the details that leave one car with minor dents and send another up in a fireball.
[Image: Trains moving along the track in an orderly fashion]
The same characteristics appear in something as mundane as a walk along a leaf-strewn path on an autumn day. Both are events in which the interdependence of countless details overwhelms the explanatory power of the rules that govern them. All we can do, it seems, is resent or marvel at the outcome as it emerges. Now our newest paradigmatic technology, machine learning, may be revealing that the everyday world is governed more by chance than by rules. If so, it is because machine learning can step outside the patterns of human perception and instead derive regularities that we can neither understand nor apply.
[Image: Machine learning concept map]
The opacity of machine learning systems raises serious concerns about their trustworthiness and their tendency toward bias. But the fact that they work at all may give us a whole new understanding of what the world is and what role we play in it. Machine learning works in a fundamentally different way from traditional programming. Traditional programming, in fact, mirrors the way we make sense of the world: through rules. Take one of the most iconic examples. If you were writing software to recognize handwritten digits, a traditional programmer would tell the computer that a “1” is a vertical line, that an “8” is a larger circle with a smaller circle above it, and so on. This approach can work reasonably well, but its reliance on a Platonic ideal of handwritten digits means the program will misjudge a significant percentage of them. Digits written by mortal hands are never that “perfect”.
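The rule-based approach can be made concrete with a toy sketch. Everything here, including the bitmap format and the threshold, is my own illustrative assumption, not the author's code; the point is only that a hand-written rule captures the ideal "1" and misses a real one.

```python
# A toy rule-based recognizer in the traditional style: the programmer writes
# the rule, the machine only applies it.

def classify_by_rule(grid):
    """grid: list of equal-length strings, '#' = ink, '.' = blank."""
    cols_with_ink = [any(row[c] == "#" for row in grid)
                     for c in range(len(grid[0]))]
    # Hand-written rule: a "1" is a single narrow vertical line,
    # so its ink occupies very few columns.
    if sum(cols_with_ink) <= 2:
        return 1
    return None  # the rules don't cover this shape: the program gives up

ideal_one = [
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]
slanted_one = [  # a real hand drifts; same digit, but the rule misses it
    "....#",
    "...#.",
    "..#..",
    ".#...",
    "#....",
]
```

Running it, the Platonic "1" is recognized while the slanted "1", drawn by a mortal hand, falls outside the rule entirely.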
A new way of doing machine learning
Machine learning models, by contrast, learn from examples. To build a model that recognizes handwritten digits, developers tell the computer nothing of what we humans know about the shapes of digits. Instead, they feed it thousands of sample images of handwritten digits, each different, each correctly labeled with the digit it represents. An algorithm then discovers statistical relationships among the pixels of images that share a label. A run of pixels forming a roughly vertical line will raise the statistical weight of an image being a “1”, lower the probability of its being a “3”, and so on.
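The example-driven approach can be sketched in a few lines. This is a toy nearest-centroid learner of my own devising, not the algorithm behind any real digit recognizer: it is told nothing about digit shapes, only shown labeled bitmaps, and it accumulates per-pixel statistics for each label, exactly the "statistical weights" described above.

```python
# A minimal learner that knows nothing about shapes, only labeled examples.
from collections import defaultdict

def train(examples):
    """examples: list of (pixels, label); pixels is a flat tuple of 0/1.
    Returns per-label average pixel intensities (the statistical weights)."""
    sums, counts = defaultdict(lambda: None), defaultdict(int)
    for pixels, label in examples:
        if sums[label] is None:
            sums[label] = [0.0] * len(pixels)
        for i, p in enumerate(pixels):
            sums[label][i] += p
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(model, pixels):
    """Pick the label whose average image is closest to the input."""
    def dist(lab):
        return sum((p - c) ** 2 for p, c in zip(pixels, model[lab]))
    return min(model, key=dist)

# Tiny 3x3 "digits": a stroke down the middle vs. a ring of ink.
examples = [
    ((0, 1, 0, 0, 1, 0, 0, 1, 0), "1"),
    ((0, 1, 0, 0, 1, 0, 0, 1, 1), "1"),
    ((1, 1, 1, 1, 0, 1, 1, 1, 1), "0"),
    ((1, 1, 1, 1, 0, 1, 1, 1, 0), "0"),
]
model = train(examples)
```

No rule about vertical lines or circles appears anywhere; the regularity emerges from the labeled data alone.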
Unknowable, but effective
In real-world machine learning applications, the number of possible answers can run into the hundreds of millions, the amount of data to be weighed is enormous, and the correlations among data points are so complex that we humans usually cannot grasp them. Human metabolism, for example, is an extremely complex web of interactions and interdependent effects. Imagine, then, a machine learning system built to predict how the human body responds to complex combinations of factors. Call it DeepMetab. It would become the place where doctors, researchers, laypeople, and hypochondriacs go to ask questions about the human organism and to explore ideas about it. Even though we could not understand how it produces its outputs, DeepMetab would be our most important source of knowledge about the human body.
[Image: The combination of AI and healthcare]
As we come to rely more and more on machine learning models (MLMs) like DeepMetab that we cannot understand, we may gradually adopt one of two views. The first says that inexplicability is a price we must often pay for the useful probabilistic outputs these models generate. The second says that the difficulty of interpretation is not a drawback but the truth of the situation. Machine learning models work because they read the world better than we do: by weighing vast amounts of interconnected data, they arrive at insights beyond human reach, without having to explain to us how they got there. Every time a citizen or regulator cries out in despair at not being able to understand how a model works, we are reminded that these models do, in fact, work.
[Image: Concept map for the age of big data]
If machine learning models succeed precisely by giving up on simplifying complexity into rules we can understand, then in every cry of “It works!” we can feel all the tiny things interacting in their interdependence. And it is these tiny things, rattling beneath the harmonious music of cosmic law, that are the true essence. The success of our technology is telling us that the world is the real black box.
Man vs. Machine
From watches to cars, from cameras to thermostats, machine learning is deeply embedded in our daily lives. It recommends videos, tries to identify hate speech, steers vehicles, helps control the spread of disease, and is critical to mitigating the climate crisis. It is not perfect and can amplify social biases, but we keep using it because it works. That machine learning does all this without applying rules to particulars is surprising, even disturbing. We favor rules over examples so strongly that it once seemed crazy to have a machine learning system play Go simply by analyzing lots of games and moves, without knowing the rules. Yet that is how machine learning became the best Go player of all time. In fact, when developers feed a system data from a domain, they often deliberately withhold the interrelationships in that data that we already know.
Overly specific generalizations?
Even people who know only a little about machine learning may balk here, since machine learning models are created precisely by generalizing from data. A handwritten-digit model that fails to generalize beyond its training samples is a failed model: it has overfit. But the generalizations a machine learning model produces differ from the traditional generalizations we use to explain particular situations. We like traditional generalizations because (a) we can understand them; (b) they support deductive conclusions; and (c) we can apply them to specific cases ourselves. Machine learning's generalizations, by contrast, (a) are not always easy to understand; (b) are statistical, probabilistic, and largely inductive; and (c) usually cannot be applied without running the model itself.
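The contrast between overfitting and generalizing can be shown with a deliberately crude pair of "models". Both the data and the two models are my own invented illustration: one memorizes its training points perfectly and is helpless everywhere else; the other fits a least-squares line and can answer for inputs it has never seen.

```python
# Memorizing vs. generalizing, in miniature.

def memorizer(training):
    """'Overfit' model: perfect on its training data, useless off it."""
    table = dict(training)
    return lambda x: table.get(x)  # None for anything unseen

def line_fitter(training):
    """Generalizing model: least-squares line y = a*x + b through the data."""
    n = len(training)
    sx = sum(x for x, _ in training)
    sy = sum(y for _, y in training)
    sxx = sum(x * x for x, _ in training)
    sxy = sum(x * y for x, y in training)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

data = [(0, 0.1), (1, 1.9), (2, 4.1), (3, 5.9)]  # noisy samples of y ~ 2x
memo, line = memorizer(data), line_fitter(data)
```

Asked about x = 1.5, a point not in the training data, the memorizer has nothing to say, while the line predicts a value near 3.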
Furthermore, the generalizations that come out of machine learning models can be startlingly specific: the vascular patterns in a retinal scan might predict an arthritis flare-up, but only when fifty numerical indicators are satisfied, indicators that may themselves depend on one another. It is like trying to explain how one car escaped serious damage in a multi-vehicle pileup: the car had to pass through many specific conditions, but the event cannot be reduced to a comprehensible rule, and no such tangle of conditions could be transferred and applied to another event. Or it is like a clue in a murder case that points to the killer but is valid only for that one case.
[Image: Wall of clues]
Machine learning models do not deny the existence of rules or laws. They simply insist that rules alone are not enough to understand everything that happens in our complex universe. Even if we could know every rule in the world, accidental details interact in ways that outstrip the rules' explanatory power. For example, if you know the laws of gravity and air resistance, the mass of a coin and of the Earth, and the height from which the coin falls, you can calculate how long it takes to hit the ground, and that is usually good enough for practical purposes. But to apply those laws with complete accuracy, you would have to know every factor affecting the fall: which pigeons are disturbing the air currents around the coin, what pull the gravity of distant stars is simultaneously exerting on it. (Did you remember to add the influence of a distant comet?) To apply the laws perfectly, you would need the comprehensive, unattainable knowledge of the universe possessed by Laplace's demon. The traditional Western scientific framework simply overemphasizes what rules can do.
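The "good enough for practical purposes" calculation above is just the free-fall law, t = sqrt(2h / g). A minimal version, ignoring air resistance, pigeons, and comets (note that once g is known, the coin's mass cancels out of the formula entirely):

```python
import math

def fall_time(height_m, g=9.81):
    """Idealized time (seconds) for an object to fall height_m metres,
    ignoring air resistance and every other perturbing detail."""
    return math.sqrt(2 * height_m / g)

# A coin dropped from 1 metre takes roughly 0.45 seconds to land.
t = fall_time(1.0)
```

The law is simple and the answer serviceable; it is only complete accuracy that demands Laplace's demon.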
[Image: The coin flip incident]
None of this is a criticism of the pursuit of scientific laws, or of practical science. Science is empirically grounded and adequate to our needs, although the precision actually achievable forces certain concessions on us. But it should make us wonder: why has the Western world treated unexplainable, chaotic phenomena as mere epiphenomena, assuming there must be laws beneath them that explain them? Why do we ontologically prefer the eternal and unchanging to the ceaseless flow of water or dust?
Rewriting the definition of knowledge
These are common topics in the history of Western philosophy, and they lie far beyond the scope of this essay. But it is undeniable that we are drawn to a world simplified by eternal laws, because such a world can be understood, and therefore predicted and controlled. At the same time, those simple and beautiful laws hide from us the chaos of particular situations, which are shaped not only by the laws themselves but by the state of every other particular situation. Now we have a technology of prediction and control that works directly from the countless tiny factors existing simultaneously and interacting as a whole. It gives us greater control without greater understanding. It succeeds by attending to what lies beyond our understanding.
[Image: The laws of physics]
At the same time, and for the same reason, machine learning may break the spell of certainty as the hallmark of knowledge, because machine learning's results are probabilities. Indeed, a machine learning model that delivers perfectly certain results is a model to be doubted. A machine learning output is inherently inexact; an honest statement of probability includes its own chance of being wrong.
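What it means for a probability to "include its own chance of being wrong" has an operational reading: among all the times a model says "70%", the event should actually happen about 70% of the time. A small sketch of that calibration check, with data invented purely for illustration:

```python
from collections import defaultdict

def calibration(preds):
    """preds: list of (stated_probability, actual_outcome 0 or 1).
    Returns the empirical frequency observed for each stated probability."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p, y in preds:
        totals[p] += 1
        hits[p] += y
    return {p: hits[p] / totals[p] for p in totals}

# Invented predictions: "70%" came true 7 times in 10, "20%" twice in 10.
preds = ([(0.7, 1)] * 7 + [(0.7, 0)] * 3 +
         [(0.2, 1)] * 2 + [(0.2, 0)] * 8)
curve = calibration(preds)
```

A well-calibrated model is honest in exactly this sense: its "70%" is wrong three times out of ten, and says so in advance.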
[Image: The Butterfly Effect]
We now have a mechanism that astonishes us: models that operate by drawing on the interconnections among countless details in an incomprehensibly subtle network. Perhaps we need no longer treat those chaotic swirls as mere appearances not yet properly understood. Perhaps the complexity and cognitive difficulty of all these interacting factors will shake the foundational assumption of Western science that what is most real is what is most fixed, most universal, and most knowable.
[Image: Schematic of the chaotic solution of the three-body problem]
Perhaps we will finally accept that the unimaginably complex connections, accidents, and coincidences among simple events are the true face of the world, and that a brain weighing 1.4 kilograms is not enough to build complete knowledge of it. The world's brute unknowability is blurring the boundaries of our understanding. If that is happening, it is because, through technologies such as machine learning models, we are hearing more of the world's specific, tiny, noisy signals, and those signals are yielding useful, surprising, probabilistic knowledge grounded in the incomprehensible connectedness of all things.
Author: David Weinberger
"By David Weinberger Translation: Clouds bloom and leaves fall Reviewer: Ms. Zhou π Original link: Learn from machine learning"