The Pulitzer Prize-winning 1979 book Gödel, Escher, Bach inspired a great many computer scientists, but few as much as Melanie Mitchell. After reading the 777-page tome, Mitchell, then a high school math teacher in New York, decided that she "needed" to work in artificial intelligence. She soon tracked down the book's author, AI researcher Douglas Hofstadter, and persuaded him to give her an internship. She had taken only a few computer science courses at that point, but he seemed impressed by her boldness and unconcerned about her credentials.
Mitchell prepared a "last minute" graduate school application and joined Hofstadter's new laboratory at the University of Michigan in Ann Arbor. The two worked closely for the next six years on Copycat, a computer program that, in the words of its co-creators, was designed to "discover insightful analogies and do so in a psychologically realistic way."
The analogies Copycat came up with were between simple patterns of letters, akin to the analogies on standardized tests. One example: "If the string 'abc' changes to the string 'abd', what does the string 'pqrs' change to?" Hofstadter and Mitchell believed that understanding this cognitive process of analogy, how humans make abstract connections between similar ideas, perceptions and experiences, would be essential to unlocking humanlike artificial intelligence.
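Copycat's actual architecture is nondeterministic and perception-based, but the letter-string domain itself can be illustrated with a much simpler sketch. The following toy code (all names hypothetical, not part of Copycat) considers only one candidate rule, "replace the last letter with its alphabetic successor," infers it from the example pair and applies it to a new string:

```python
# Toy sketch of the letter-string analogy domain. This is NOT Copycat's
# architecture; it hard-codes a single candidate rule for illustration.

def infer_rule(source: str, target: str):
    """Return a transformation function if `target` is `source` with its
    last letter replaced by that letter's successor; otherwise None."""
    if (len(source) == len(target)
            and source[:-1] == target[:-1]
            and ord(target[-1]) == ord(source[-1]) + 1):
        return lambda s: s[:-1] + chr(ord(s[-1]) + 1)
    return None

rule = infer_rule("abc", "abd")
if rule is not None:
    print(rule("pqrs"))  # applies the same abstract rule: prints "pqrt"
```

The interesting part of the real problem, of course, is that humans entertain many competing rules at once ("replace the last letter with 'd'" versus "increment the last letter") and settle on the one that feels most insightful.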
Mitchell maintains that analogy can go much deeper than exam-style pattern matching. "It's understanding the essence of one situation by mapping it to another situation that is already understood," she said. "If you tell me a story and I say, 'Oh, the same thing happened to me,' literally the same thing did not happen to me that happened to you, but I can make a mapping that makes it seem very analogous. It's something we humans do all the time without even realizing we're doing it. We're swimming in this sea of analogy constantly."
As the Davis Professor of Complexity at the Santa Fe Institute, Mitchell has broadened her research beyond machine learning. She currently leads SFI's Foundations of Intelligence in Natural and Artificial Systems project, which will convene a series of interdisciplinary workshops over the next year examining how biological evolution, collective behavior (such as that of social insects like ants) and a physical body all contribute to intelligence. But the role of analogy looms larger than ever in her work, especially in artificial intelligence, a field whose major advances over the past decade have been driven by deep neural networks, a technology that mimics the layered organization of neurons in mammalian brains.
"Today's state-of-the-art neural networks are very good at certain tasks," she said, "but they're very bad at taking what they've learned in one kind of situation and transferring it to another," which is the essence of analogy.
Quanta spoke with Mitchell about how AI can make analogies, what the field knows about them so far, and where it needs to go next. The interview has been condensed and edited for clarity.
Why is analogy so important to artificial intelligence?
It's a fundamental mechanism of thought that will help AI get to where we want it to be. Some people say that the key to AI is being able to predict the future, or to have common sense, or to retrieve memories that are useful in the current situation. But in each of these things, analogy is central.
For example, we want self-driving cars, but one of the problems is that if they face a situation even slightly different from what they've been trained on, they don't know what to do. How do we humans know what to do in situations we haven't encountered before? We use analogies to our previous experience. That's something these AI systems will need to be able to do in the real world.
But you've also written that analogy is "an understudied area in AI." If it's so fundamental, why is that the case?
One reason people haven't studied it in depth is that they haven't recognized its fundamental importance to cognition. Early AI focused on logic and on programming in rules for behavior. More recently, the focus has been on learning from lots and lots of examples, and then assuming the system will be able to generalize to things it hasn't seen before using just the statistics of what it has already learned. The hope was that the abilities to generalize and abstract would emerge from the statistics, but it hasn't worked as well as people hoped.
You can show a deep neural network millions of pictures of bridges, for example, and it can probably recognize a new picture of a bridge over a river or something. But it can never abstract the notion of "bridge" to, say, our concept of bridging the gender gap. These networks, it turns out, don't learn how to abstract. Something is missing. People are only grappling with that now.
Melanie Mitchell, the Davis Professor of Complexity at the Santa Fe Institute, has worked on digital minds for decades. She says AI will never be truly "intelligent" until it can do something uniquely human: make analogies. Image credit: Emily Buder/Quanta Magazine; Gabriella Max/Quanta Magazine
They'll never learn to abstract?
There are new approaches, such as meta-learning, where machines can better "learn to learn," or self-supervised learning, where a system like GPT-3 learns to fill in a sentence with a missing word, which lets it generate language very, very convincingly. Some people argue that a system like that, given enough data, will eventually learn this abstraction ability. But I don't think so.
You've described this limitation as the "barrier of meaning": AI systems can mimic understanding under certain conditions, but become brittle and unreliable outside of them. Why do you think analogy is our way over this barrier?
My feeling is that getting past the brittleness problem will require meaning. That's what ultimately causes brittleness: these systems don't understand, in any humanlike sense, the data they're dealing with.
The word "understand" is one of these suitcase words that no one agrees on the meaning of; it's almost a placeholder for mental phenomena we can't yet explain. But I think this mechanism of abstraction and analogy is key to what we humans call understanding. It's a mechanism by which understanding occurs: we're able to take something we already know and map it onto something new.
So analogy is a way for organisms to stay cognitively flexible, instead of acting like preprogrammed robots?
I think to some extent, yes. Analogy isn't just something we humans do. Some animals are rather robotic, but other species can take prior experiences and map them onto new experiences. Maybe that's one way to put intelligence on a spectrum across different kinds of living systems: how far can you go in making more abstract analogies?
One theory of why humans have this particular kind of intelligence is that it's because we're so social. One of the most important things you have to do is model what other people are thinking, understand their goals and predict what they're going to do. And that's something you do by analogy: you can put yourself in the other person's position and map your own mind onto theirs. This "theory of mind" is something people in AI talk about all the time. It's essentially a way of making an analogy.
Your Copycat system is an early attempt to do this on a computer. Is there anyone else?
"Structure mapping" work in AI focused on logic-based representations of situations and on making mappings between them. Ken Forbus and others used the famous analogy [made by Ernest Rutherford in 1911] between the solar system and the atom. They would have a set of sentences [in a formal notation called predicate logic] describing the two situations, and they would map them based not on the content of the sentences but on their structure. That notion is very powerful, and I think it's right. When humans try to understand similarities, we attend more to relationships than to specific objects.
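The core idea can be sketched in a few lines. This is not the actual Structure-Mapping Engine (and the relation names below are made up for illustration); it just represents each situation as a set of (relation, arg1, arg2) facts and searches for an object-to-object mapping that preserves the relational structure, ignoring what the objects themselves are:

```python
# Minimal sketch of the structure-mapping idea (hypothetical facts,
# not the SME implementation). A mapping is good if every base fact
# has a same-named counterpart among the target facts.
from itertools import permutations

solar_system = {
    ("revolves_around", "earth", "sun"),
    ("attracts", "sun", "earth"),
}
atom = {
    ("revolves_around", "electron", "nucleus"),
    ("attracts", "nucleus", "electron"),
}

def objects(facts):
    """Collect the distinct objects mentioned in a set of facts."""
    return sorted({arg for _, a, b in facts for arg in (a, b)})

def find_mapping(base, target):
    """Brute-force search for an object mapping that preserves structure."""
    base_objs, target_objs = objects(base), objects(target)
    if len(base_objs) != len(target_objs):
        return None
    for perm in permutations(target_objs):
        m = dict(zip(base_objs, perm))
        if all((rel, m[a], m[b]) in target for rel, a, b in base):
            return m
    return None

print(find_mapping(solar_system, atom))
# {'earth': 'electron', 'sun': 'nucleus'}
```

Note that the match depends only on the pattern of relations, which is exactly Mitchell's point: the symbol "revolves_around" carries no internal model of what revolving means.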
Why didn't these approaches catch on?
The whole learning issue was largely left out of these systems. Structure mapping would take these very human-engineered statements, like "the Earth revolves around the sun" and "the electron revolves around the nucleus," and map them onto each other, but with no internal model of what "revolves around" means. It was just a symbol. Copycat worked well with strings of letters, but what we've lacked is an answer to the question: How do you scale this up and generalize it to the domains we really care about?
Deep learning is famously good at scaling up. Has it been more effective at generating meaningful analogies?
There's a view that deep neural networks do this magic in between their input and output layers. If they can be better than humans at recognizing different breeds of dogs, which they are, they should be able to solve these very simple analogy problems. So people would create a big data set to train and test their neural network on, and publish a paper saying, "Our method gets 80% accuracy on this test." And somebody else would reply, "Wait a minute, your data set has some weird statistical properties that let the machine learn to solve it without being able to generalize. Here's a new data set that your machine does terribly on, but ours does well on." And this just goes on and on.
The problem is, if you have to train it on thousands of examples, you've already lost. That's not what abstraction is about. It's about what people in machine learning call "few-shot learning," meaning you learn from a very small number of examples. That's what abstraction is really about.
So what's missing? Why can't we just snap these approaches together like so many Lego bricks?
We don't have the instruction manual that tells you how to do that! But I do think we're going to have to put them together. That's the frontier of this research: What are the key insights from all of these approaches, and how can they complement each other?
A lot of people are very interested in the Abstraction and Reasoning Corpus [ARC], a very challenging few-shot learning task built around the "core knowledge" that humans are essentially born with. We know that the world should be parsed into objects, and we know something about the geometry of space, like something being over or under something [else]. In ARC, one grid of colors changes into another grid of colors in a way that humans can describe in terms of this core knowledge, for example: "All the squares of one color move to the right, and all the squares of the other color move to the left." It shows you an example like this and then asks you to do the same thing with another grid of colors.
I think of it as an analogy challenge. You're trying to find some abstract description of what changed from one image to the next, and you can't learn any weird statistical correlations because all you have are two examples. How to get machines to learn and reason with the core knowledge that babies have: that's something none of the systems I've mentioned so far can do. It's why none of them can deal with the ARC data set. It's kind of a holy grail.
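To make the flavor of such a task concrete, here is a toy ARC-style transformation (a made-up example, not an actual ARC puzzle). The "core knowledge" rule is the one described above: in each row, cells of one color slide to the left edge and cells of the other color slide to the right edge. A program that has inferred the rule could apply it to any new grid:

```python
# Toy ARC-style transformation (hypothetical example, not from the
# real ARC data set). Grids are lists of rows; 0 = blank, 1 and 2
# are two colors. Rule: color 1 slides left, color 2 slides right.

def transform(grid):
    out = []
    for row in grid:
        lefts = [c for c in row if c == 1]     # color-1 cells go left
        rights = [c for c in row if c == 2]    # color-2 cells go right
        blanks = [0] * (len(row) - len(lefts) - len(rights))
        out.append(lefts + blanks + rights)
    return out

example_in = [[0, 1, 2, 0],
              [2, 0, 0, 1]]
print(transform(example_in))  # [[1, 0, 0, 2], [1, 0, 0, 2]]
```

The hard part, which this sketch skips entirely, is inferring the rule itself from just one or two input-output pairs rather than having it handed to you.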
If babies are born with this "core knowledge," does that mean AI needs a body like ours to make this kind of analogy?
That's the million-dollar question. It's very controversial, and the AI community has no consensus on it. My intuition is that, yes, we won't be able to get to analogy [in AI] without some form of embodiment. Having a body may be essential, because some of these visual problems require you to think about them in three dimensions. For me, that has to do with having lived in the world and moved around in it, understanding how things relate to each other in space. I don't know whether a machine has to go through that stage. I think it probably does.
Reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.