Models for natural language processing use statistics to collect a lot of information about word meanings.
In “Through the Looking-Glass,” Humpty Dumpty says scornfully, “When I use a word, it means just what I choose it to mean—neither more nor less.” Alice replies, “The question is whether you can make words mean so many different things.”
Word meanings have long been a subject of research. To understand the meaning of even a single word, the human mind must draw on a complex network of flexible, detailed knowledge.
Now a new question about word meaning has come to light: can artificial intelligence systems mimic human thought processes and understand words the same way people do? A recent study by researchers from UCLA, MIT, and the National Institutes of Health addresses that question.
The study, published in the journal Nature Human Behaviour, shows that artificial intelligence systems can indeed acquire very complex word meanings. The researchers also found a simple method for extracting that sophisticated knowledge: the AI system they examined represents word meanings in a way that correlates with human judgment.
The AI system explored by the authors has been widely used to analyze word meaning over the last decade. It learns word meanings by “reading” vast amounts of material on the internet, containing tens of billions of words.
When words frequently occur together — “table” and “chair,” for example — the system learns that their meanings are related. And when pairs of words rarely occur together — “table” and “planet,” say — it learns that their meanings are quite different.
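The co-occurrence idea can be sketched in a few lines of code. This is a deliberately minimal toy (a four-sentence corpus and raw sentence-level counts), not the study's actual system, which trains on billions of words; but it shows why “table” and “chair” end up with similar vectors while “table” and “planet” do not.

```python
from collections import Counter
from itertools import combinations
import math

# Toy corpus; real systems train on tens of billions of words.
corpus = [
    "the table and chair sat in the room",
    "she pushed the chair under the table",
    "the planet orbits a distant star",
    "a star lights the planet at night",
]

# Count how often each pair of distinct words shares a sentence.
cooc = Counter()
vocab = set()
for sentence in corpus:
    words = set(sentence.split())
    vocab.update(words)
    for a, b in combinations(sorted(words), 2):
        cooc[(a, b)] += 1

def vector(word):
    """Represent a word by its co-occurrence counts with every vocab word."""
    return [cooc.get(tuple(sorted((word, other))), 0) for other in sorted(vocab)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# "table" and "chair" co-occur often, so their vectors are more similar
# than those of "table" and "planet", which never share a sentence.
print(cosine(vector("table"), vector("chair")) >
      cosine(vector("table"), vector("planet")))
```

Modern systems such as word2vec or GloVe replace these raw counts with learned, dense vectors, but the underlying signal is the same: words that keep the same company get similar representations.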
That approach seems like a logical starting point, but consider how poorly people would understand the world if the only way to grasp meaning were to count how often words occur near each other, without any ability to interact with other people and our environment.
Idan Blank, a UCLA assistant professor of psychology and linguistics and the study’s co-lead author, said the researchers set out to learn what the system knows about the objects it reads about, and what kind of “common sense” it possesses.
Before starting the research, Blank said, the system appeared to have one major limitation: “As far as the system is concerned, every two words have only one numerical value that represents how similar they are.”
In contrast, human knowledge is more detailed and complex.
“Consider our knowledge of dolphins and alligators,” Blank said. “If we compare the two on a scale of size, from ‘small’ to ‘large,’ they are quite similar. In terms of their intelligence, they are quite different. In terms of the danger they pose to us, on a scale from ‘safe’ to ‘dangerous,’ they vary greatly. So the meaning of a word depends on the context.
“We wanted to ask whether this system actually recognizes these subtle differences — whether its idea of similarity is flexible in the same way it is for people.”
To find out, the authors developed a technique they call “semantic projection.” One can draw a line between the model’s representations of the words “big” and “small,” for example, and see where the representations of different animals fall on that line.
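The line-drawing idea above can be sketched numerically: take the difference between the “big” and “small” vectors as an axis, then project each animal’s vector onto it. The embeddings below are small hypothetical vectors invented for illustration; a real application would pull them from a trained model such as GloVe or word2vec.

```python
import numpy as np

# Hypothetical toy embeddings, chosen only to illustrate the geometry.
emb = {
    "small":   np.array([0.10, 0.90, 0.20]),
    "big":     np.array([0.90, 0.10, 0.80]),
    "mouse":   np.array([0.15, 0.80, 0.25]),
    "dolphin": np.array([0.50, 0.50, 0.50]),
    "whale":   np.array([0.85, 0.15, 0.75]),
}

def semantic_projection(word, neg="small", pos="big"):
    """Project a word's vector onto the line from `neg` to `pos`.

    Returns a scalar: 0 is at the `neg` end, 1 at the `pos` end.
    """
    axis = emb[pos] - emb[neg]  # the "size" direction in embedding space
    return float(np.dot(emb[word] - emb[neg], axis) / np.dot(axis, axis))

for animal in ["mouse", "dolphin", "whale"]:
    print(animal, round(semantic_projection(animal), 2))
```

With these toy vectors, the mouse lands near the “small” end of the line, the whale near the “big” end, and the dolphin in between. Swapping in a different word pair — “safe” and “dangerous,” say — reuses the same projection on a different axis, which is what lets one set of embeddings answer many different kinds of questions.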
Using that method, the scientists studied 52 groups of words to see whether the system could learn to sort meanings — for example, judging animals by their size or by the danger they pose to people, or classifying U.S. states by weather or by overall wealth.
Other word groups included terms related to clothing, professions, sports, mythological creatures, and first names. Each category was assigned multiple contexts, or dimensions — size, danger, intelligence, age, and speed, for example.
The researchers found that, across many objects and contexts, their method closely matched human intuition. (To make that comparison, they also asked cohorts of 25 people each to make similar assessments about each of the 52 word groups.)
Remarkably, the system learned that the names “Betty” and “George” are similar in terms of being relatively “old” but represent different genders, and that “weightlifting” and “fencing” both typically take place indoors but differ in how much skill they require.
“It’s a pretty simple method and completely intuitive,” Blank said. “The line between ‘big’ and ‘small’ is like a mental scale, and we put animals on that scale.”
Blank said he never expected the technique to work but was happy when it did.
“It turns out that this machine learning system is much smarter than we thought; it contains very complex forms of knowledge, and this knowledge is organized in a very intuitive structure,” he said. “Just by keeping track of which words go together in the language, you can learn a lot about the world.”
Reference: “Semantic projection recovers rich human knowledge of multiple object features from word embeddings” by Gabriel Grand, Idan Asher Blank, Francisco Pereira, and Evelina Fedorenko, 14 April 2022, Nature Human Behaviour.
The study was funded by the Office of the Director of National Intelligence, Intelligence Advanced Research Projects Activity through the Air Force Research Laboratory.