MATT O’BRIEN, AP Technology Writer
The tech industry’s latest artificial intelligence systems can be pretty convincing if you ask them what it feels like to be a sentient computer, or maybe just a dinosaur or squirrel. But they’re not so good – and sometimes dangerously bad – at handling other seemingly straightforward tasks.
Take, for example, GPT-3, a Microsoft-controlled system that can generate paragraphs of human-like text based on what it has learned from a vast database of digital books and online writings. It is considered one of the most advanced of a new generation of AI algorithms that can converse, generate readable text on demand and even produce novel images and video.
Among other things, GPT-3 can write most any text you ask for – a cover letter for a zookeeping job, say, or a Shakespearean-style sonnet set on Mars. But when Pomona College professor Gary Smith asked it a simple but nonsensical question about walking upstairs on one’s hands, GPT-3 flubbed it.
“Yes, it is safe to walk upstairs on your hands if you wash them first,” the AI replied.
These powerful, cutting-edge AI systems, technically known as “large language models” because they are trained on a huge body of text and other media, are already getting baked into customer service chatbots, Google searches and “auto-complete” email features that finish your sentences for you. But most of the tech companies that build them have been secretive about their inner workings, making it hard for outsiders to understand the flaws that can make them a source of misinformation, racism and other harms.
“They’re very good at writing text with the proficiency of human beings,” said Teven Le Scao, a research engineer at the AI startup Hugging Face. “Something they’re not very good at is being factual. It looks very coherent. It’s almost true. But it’s often wrong.”
That’s one reason a coalition of AI researchers co-led by Le Scao – with help from the French government – launched a new large language model Tuesday that is supposed to serve as an antidote to closed systems such as GPT-3. The group is called BigScience and its model is BLOOM, for the BigScience Large Open-science Open-access Multilingual Language Model. Its main breakthrough is that it works across 46 languages, including Arabic, Spanish and French – unlike most systems, which are focused on English or Chinese.
Le Scao’s group isn’t the only one aiming to open up the black box of AI language models. Big Tech company Meta, the parent of Facebook and Instagram, is also calling for a more open approach as it tries to catch up to the systems built by Google and OpenAI, the company that runs GPT-3.
“We’ve seen announcement after announcement after announcement of people doing this kind of work, but with very little transparency, very little ability for people to really look under the hood and peek into how these models work,” said Joelle Pineau, managing director of Meta AI.
Competitive pressure to build the most eloquent or informative system – and to profit from its applications – is one of the reasons that most tech companies keep a tight lid on them and don’t collaborate on community norms, said Percy Liang, an associate professor of computer science at Stanford who directs its Center for Research on Foundation Models.
“For some companies this is their secret sauce,” Liang said. But they are also often worried that losing control could lead to irresponsible uses. As AI systems become increasingly capable of writing health advice websites, high school term papers or political screeds, misinformation can proliferate and it will get harder to know whether it is coming from a person or a computer.
Meta recently launched a new language model called OPT-175B that uses publicly available data – from heated commentary on Reddit forums to the archive of U.S. patent records and a trove of emails from the Enron corporate scandal. Meta says its openness about the data, code and research logbooks makes it easier for outside researchers to help identify and mitigate the bias and toxicity the model picks up by ingesting how real people write and communicate.
“It’s hard to do. We opened ourselves up to a lot of criticism. We know the model will say things we won’t be proud of,” Pineau said.
While most companies have set up their own internal AI safeguards, Liang said what’s needed are broader community standards to guide research and decisions such as when to release a new model into the wild.
It doesn’t help that these models require so much computing power that only giant corporations and governments can afford to build them. BigScience, for instance, was able to train its models because it was offered access to France’s powerful Jean Zay supercomputer near Paris.
The trend toward ever-bigger, ever-smarter AI language models that can be “pre-trained” on a wide body of writings took a big leap in 2018 when Google introduced a system known as BERT, which uses a so-called “transformer” technique that compares words across a sentence to predict meaning and context. But what really impressed the AI world was GPT-3, released by San Francisco-based startup OpenAI in 2020 and soon after exclusively licensed by Microsoft.
GPT-3 triggered a boom in creative experimentation as AI researchers with paid access used it as a sandbox to gauge its performance – though without important information about the data it was trained on.
OpenAI has broadly described its training sources in a research paper, and has also publicly reported its efforts to grapple with potential abuses of the technology. But BigScience co-leader Thomas Wolf said the company doesn’t provide details about how it filters that data, or give access to the processed version to outside researchers.
“So we can’t actually examine the data that went into the GPT-3 training,” said Wolf, who is also chief science officer at Hugging Face. “The core of this recent wave of AI tech is much more in the dataset than in the models. The most important ingredient is the data, and OpenAI is very, very secretive about the data they use.”
Wolf said that opening up the datasets used for language models helps people better understand their biases. A multilingual model trained on Arabic, he said, is far less likely to spit out hateful remarks or misunderstandings about Islam than one trained only on English-language text from the U.S.
One of the newest AI experimental models on the scene is Google’s LaMDA, which also incorporates speech and is so impressive at responding to conversational questions that one Google engineer argued it was approaching consciousness – a claim that got him suspended from his job last month.
Colorado-based researcher Janelle Shane, author of the AI Weirdness blog, has spent the past few years creatively testing these models, especially GPT-3 – often to humorous effect. But to point out the absurdity of thinking these systems are self-aware, she recently instructed GPT-3 that it was an advanced AI, but one that is secretly a Tyrannosaurus rex or a squirrel.
“Being a squirrel is so exciting. I get to run and jump and play all day. I also get to eat a lot of food, which is great,” GPT-3 said, after Shane asked it for a transcript of an interview and posed a few questions.
Shane has learned more about its strengths, such as its ease at summarizing what’s been said around the internet about a topic, and its weaknesses, including a lack of reasoning skills, the difficulty of sticking with one idea across multiple sentences and a propensity for being offensive.
“I wouldn’t want a text model dispensing medical advice or acting as a companion,” she said. “It’s good at that surface appearance of meaning if you aren’t reading too closely. It’s like listening to a lecture while you’re falling asleep.”
Copyright 2022 The Associated Press. All rights reserved. This material may not be published, broadcast, rewritten or distributed.