Meta’s Challenge to OpenAI—Give Away a Massive Language Model
Meta is giving away some of the family jewels: That’s the gist of an announcement from the company formerly known as Facebook this week. In a blog post on the Meta AI site, the company’s researchers announced that they’ve created a massive and powerful language AI system and are making it available free to all researchers in the artificial-intelligence community. Meta describes the move as an effort to democratize access to a powerful kind of AI—but some argue that not very many researchers will actually benefit from this largesse. And even as these models become more accessible to researchers, many questions remain about the path to commercial use.
Large language models are one of the hottest things in AI right now. Models like OpenAI’s GPT-3 can generate remarkably fluid and coherent text in just about any format or style: They can write convincing news articles, legal summaries, poems, and advertising copy, or hold up their end of a conversation as customer-service chatbots or video-game characters. GPT-3, which broke the mold with its 175 billion parameters, is available to academic and commercial entities only via OpenAI’s application and vetting process.
Meta’s Open Pretrained Transformer (known as OPT-175B) matches GPT-3 with 175 billion parameters of its own. Meta is offering the research community not only the model itself, but also its codebase and extensive notes and logbooks about the training process. The model was trained on 800 gigabytes of data from five publicly available data sets, which are described in the “data card” that accompanies a technical paper posted by the Meta researchers to the arXiv online preprint server.
Joelle Pineau, director of Meta AI Research Labs, tells IEEE Spectrum that she expects researchers to make use of this treasure trove in several ways. “The first thing I expect [researchers] to do is to use it to build other types of language-based systems, whether it’s machine translation, a chatbot, something that completes text—all of these require this kind of state-of-the-art language model,” she says. Rather than training their own language models from scratch, Pineau says, they can build applications and run them “on a relatively modest compute budget.”
The second thing she expects researchers to do, Pineau says, is “pull it apart” to examine its flaws and limitations. Large language models like GPT-3 are famously capable of generating toxic language full of stereotypes and harmful bias; that troubling tendency is a result of training data that includes hateful language found in Reddit forums and the like. In their technical paper, Meta’s researchers describe how they evaluated the model on benchmarks related to hate speech, stereotypes, and toxic-content generation, but Pineau says “there’s so much more to be done.” She adds that the scrutiny should be done “by community researchers, not inside closed research labs.”
The paper states that “we still believe this technology is premature for commercial deployment,” and says that by releasing the model with a noncommercial license, Meta hopes to facilitate the development of guidelines for responsible use of large language models “before broader commercial deployment occurs.”
Within Meta, Pineau acknowledges that there’s a lot of interest in using OPT-175B commercially. “We have a lot of groups that deal with text,” she notes, that might want to build a specialized application on top of the language model. It’s easy to imagine product teams salivating over the technology: It could power content-moderation tools or text translation, could help suggest relevant content, or could generate text for the creatures of the metaverse, should it truly come to pass.
There have been other efforts to make an open-source language model, most notably from EleutherAI, an association that released a 20-billion-parameter model in February. Connor Leahy, one of the founders of EleutherAI and founder of an AI startup called Conjecture, calls Meta’s move a good step for open science. “Especially the release of their logbook is unprecedented (to my knowledge) and very welcome,” he tells Spectrum in an email. But he notes that Meta’s conditional release, making the model available only on request and with a noncommercial license, “falls short of truly open.” EleutherAI doesn’t comment on its plans, but Leahy says the group will continue working on its own language AI, and adds that OPT-175B will be helpful for some of its research. “Open research is synergistic in that way,” he says.