WOWZA. This could be to GPT-3 what GPT-3 was to Turing-NLG.

In the race to build the underlying technologies that can power the next wave of the AI revolution, a Chinese lab just toppled OpenAI, the venerated US-based research lab, in terms of who can train a gigantic deep learning model with the most parameters. As for whether there is a race at all: ranking members of the lab, at least, believe so.
Unlike conventional deep learning models, which are usually task-specific, Wudao is a multi-modal model trained to tackle both text and images, two dramatically different sets of problems. At BAAI's annual academic conference on Tuesday, the institution demonstrated Wudao performing tasks such as natural language processing, text generation, image recognition, and image generation.
The model is capable of writing poems and couplets in traditional Chinese styles, answering questions, writing essays, generating alt text for images, and generating images from natural-language descriptions with a decent level of photorealism. It can even power "virtual idols" with the help of XiaoIce, a Chinese company spun off from Microsoft, so there is voice support too, in addition to text and images.
Very, very fascinating. I'd love to see it in action, but I wouldn't be surprised if it's all in Chinese and thus not understandable to me.
That said, it's not blowing my mind just yet.
For starters: zero third-party confirmation. Nobody outside the lab has seen it in action, and anyone covering it should want to.
Second, it's a mixture-of-experts model, not a dense one like GPT-3. Think back to Google's 1.6-trillion-parameter Switch Transformer, which is barely, if at all, better than GPT-3: more parameters, but not more compute per token, so it's not going to set any records.
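To make the parameters-versus-compute point concrete, here's a minimal sketch of a mixture-of-experts feed-forward layer with top-1 routing in the style of the Switch Transformer. This is an illustration, not Wudao's or Google's actual code; all names and sizes are made up. The key property: total parameters grow linearly with the number of experts, but each token passes through exactly one expert, so per-token compute stays flat.

```python
import numpy as np

# Illustrative mixture-of-experts (MoE) feed-forward layer with top-1
# (Switch-style) routing. Sizes and names are hypothetical.
rng = np.random.default_rng(0)
d_model, d_ff, n_experts = 64, 256, 8

# Each expert is an independent feed-forward block, so total parameters
# grow linearly with n_experts.
experts_w1 = rng.standard_normal((n_experts, d_model, d_ff)) * 0.02
experts_w2 = rng.standard_normal((n_experts, d_ff, d_model)) * 0.02

# Router: a small linear layer that scores every expert for each token.
router_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x):
    """x: (n_tokens, d_model). Each token is routed to exactly ONE expert,
    so per-token FLOPs match a single dense FFN regardless of n_experts."""
    logits = x @ router_w            # (n_tokens, n_experts)
    chosen = logits.argmax(axis=-1)  # top-1 expert index per token
    out = np.empty_like(x)
    for e in range(n_experts):
        mask = chosen == e
        if mask.any():
            h = np.maximum(x[mask] @ experts_w1[e], 0.0)  # ReLU FFN
            out[mask] = h @ experts_w2[e]
    return out

x = rng.standard_normal((16, d_model))
y = moe_layer(x)

dense_params = d_model * d_ff + d_ff * d_model
moe_params = n_experts * dense_params
print(f"params: dense FFN {dense_params:,} vs MoE {moe_params:,} "
      f"(~{n_experts}x more), yet per-token compute is unchanged")
```

Run it and the parameter counts differ by roughly 8x while the matrix multiplies any single token sees are identical, which is exactly why a 1.6-trillion-parameter MoE is not "ten GPT-3s" of compute.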
This one is certainly far more impressive than Google's Switch Transformer, if only because it is claimed to be multimodal with reasonable abilities in that area. That alone makes it more interesting than GPT-3. But the parameter wars are probably blinding way too many tech bloggers into declaring "GPT-3 has been totally trumped" when, until we get more information, there's no reason to believe it's qualitatively superior in any regard except pure generalization (which, to be fair, is important).
Yeah, remember the Bit Wars of the '90s, when everyone talked about bits without understanding what they meant? The current obsession with parameter count is a little like that. A bigger number can be better, but if the design isn't optimized and efficient, you end up with an Atari Jaguar rather than a Dreamcast.