Baidu’s Ernie 3.0 enables tech beginners to develop creative AI solutions

On April 16, Baidu held the final round of its AI contest, in which participants presented various creations based on Baidu’s Wenxin large-scale pretrained model, Ernie 3.0 Edition.

The event drew over 2,000 contestants and collected over 300 creative applications of Wenxin, spanning industries such as education, healthcare, entertainment, technology, and mental health. In the final round, three projects were selected as winners: “Shuowen”, which helps users interpret traditional Chinese readings; “Tuyan”, which creates various styles of literature based on pictures; and “AI essay title generator”, a project by Bilibili content creator Zihao that generates essay titles based on 250-word summaries.

Huanyi, one of the youngest contestants, reached the final round with an “auto-generation of traditional Chinese-styled copywriting” project. Currently a sophomore at Harbin Institute of Technology majoring in AI, Huanyi spent six days developing his copywriting program, which can generate traditional Chinese-styled texts based on input pictures. According to Huanyi, despite being inexperienced, he was able to learn on the fly, thanks to the ease of use of Baidu’s EasyDL platform.

EasyDL, an AI development platform for beginners, offers step-by-step instructions for building and training a model. For reference, it took me around five minutes to create an image recognition model, which was able to identify my mouse with 26.87% accuracy.

“Artificial intelligence and large-scale models should be open to the public, and only when the threshold is so low that everyone can use them conveniently, can there be a real large-scale outbreak of creativity,” said Wu Tian, Vice President of Baidu, who referred to AI as the fourth industrial revolution.

Baidu’s Wenxin large-scale models were first released in March 2019, and quickly grew into the world’s first knowledge-enhanced 100-billion-scale pre-trained language model and the largest Chinese-language monolithic model. According to a research paper published by Baidu researchers, Wenxin’s latest ERNIE 3.0 Titan “outperforms the state-of-the-art models on 68 NLP datasets,” covering tasks such as machine reading comprehension, semantic similarity, text classification, and closed-book question answering.

The Wenxin series comprises the bilingual NLP model Plato XL, the bilingual scenario OCR pre-trained model VIMER-StrucTexT, and the world’s largest unified generative pre-training framework for bidirectional image-text generation with a transformer model. Wenxin’s Ernie 3.0 model has 10 billion parameters, while Ernie 3.0 Titan has up to 260 billion parameters and topped the leaderboard of SuperGLUE’s natural language understanding benchmark. SuperGLUE, which offers a series of benchmark tasks for modern language-understanding AI, was jointly launched by Meta, Google’s DeepMind, the University of Washington, and New York University in 2019.

A number of Baidu’s own products are powered by the Wenxin large-scale model, including its search engine, its smart speakers, and Baidu Maps. Wenxin has attracted over 60,000 developers, supported hundreds of enterprises and institutions, and been applied in hundreds of scenarios, including compartmentalizing medical records. Traditionally, medical records are handled by doctors who might not have knowledge of all medical departments. On average, only 10% of medical records can be accurately identified and matched by doctors, according to Baidu’s spokesperson at the press conference. Applying the Wenxin large-scale model can speed up the process, scanning, analyzing, and organizing almost 100% of the records.