Baidu’s AI Assistant DuerOS Seeks to Break Circle of General-Purpose Use

BEIJING (PingWest) — Jing Kun, Baidu’s corporate vice president and general manager of the company’s Smart Living Group (SLG), introduced recent developments in Xiaodu, Baidu’s smart speaker product line powered by its AI voice assistant DuerOS, and shared his thoughts on the prospects for smart voice assistants at a press conference on Wednesday, July 8 in Beijing.

“The smart voice assistant sector is at the dawn of an explosive growth,” Jing said, adding that he thought the core concept for the next phase of development was "breaking the circle of simply general-purpose use."

There are two ways to do so, Jing explained. One is tailoring features to specific groups of users, such as children and the elderly, to meet their needs. For children, Xiaodu comes with more education and cartoon apps pre-installed to help with their learning and offer suitable entertainment. For elderly users, Xiaodu has focused on more precise automatic speech recognition (ASR).

The other way is to extend Xiaodu’s application scenarios beyond home use. Conventionally, smart speakers and smart screens are mainly used at home, forming an IoT-based ecosystem, but Jing said that is only one part of the whole DuerOS ecosystem. Xiaodu devices have also been installed by many hotel groups, including Shimao Hotels and Huazhu Hotels Group.

Another example is Xiaodu’s partnership with Xiaotiancai, whose watch phone is designed specifically for children, to provide them with smarter educational and everyday guidance.

Baidu will also launch a mobile device equipped with DuerOS in the second half of 2020.

Another part of the ecosystem involves partnerships with content and service providers, such as short video giants Douyin and Kuaishou, video platform Bilibili, and fitness app Keep. These apps and platforms would provide Xiaodu-specific content to device users.

Jing said the SLG has also been stepping up its research on speech interaction technologies. In one breakthrough, Xiaodu can now synthesize speech in a parent’s voice from only 10 to 20 sentences of recorded audio. This feature allows Xiaodu to read stories to children so that parents do not have to stay up late, or to stand in for parents when they are away on trips.