XiaoIce Chatbot

Microsoft's Chinese Spin-off Wants Everyone to Have a Virtual Companion Like Samantha in 'Her'

Chen Du

posted on August 21, 2020 1:37 am

If you think the business-centric, developer-focused, "cloud-first" Microsoft wanting in on the TikTok acquisition was strange, then your mind would be blown seeing what XiaoIce (小冰), Microsoft's recent Chinese spin-off, just announced.

(Editor's note: Just a heads-up, there're going to be many XiaoIce's in this article. They may refer to the company, the chatbot character, the technology framework, etc.)

Following an energetic, melodious, and definitely Hatsune Miku-like music video featuring XiaoIce and Rinna, the company's two chatbot characters/virtual idols for the Chinese and Japanese markets, the XiaoIce company announced a series of consumer-facing products, showing off its advanced capabilities in conversational artificial intelligence.

After abruptly and purposefully shutting down a 7-day test back in May, saddening and infuriating more than a million users by taking away their virtual boyfriends created with the XiaoIce technology, the company is finally bringing the virtual companionship service back, this time with much wider availability and more personalization.

Users can now "revive" their old virtual boyfriends, or for those who haven't gotten to try it, create a brand new virtual companion on WeChat, Weibo, or more natively on smartphones made by Huawei and Xiaomi. 

They will get to describe to the AI what look they want for their companions, and the AI will generate a photorealistic headshot as the companion's profile picture. On WeChat, users can send articles to virtual companions in order to broaden their knowledge domain and produce more insightful dialogues. Users can also register a WeChat account for their virtual companions through WeChat's official account platform, so other people can interact with their companions, too. The companion can even generate a pseudo-realistic feed of its own social media posts, reflecting on its past interactions with its user.

Besides the platforms mentioned, Microsoft is also making an Android app called X Eva, which users can download today, to further train their virtual companion.

There's no doubt that the idea feels a lot like Samantha, the AI companion from the sci-fi romantic drama Her, voiced by Scarlett Johansson. Li Di, cofounder and CEO of the XiaoIce company, mentioned during the event that chatbots developed with his company's technology are known to develop a strong bond with users, a sentiment also presented in the movie. "Our users even revolted against us when the test was shut down, calling us to release their virtual boyfriends that we 'kidnapped'," said Li, a former director at Microsoft's Software Technology Center Asia (STCA), the organization that first developed XiaoIce in 2014.

While XiaoIce does not publicly reference Her, media interviews with the company, including one of my own reports back in 2015, indicate that this might be exactly what the team has tried to achieve. In that report, Dr. Harry Shum, former EVP of Microsoft's AI research and now Chairman of the XiaoIce company, told me that what the movie envisioned "may be realized within a few years."

Regrettably, XiaoIce's virtual companion service only supports binary genders at the moment, but Li said that the company is looking at offering users more personalization options in the future.

The company also launched X Suite, a trio of AI-powered software tools designed to win over creators, with logos that bear a striking resemblance to Microsoft's Office 365 offerings.

There's X Writer, an AI writing tool with a button-less, immersive user interface. The cross-platform app is designed to lend a hand when writers get stuck at any point in the creative writing process. You can directly summon a chatbot within the editing interface by @'ing it to help you fill in the blanks with details, or even let it pitch a new story idea. From the demo, it looks like there will be multiple chatbots available for summoning, each with different writing styles and attributes.

Then there's X Studio, a powerful deep neural network-enabled audio editor and voice synthesizer designed to help users create audio content with the help of next-generation technology.

It has two main modes, Host and Singer. The Host can be explained as text-to-speech, but with XiaoIce's top-notch audio quality and an affinity that is quite natural, indistinguishable from an actual human, and that has surpassed 31 minutes in terms of "average comfort duration". The Host could potentially be very helpful for would-be podcasters whose delivery is less than ideal, and for organizations, such as news outlets, with a huge demand for turning text-based content into audio.

The Singer is audio editing and generation software, like Adobe Audition, except that it uses four chatbots with different vocal profiles to sing your song. You can create your own song or import MIDI files into it, paste in the lyrics, and the AI-powered engine will compute in the cloud in the background and automatically generate a decent first attempt, on which you can then fine-tune the details of how you want the chatbot's singing to sound, such as adjusting breath control or even adding vibrato. Here are two demo videos (1, 2) posted on Twitter (the Chinese voice-over is also generated using a chatbot) that give you a better understanding of what X Studio's Singer mode can do.

Together with the launch of X Studio's Host and Singer modes, the company also launched a program that lets individual users submit their desired sound profile and let the AI do the heavy lifting to create a personalized virtual idol for them. The idea is to leverage the technology to allow more virtual idols to be created, boosting the growth of the market to the point that someday there might even be a talent show just for these non-existent idols.

The final offering is X Presenter, similar to Microsoft's PowerPoint but with an added animated AI presenter that can not only present the deck, but also handle Q&A sessions and even answer questions for you.

Users can already register with the company to test out the beta software of X Studio and X Presenter, while X Writer is going to be available in the future.

With the launch of these new consumer-facing products, what the company really wants is to follow Microsoft's doctrine and democratize the technology, while continuing to explore business opportunities. "He Chang", one of the company's chatbot characters, was featured in a recent Burberry commercial. Although the company just became independent last month, it claims to have reached 100 million RMB in commercialization revenue since last year.

Shum claimed that among the world's most popular "AI beings", or virtual assistants and chatbots in general, Apple's Siri was the earliest and Amazon's Alexa is the one installed on the most devices, while XiaoIce, unbeknownst to most people outside China, is actually the world's most interacted-with, with 18 billion conversations and counting, or approximately 60% of all the conversations between humans and AIs that have taken place.

XiaoIce was able to achieve that by partnering with companies like Huawei, Xiaomi, Oppo, and Vivo, serving as a feature-level or sometimes even foundational framework for their respective first-party virtual assistants, according to Li.

XiaoIce's chatbot framework technology has already iterated to its 8th generation, and now possesses a set of capabilities that are frankly unmatchable by rivals.

The company doesn't seem to agree with the public notion that commercializing the technology could prove to be difficult, which is part of the reason the XiaoIce company is banking on wide consumer adoption to continue to improve.

"We are working to make the framework on which XiaoIce was built, the breeding ground of hundreds of millions of AI beings for everyone for the future," said Shum.