On June 23, 2016, Baidu held a voice technology media briefing at the Baidu Building in Beijing. Liu Yang, senior manager of Baidu's voice technology department, technical architect Xie Yan, and a senior product manager of the Baidu voice open platform explained and demonstrated Baidu's latest advances in voice technology to the attending media and experts. A small Baidu robot and a 2016 Tucson equipped with the CarLife connected-car system also appeared on site, showing how Baidu plans to integrate voice technology across multiple terminals as a future "voice portal." In the future, more human-computer interaction will happen through sound, and any terminal can be "entered by voice."
Baidu Speech Technology: 97% recognition accuracy, hundreds of millions of requests per day
Across the three core voice technologies (speech recognition, semantic analysis, and speech synthesis), Baidu voice is not only a technology leader but also the industry's most open provider of free voice technology services. In quiet conditions, Baidu's Mandarin speech recognition accuracy has reached 97%, exceeding the level of normal human hearing. Baidu's speech synthesis also uses deep learning and can synthesize personalized, emotional speech from celebrity voice data. Baidu's semantic understanding technology supports custom adaptation in more than 56 domains.
Currently, more than 80,000 apps use Baidu voice, with more than 100 million speech recognition requests and more than 250 million speech synthesis requests per day. Users include industry heavyweights: Lenovo, ZTE, and Meizu in smartphones; Lenovo, Konka, and Sony in smart home; Tesla and BYD in the automotive industry; and HP, the three Sino, and Amy communications in smart devices.
Beyond Apple and Google: Baidu's voice technology leads internationally
At the meeting, Baidu voice demonstrated its technical strength to a number of media representatives. In speech recognition, Baidu voice accurately recognized both children's speech and dialect pronunciation. In speech synthesis, the emotional narration of novels and the synthesized star and celebrity voices were all vivid.
These results come from Baidu voice's accumulated technology. In December 2014, Baidu said it had made a major breakthrough in speech recognition, with results better than those of Google and Apple. Test results showed that against a noisy background, the error rate of Baidu's Deep Speech recognition technology was 10% lower than that of the Google voice API, wit.ai, Microsoft Bing, and Apple's voice dictation. In November 2015, Baidu's Silicon Valley lab launched a new-generation deep speech recognition system, Deep Speech 2, which the US magazine MIT Technology Review named one of its ten breakthrough technologies of 2016, the only technological achievement on the list from a Chinese technology company.
In speech synthesis, Baidu has reached an industry-leading level in both concatenative synthesis and parametric synthesis. Its concatenative synthesis builds on natural language understanding of text trained on a massive corpus and on a deeply processed professional pronunciation library. Multilevel modeling makes the prosody more robust and expressive, and an intelligent, flexible unit selection strategy finds the needed units in a large-scale recorded corpus. Because concatenative synthesis requires more resources, it is offered as an online service. Parametric synthesis draws on high-quality acoustic modeling and model compression, together with a vocoder that delivers excellent sound quality. It greatly reduces resource requirements and can produce offline synthesis close to a live human voice.
Two important voice technologies opened up: in the future, any terminal will be "entered by voice"
At the meeting, Baidu announced that it is opening up two more important voice technologies: wake-up technology and custom semantic technology. Through demonstrations with a small Baidu robot and a 2016 Tucson equipped with the CarLife connected-car system, attendees saw the strong interaction capabilities built on Baidu's speech synthesis and natural language understanding.
Baidu's wake-up technology has a wake-up rate of 95%, supports custom wake words and continuous expression (a wake word followed by a command in a single utterance), and is lightweight and easy to integrate. Whether the user says "Hello Xiaodu, play some classical music" or "Xiaodu hello, take me to a nearby gas station," the request gets a quick response. Besides connected cars, the technology can be widely applied to mobile terminals and televisions. The custom semantic feature opens up semantic mapping and voice capabilities, helping developers and third-party vendors improve recognition rates faster and more accurately.
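The "continuous expression" behavior described above, where the wake word and the command arrive in one utterance, can be illustrated with a minimal sketch. It works on an already-recognized text transcript rather than on audio, and the function name, the wake-word list, and the romanized spelling "Xiaodu" are assumptions made for illustration; none of this is Baidu's SDK or its actual wake-up method.

```python
from typing import Optional, Sequence

# Hypothetical custom wake words; a product could configure its own.
WAKE_WORDS = ["hello xiaodu", "xiaodu hello"]


def split_wake_and_command(transcript: str,
                           wake_words: Sequence[str] = WAKE_WORDS) -> Optional[str]:
    """Return the command that follows a leading wake word.

    Returns None when the utterance does not start with any wake word,
    and an empty string when the wake word is spoken on its own.
    """
    text = transcript.strip()
    lowered = text.lower()
    for wake in wake_words:
        if lowered.startswith(wake.lower()):
            rest = text[len(wake):]
            # Drop the punctuation and spaces separating wake word and command.
            return rest.lstrip(" ,.!?:;")
    return None


if __name__ == "__main__":
    for utterance in (
        "Hello Xiaodu, play some classical music",
        "Xiaodu hello, take me to a nearby gas station",
        "play some classical music",
    ):
        print(repr(utterance), "->", repr(split_wake_and_command(utterance)))
```

A real wake-up engine listens to audio continuously and decides on acoustic evidence; the sketch only shows why accepting the command in the same utterance saves the user a second turn.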
On the significance of opening up these two speech technologies for free, Baidu officials said that the capabilities are backed by Baidu's artificial intelligence and big data, and that the move is part of putting Baidu's "smart +" strategy into practice. At Baidu's 2014 conference, Robin Li predicted that within five years voice and image searches would exceed text searches. Since the second quarter of 2014, Baidu's speech input has grown more than fourfold and its speech output more than 26-fold. Whether in CarLife, the small robot, or voice features in search, takeout, and other areas, voice technology has greatly improved the product experience and made people's lives easier. Baidu believes that in the future more human-computer interaction will happen through voice, the most natural form of human communication.
Baidu voice's vision is to make everything intelligent and connected through voice. Baidu is the first in the industry to offer developers top-tier acoustic and language models based on Baidu Brain completely and permanently free of charge. Basic services are free, permanently. By opening up wake-up and custom semantic technologies, Baidu will further promote the adoption of voice interaction. In the future, any terminal will be "entered by voice."