Duplex: how Google created an AI capable of holding a natural conversation
At Google I/O 2018, the California company gave a striking demonstration of an AI able to make phone calls for you. How does it work, and what can it really do?
If you have followed the news in recent days, you can hardly have missed Google Duplex. It is probably the most prominent announcement of Google I/O, as the artificial intelligence presented at the conference seems to have leapt forward in time.
The future is now?
It is now able to communicate in natural language, so much so that it can even hold a telephone conversation with a real person. Hard to believe? Yet in the demonstration, admittedly recorded beforehand, Google Duplex managed to book an appointment at a hair salon on its own, after negotiating the date and time. The conversation, fluid and full of varied intonations, seems far removed from what we usually hear when dealing with a synthetic voice.
But how is this possible (if it is) and what can we expect to do with Google Duplex?
The challenge of voice and context
First puzzle: the voice. As Google explains, imitating natural language and behavior is not easy. A conversation is also made up of silences, interrupted or repeated sentences, assorted questions, and small "mm-hmm" sounds that signal we are listening, not to mention variations in intonation, accents, and expressions specific to each person. Modeling all of this is extremely complex.
Then comes another problem: understanding. Anyone who has used a voice assistant such as Siri knows it well: to be understood by an AI, you have to simplify your sentences. Rather than a conversation, what you end up conducting is mostly a question-and-answer exchange, and often a laborious one. It is difficult for an AI to keep pace and take context into account, even though context is the central element of a conversation between two humans.
Specializing to be convincing
In its blog post, Google explains how its engineers achieved the impressive result shown at the Google I/O keynote. First point: to interact with a hairdresser or a restaurant owner, Google Duplex is specifically trained to recognize the standard phrases, questions, and words used, along with their meaning and context of use. In other words, Duplex cannot discuss just any topic freely with this level of precision and understanding.
To offer a natural conversation within a specific topic, Duplex relies on a recurrent neural network (RNN) that was fed sets of anonymized phone conversations. This data was used for its training and allowed it to understand the meaning of words according to a given context.
During training, what the human interlocutors said was run through automatic speech recognition (ASR) technology. The resulting data was then passed to the neural network together with information from the audio track, the conversation history, and other parameters of the conversation.
In the end, for each domain (hairdressing, pedicure, etc.), Google created a separate understanding model and then combined it with the common elements learned across all sectors.
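To make the idea of per-domain models built on shared material more concrete, here is a minimal Python sketch of how the training data might be organized. It is only an illustration: the names `TurnInput` and `build_training_sets`, the field names, and the sample sentences are all hypothetical, and nothing here reflects Google's actual code or data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TurnInput:
    """One conversational turn as it could be presented to the network (hypothetical)."""
    asr_text: str                      # text produced by speech recognition
    audio_features: List[float]        # numeric features taken from the audio track
    history: List[str] = field(default_factory=list)       # previous turns
    params: Dict[str, str] = field(default_factory=dict)   # e.g. requested service


def build_training_sets(
    shared_examples: List[TurnInput],
    domain_examples: Dict[str, List[TurnInput]],
) -> Dict[str, List[TurnInput]]:
    """Give each domain its own training set: the shared examples plus its own."""
    return {
        domain: list(shared_examples) + list(examples)
        for domain, examples in domain_examples.items()
    }


# Invented sample data, for illustration only.
shared = [TurnInput("hello, how can I help you?", [0.1, 0.4])]
by_domain = {
    "hair_salon": [
        TurnInput("we have a slot at noon", [0.3, 0.2], ["hello"], {"service": "haircut"})
    ],
    "restaurant": [
        TurnInput("for how many people?", [0.2, 0.5], ["hello"], {"party_size": "4"})
    ],
}
training_sets = build_training_sets(shared, by_domain)
```

Each domain ends up with its own set of examples, while the greetings and other phrases common to every call are reused everywhere.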
Two tools for one voice
Once the AI has understood what it has been told, it must respond. This is where two speech synthesis tools come in. The first is a concatenative text-to-speech (TTS) engine, in which text is read aloud by a synthetic voice. The second is also a TTS engine, made up of two components that Google presented last December: Tacotron 2 and WaveNet. Schematically, the first utters the sentence and the second controls the intonation according to the circumstances.
Finally, to make the conversation even more natural, Google has built in disfluencies, those little "hmm" sounds that we let slip without realizing it to show that we are still listening. These small signals are conventions that keep the other person from wondering whether we are still on the line. What could be more human than this little bit of laziness?
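As a rough illustration of the principle, the Python sketch below occasionally places a filler word in front of a reply before it is sent to speech synthesis. It is a toy example under stated assumptions: the function `add_disfluency`, the list of fillers, and the probability value are invented here and are not how Duplex actually decides where to hesitate.

```python
import random

# Hypothetical filler words; the real system's inventory is not public.
FILLERS = ("hmm", "uh", "mm-hmm")


def _decapitalize(sentence: str) -> str:
    """Lower-case the first letter, except for the pronoun "I"."""
    first_word = sentence.split(" ", 1)[0]
    if first_word == "I" or first_word.startswith("I'"):
        return sentence
    return sentence[0].lower() + sentence[1:]


def add_disfluency(sentence: str, rng: random.Random, probability: float = 0.3) -> str:
    """Prefix the sentence with a filler word, with the given probability."""
    if not sentence or rng.random() >= probability:
        return sentence
    filler = rng.choice(FILLERS)
    return f"{filler.capitalize()}, {_decapitalize(sentence)}"


rng = random.Random(0)  # seeded so that runs are repeatable
always = add_disfluency("Do you have anything at noon?", rng, probability=1.0)
never = add_disfluency("Do you have anything at noon?", rng, probability=0.0)
```

With a probability of 1.0 the reply always begins with a hesitation, and with 0.0 it is left untouched. A real system would presumably choose the moment from context instead of drawing at random.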
The experience will improve and become richer
Reassuringly, if Google Duplex finds itself unable to handle a complex request, it signals a human operator to take over. That should be enough to avoid blunders and embarrassing situations!
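One simple way to picture this handoff is a routing rule that checks how sure the system is about what it heard. The Python sketch below assumes a confidence score and a threshold, neither of which is described by Google; `Interpretation`, `route_turn`, and the value 0.6 are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """What the system believes the other person meant (hypothetical structure)."""
    intent: str        # e.g. "propose_time", or "unknown" when nothing matched
    confidence: float  # assumed score between 0.0 and 1.0


def route_turn(interpretation: Interpretation, min_confidence: float = 0.6) -> str:
    """Decide whether the AI carries on or a human operator takes over."""
    if interpretation.intent == "unknown" or interpretation.confidence < min_confidence:
        return "handoff_to_operator"
    return "continue_automated"


easy = route_turn(Interpretation("propose_time", 0.92))
hard = route_turn(Interpretation("unknown", 0.85))
unsure = route_turn(Interpretation("propose_time", 0.31))
```

Here only the first turn stays with the AI. The other two are passed to the operator, one because the request was not recognized and the other because the score is too low.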
Over time, however, Duplex should progress and expand its area of expertise. To train the system in a new domain, human trainers monitor its progress as new data sets are submitted in real time. Once a satisfactory level of quality is reached, the human monitoring stops and Google Duplex handles conversations in this new domain autonomously.
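The transition from supervised to autonomous operation can be pictured as a gate that opens once recent calls are good enough. The Python sketch below is an assumption-laden illustration: the class `SupervisionGate`, the rolling window, and the success-rate threshold are invented for this example, since the source does not say how Google measures a "satisfactory level of quality".

```python
from collections import deque


class SupervisionGate:
    """Tracks recent supervised calls and decides when supervision can stop."""

    def __init__(self, window: int = 50, required_success_rate: float = 0.95):
        self.results = deque(maxlen=window)  # keeps only the most recent outcomes
        self.required_success_rate = required_success_rate
        self.autonomous = False

    def record(self, success: bool) -> bool:
        """Record one call outcome and return True once the domain is autonomous."""
        self.results.append(success)
        window_full = len(self.results) == self.results.maxlen
        if window_full:
            rate = sum(self.results) / len(self.results)
            if rate >= self.required_success_rate:
                self.autonomous = True  # stays True once reached in this sketch
        return self.autonomous


gate = SupervisionGate(window=4, required_success_rate=0.75)
states = [gate.record(outcome) for outcome in (True, True, False, True)]
```

In this run the gate stays closed for the first three calls because the window is not yet full, then opens on the fourth, when three of the last four calls succeeded.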
Google Duplex is still experimental. Nevertheless, CNET had the privilege of testing it exclusively before Google I/O and reports that Google intends to roll it out gradually. The goal is to offer an artificial intelligence that assists users throughout the day, without fail. That is why integration with Google Assistant should proceed smoothly, starting next summer.
Given the difficulty and many nuances of the French language, it is unlikely that this service will arrive in France any time soon. Whether in English or French, we are waiting to test this feature, or to see it at work, before we really believe it.