How Siri works in detail: local speech recognition plus cloud computing services
Introduction: An article published today on the ZDNet website says that Apple's voice assistant Siri is stylish and smart, and in some cases genuinely useful. So how does Siri work? "Speech recognition" is the core of Siri, but that alone does not explain the detailed steps by which Siri understands what the user says. This week an industry author explained the steps Siri goes through.
The text of the article follows:
After a user speaks to the iPhone, the voice is immediately encoded into a compressed digital file that contains all the useful information. This file is sent over the network of the user's Internet service provider (ISP) to a server in the cloud, where a module on the server identifies what the user said.
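The encoding step above can be sketched as follows. This is only an illustration: the article does not say which codec Siri uses, so lossless zlib compression stands in for it here, and the function names and sample values are hypothetical.

```python
import struct
import zlib


def encode_utterance(samples):
    """Pack 16-bit PCM samples into bytes and compress them losslessly."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return zlib.compress(raw)


def decode_utterance(payload):
    """Reverse encode_utterance: decompress and unpack the samples."""
    raw = zlib.decompress(payload)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


# Made-up audio samples; a real recording would come from the microphone.
samples = [0, 1200, -1200, 300] * 400
# The compressed payload is what would be sent to the server.
payload = encode_utterance(samples)
```

A real speech codec would most likely be lossy and tuned for voice; the point here is only that the phone ships a compact file rather than raw audio.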
Meanwhile, the user's voice is also recognized on the phone itself. The speech recognizer installed on the phone communicates with the cloud server to work out whether the instruction is suitable for local processing. Some instructions may ask the phone to play a song, for example, while others may require the phone to connect to the network for further help. If the speech recognizer decides that a module on the phone can handle the instruction, it notifies the cloud server that the server's support is not needed.
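The local-or-cloud decision described above might look roughly like this. The intent names and the set of locally supported intents are assumptions made for the example; the article only mentions playing a song as a locally handled instruction.

```python
# Hypothetical set of intents the phone can handle without the server.
LOCAL_INTENTS = {"play_song"}


def route(intent):
    """Return 'local' if the phone can handle the intent, else 'cloud'."""
    return "local" if intent in LOCAL_INTENTS else "cloud"


def handle(intent):
    """Decide where the intent runs and whether the server is released."""
    target = route(intent)
    # When the phone handles the request itself, it tells the server
    # that no further support is needed.
    return {"handled_by": target, "server_released": target == "local"}
```

For example, `handle("play_song")` stays on the phone and releases the server, while a hypothetical `handle("web_search")` is passed on to the cloud.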
Based on the sounds in the user's speech and the order in which they occur, the server compares the speech against a statistical model to estimate which letters it contains. At the same time, the local speech recognizer runs its own statistical comparison on the user's voice. On both the server side and the phone side, the recognition result with the highest likelihood is given priority.
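A minimal sketch of picking the highest-likelihood result on each side is shown below. The `Hypothesis` type, the phoneme labels, and the likelihood values are all made up for illustration; the article does not describe how Siri represents or scores its estimates.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    source: str        # "server" or "phone"
    phonemes: tuple    # estimated sequence of sounds
    likelihood: float  # score from the statistical comparison


def best_per_side(hypotheses):
    """Keep the highest-likelihood hypothesis for each source."""
    best = {}
    for h in hypotheses:
        current = best.get(h.source)
        if current is None or h.likelihood > current.likelihood:
            best[h.source] = h
    return best


hypotheses = [
    Hypothesis("server", ("S", "EH", "N", "D"), 0.82),
    Hypothesis("server", ("S", "AE", "N", "D"), 0.11),
    Hypothesis("phone", ("S", "EH", "N", "D"), 0.64),
    Hypothesis("phone", ("S", "EH", "N", "T"), 0.21),
]
winners = best_per_side(hypotheses)
```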
At this point, the recognized content consists of a series of vowels and consonants. It is then passed to a language model, which estimates which words the user's speech contains. Depending on the level of confidence, the computer creates a list of candidate interpretations of what the user said.
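The candidate list described above can be illustrated with a toy language model. The word probabilities, the unigram scoring, and the `min_confidence` cutoff are assumptions chosen for the example, not details from the article.

```python
# Made-up word probabilities standing in for a trained language model.
WORD_PROB = {"send": 0.6, "sand": 0.1, "text": 0.5, "taxed": 0.05}


def rank_candidates(candidates, min_confidence=0.1):
    """Score word sequences and return them sorted by confidence.

    candidates: list of tuples of words.
    Returns a list of (words, confidence) pairs, highest first,
    keeping only those at or above min_confidence.
    """
    scored = []
    for words in candidates:
        score = 1.0
        for word in words:
            # Unknown words get a small default probability.
            score *= WORD_PROB.get(word, 0.01)
        scored.append((words, score))
    total = sum(score for _, score in scored)
    # Normalize so the confidences of all candidates sum to 1.
    ranked = [(words, score / total) for words, score in scored]
    ranked = [item for item in ranked if item[1] >= min_confidence]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


candidates = [
    ("send", "text"),
    ("sand", "text"),
    ("send", "taxed"),
    ("sand", "taxed"),
]
ranked = rank_candidates(candidates)
```

With these made-up numbers, "send text" ends up at the top of the list and the least plausible readings are dropped.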
If the result is judged to have sufficient confidence, the computer can work out what the user wants, for example to send a text message or to find a contact in the contact list. The user then sees the required content appear on the phone's screen, without having to do anything by hand. If at any point in this process the meaning of the user's speech is too ambiguous, the computer asks the user, for example whether the contact they want is Erica Olssen or Erica Schmidt. (Victoria gold)
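The final disambiguation step can be sketched as below. The contact list and the word-matching rule are assumptions for illustration (the contact "John Park" is invented); the article only says that the computer asks the user when the meaning is too ambiguous.

```python
# Hypothetical contact list; the two Ericas come from the article's example.
CONTACTS = ["Erica Olssen", "Erica Schmidt", "John Park"]


def resolve_contact(spoken_name, contacts):
    """Find the contact the user meant, or ask when it is ambiguous."""
    spoken = spoken_name.lower().split()
    # A contact matches if every spoken word is one of its name parts.
    matches = [
        contact for contact in contacts
        if all(word in contact.lower().split() for word in spoken)
    ]
    if len(matches) == 1:
        return {"action": "use", "contact": matches[0]}
    if len(matches) > 1:
        question = "Did you mean " + " or ".join(matches) + "?"
        return {"action": "ask", "question": question}
    return {"action": "not_found"}
```

Saying only "Erica" matches two contacts, so the sketch returns a question for the user, while "Erica Olssen" resolves directly.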