It can feel at times like we live in a science fiction future. We hold the whole of human knowledge in palm-sized devices that are constantly connected to the Internet. We speak to our computers and they respond with seemingly intelligent feedback.
But while the hardware that powers our lives has advanced at rapid speed in the last three decades, voice assistant technology still relies heavily on the same human input that traditional software programs have relied on for much of that time.
For Alexa to become a true digital assistant, the platform – and those like it in other devices – needs to be more proactive. This is where behavioral intent prediction comes in, using machine learning to evaluate and predict user behaviors based on thousands of inputs.
The result will be a much more human-like interaction – with a device that can predict what a consumer needs and when they need it, much the same as a human assistant.
For companies, this will lead to a boom in data insights that further enhance targeting, personalization, and the likelihood of a sale.
What is Behavioral Intent Prediction?
The ability to predict the behavior of a consumer is a holy grail to many corporations. Billions of dollars are spent annually on market research, behavioral analysis, and new technologies to deliver smarter advertising to users. So, it’s no wonder we are starting to see an increase in the sophistication of our voice-activated devices – these are consumer applications after all.
Fast forward nearly forty years and developers are using a similar approach to “teach” VUIs like Alexa and Siri to learn more about their users and respond in kind. Of course, there are many challenges to doing this successfully.
The sheer volume of data that needs to be collected, cataloged, and labeled before it can be fed into the system is extensive. Amazon, for example, spends a considerable amount of money having thousands of hours of audio annotated each day to help the system better understand key elements based on the content of user speech.
How Voice Interfaces Learn to Read Intent
One of the biggest challenges with voice systems in their early iterations was how specific you needed to be. Everyone has attempted to trigger a command with their phone or Echo device and found that they did not use the right combination of words to trigger the action.
These devices have been improved substantially and now attempt to determine, from context, what the user is asking, even if the specific language that triggers a skill is not used. Colloquialisms and variations on questions allow users to ask, “What’s it like outside?” or “Should I wear my coat?” instead of specific inquiries like “What is the weather in Chicago today?”
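Mapping varied phrasings onto a single underlying intent can be sketched as a similarity search against example utterances. The intents, phrasings, and scoring method below are illustrative assumptions, not Alexa's actual implementation:

```python
# Toy intent matcher: score an utterance against example phrasings per
# intent using bag-of-words cosine similarity (hypothetical intents/data).
import math
from collections import Counter

INTENT_EXAMPLES = {
    "get_weather": [
        "what is the weather today",
        "what's it like outside",
        "should i wear my coat",
        "is it going to rain",
    ],
    "set_alarm": [
        "set an alarm for seven",
        "wake me up tomorrow morning",
    ],
}

def _vector(text):
    # Crude tokenization: lowercase and split on whitespace.
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def predict_intent(utterance):
    """Return the intent whose best example phrasing most resembles the utterance."""
    vec = _vector(utterance)
    scores = {
        intent: max(_cosine(vec, _vector(ex)) for ex in examples)
        for intent, examples in INTENT_EXAMPLES.items()
    }
    return max(scores, key=scores.get)

print(predict_intent("should I wear my coat today"))  # -> get_weather
```

Real systems replace the word-overlap score with trained language models, but the principle is the same: the user's words need only be close to a known intent, not an exact match.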
For homes that have sensors installed or for more advanced applications that interface with IoT devices, this allows for some creative implementations of voice control.
For example, you might say “Alexa, play some music,” and the device would be able to intuit which connected device to play the music on, and what volume to set based on the time of day and the typical volume you choose.
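The kind of inference described above can be approximated with simple contextual rules. The device names, hours, and volume levels here are invented for illustration, standing in for defaults a real assistant would learn from past behavior:

```python
# Hedged sketch of context-aware playback: choose a speaker and volume for
# "play some music" from the hour of day and which rooms are occupied.
def playback_settings(hour, rooms_occupied):
    """Pick a target device and volume from simple context signals."""
    # Prefer a speaker in an occupied room (hypothetical device names).
    device = "kitchen_speaker" if "kitchen" in rooms_occupied else "living_room_speaker"
    if hour >= 22 or hour < 7:   # late night: keep it quiet
        volume = 2
    elif hour < 12:              # morning routine
        volume = 5
    else:                        # daytime / evening
        volume = 7
    return device, volume

print(playback_settings(23, {"kitchen"}))  # -> ('kitchen_speaker', 2)
```

A production system would learn these thresholds per household rather than hard-coding them, but the decision structure – context in, device and volume out – is the same.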
Users need to rephrase their questions less and less often to get the response they expect, as these interfaces grow smarter and better able to evaluate intent and respond in kind.
Predictive analytics goes well beyond what a company might see in a survey or market research study, analyzing every element of a phone call, VUI interaction, or other recorded discussion. This allows developers and marketers alike to evaluate the root cause of a conversation, why someone’s mood changes during such a conversation (an invaluable resource in customer service), and much more. The result is a better user experience that adapts to the user, and more actionable data for companies.
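Detecting where a caller's mood shifts during a conversation can be sketched as turn-by-turn sentiment scoring. The word lists and scoring rule below are toy assumptions; production systems use trained models over acoustics as well as text:

```python
# Toy sketch: track sentiment across conversation turns with a tiny
# hand-made lexicon and report the first turn where mood goes negative.
POSITIVE = {"great", "thanks", "perfect", "happy", "good"}
NEGATIVE = {"frustrated", "angry", "broken", "terrible", "cancel"}

def turn_score(turn):
    """Positive-word count minus negative-word count for one turn."""
    words = turn.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def first_mood_drop(turns):
    """Index of the first turn scoring below zero, or None if none does."""
    for i, turn in enumerate(turns):
        if turn_score(turn) < 0:
            return i
    return None

call = [
    "hi thanks for taking my call",
    "my order arrived and the screen is broken",
    "i am frustrated and want to cancel",
]
print(first_mood_drop(call))  # -> 1
```

Locating that turning point is what makes the analysis actionable: it points an analyst at the exact moment in the call where the experience went wrong.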
How Emotion AI Supplements Behavioral Intent Prediction
One of the many barriers to a predictive model in voice assistant technology was the lack of context. Devices could hear commands and respond, and to some degree evaluate the specific words being spoken, but the next step requires a more advanced approach to the context of the words and how they are spoken.
Emotion AI is capable of evaluating several elements of the user beyond their words. For example, it can take into account the regional dialect of the user and the micro-cues that indicate a specific emotion, which might influence how and why they are saying something.
On a small scale, Alexa is now able to recognize when someone is whispering and whisper back – a godsend for parents trying to check the time or the weather while holding a sleeping baby.
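One signal behind whisper recognition is simply how much acoustic energy the speech carries. The threshold and framing below are assumptions for illustration; real systems use trained acoustic models rather than a single energy cutoff:

```python
# Minimal sketch of whisper detection via signal energy: whispered speech
# has much lower RMS amplitude than normal speech.
import math

def rms(samples):
    """Root-mean-square amplitude of audio samples in the range [-1, 1]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_whisper(samples, threshold=0.05):
    """Flag low-energy speech as a whisper (threshold is an assumption)."""
    return rms(samples) < threshold

# Synthetic stand-ins for loud and whispered audio.
loud = [0.4 * math.sin(i / 5) for i in range(1000)]
quiet = [0.02 * math.sin(i / 5) for i in range(1000)]
print(is_whisper(loud), is_whisper(quiet))  # -> False True
```

Once the device classifies the input as a whisper, it can route the response through a matching low-volume voice.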
Now imagine a system that could anticipate a mood entirely from how something was said and respond accordingly, not only with the right content but in a way that is tailored to those emotions.
Behavioral Intent Prediction is only one example of an emerging technology that is bridging the gap between human and machine. Advancements such as these are inevitable given the current trajectory of implementation and discovery.
Rather than be fearful or doubtful of their abilities, take this as an opportunity to explore and engage in the understanding of how technology will advance humankind.