Can Emotional Analytics Help Businesses Get in Tune with How their Customers Are Really Feeling?
“I’ll be your savior, steadfast and true…I’ll come to your emotional rescue.” - The Rolling Stones
Anger. Fear. Disgust. Happiness. Sadness. Surprise. Contempt. These are defined as the seven universal emotions (some lists have 10, 21, or even more, but we’ll stick to the basics). While even the most inexperienced agent can detect some emotions in a phone interaction, and a more seasoned rep can perhaps interpret more subtle responses, there is often a sizable gap between what front-line personnel hear and what a customer is feeling. Since emotions drive behavior, this lack of understanding can negatively impact the bottom line via increased customer churn, diminished loyalty and missed sales opportunities.
As is the case with many other imperfect business processes, developers are riding in on their technological white horses to become the knights in shining armor who can rectify the problem. Measuring customer sentiment through emotion recognition is taking center stage as a method of managing the customer experience. Emotion detection, or emotion analytics as it is more widely known, is a relatively recent but burgeoning field that analyzes people’s verbal and non-verbal communication to better comprehend their mood or attitude. The resulting data can be used to calibrate CX by identifying how customers perceive a company and its products and services during interactions with a company representative.
Emotion analytics solutions have the capability to extract insights from all customer touchpoints and channels across the organization, including calls, texts, video, facial expressions, emails, chats, and social media platforms. These products employ historical data and real-time information to identify customer patterns and trends, providing the background for an agent to tweak the dialogue over the course of the call. Together, these data help companies determine the right offers to extend to retain customers, thereby reducing escalations and churn. This makes emotion analytics a weapon of mass instruction for businesses seeking to gain an edge by learning what makes their customers tick.
According to MarketsandMarkets, the global emotion analytics market is expected to grow from USD 2.2 billion in 2019 to USD 4.6 billion by 2024, at a Compound Annual Growth Rate (CAGR) of 15.8% during 2019–2024. The need for higher customer satisfaction, the rising significance of real-time emotion analytics, adherence to regulatory and compliance standards, and the increasing need for emotion analytics software and services to cater to the growing Business Process Outsourcing (BPO) sector are the major factors driving the market.
Alexandros Potamianos, Co-founder and CTO of emotional analytics technology provider Behavioral Signals, believes that this technology will be adopted over the next few years by the major players. “Even with its imperfections, speech-to-text transcription was an important breakthrough. We believe that now everything associated with the tone of voice and how a customer speaks will be the next big breakthrough. It will help businesses better understand intent; not just what the customer is saying but what he really means.”
Potamianos believes that the biggest challenge will be continuing to use emotional intelligence to bridge what he terms “the semantic gap” between what machines and humans can comprehend. “Right now, machines are pretty good at understanding what is being said but we are striving to improve their capability to determine what the user really wants. They need to understand the tone of the voice and the emotions behind it, the personality of the user.” He sees this going hand in hand with machines being able to generate the appropriate reply and converse with the user to meet their needs.
One technology sector that has a head start in emotion recognition is speech analytics, which introduced speech-to-text transcription. “We’ve done a great deal of research around capturing emotion in conversation,” noted Jeff Gallino, CTO and Founder of long-time speech analytics leader CallMiner. “We firmly believe that emotional expression goes beyond simply the words being said but is a combination of factors. While there is no good emotional measurement without the underlying context of the words, there is so much more rich communication when you can overlay the tonality that accompanies them. Sentiment…which we’ve been measuring for a pretty long time…is the verbiage of emotion for us. It helps you to understand how a customer feels about a product, service or interaction. It enables an enterprise company to coach an agent to better respond to those needs.”
“A classic example is if a customer is asking about a product and their description is negative,” continued Gallino. “He says, ‘You know, I really need a phone but while I don’t like those things, I need this functionality.’ The initial part of the statement is a sentiment on top of a product. The way the customer on the phone benefits is that we can train the agent on how to show empathy, on how to better respond, perhaps by providing education. There are a lot of different kinds of responses that can be used for that statement. The agent could have the same kind of call the very next day where the customer is enthusiastic: ‘I love those things.’ The agent will treat that caller differently, which has nothing to do with the product. It’s that measuring the sentiment expressed allows us to better serve that particular customer.”
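To make the idea concrete, here is a minimal, self-contained sketch of lexicon-based sentiment scoring on an utterance like the one Gallino describes. The tiny lexicon and the scoring rule are illustrative assumptions for this example only; they are not CallMiner’s actual model.

```python
# Minimal sketch: crude lexicon-based sentiment scoring of a customer utterance.
# The word lists below are hypothetical and far smaller than any real lexicon.

NEGATIVE = {"don't like", "hate", "frustrated", "annoyed"}
POSITIVE = {"love", "great", "happy", "excellent"}

def sentiment_score(utterance: str) -> int:
    """Return a crude score: +1 per positive cue found, -1 per negative cue."""
    text = utterance.lower()
    score = sum(1 for phrase in POSITIVE if phrase in text)
    score -= sum(1 for phrase in NEGATIVE if phrase in text)
    return score

customer = ("You know, I really need a phone but while I don't like those things, "
            "I need this functionality.")
print(sentiment_score(customer))  # -1: negative sentiment layered on top of a product need
```

A real system would of course combine a far richer lexicon or learned model with the acoustic signals discussed below, but the point stands: the same product mention scores differently depending on the sentiment words around it, and that difference drives how the agent should respond.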
Gallino sees an added benefit of emotion analytics in that training agents to respond more effectively makes them happier, more comfortable and less likely to leave their positions. “People tend to forget the agent on the other side of the call is a human being as well,” he said. “They are not automatons. If you’re getting yelled at all day, it’s important to teach coping skills. Not only do sentiment words matter in what are often highly confrontational conversations; it’s also how the agent says them and how the customer responds.” He feels this is particularly important since voice continues to outpace other growing communication channels, such as digital, chat and email. “Consumers are starting to realize that companies now care more about the customer experience and they are contacting them more often to give businesses a chance to prove themselves.”
One company making headway in offering real-time conversational and behavioral guidance to agents is Cogito. It was created in 2007 by Dr. Alex “Sandy” Pentland, an MIT professor and one of the co-founders of the university’s Media Lab, in conjunction with the current Cogito CEO, Joshua Feast. (Pentland now serves as director of the Human Dynamics Lab at MIT.) “Sandy had done a huge amount of research on human social signaling mechanisms: how people communicate and interpret each other’s behavior,” said Steve Kraus, Vice President of Marketing for Cogito. “The company was founded on the idea that you could build technology that could capture these mechanisms, which he referred to as ‘honest’ signals. These were based on measuring engagement, taking into account a person’s pitch and tone, behaviors such as mimicry, and how a person is reacting, such as agreeing by saying ‘uh-huh, uh-huh’ repeatedly (known as backchanneling) or expressing disagreement by overlapping in the conversation.” This led to developing the original software: a platform that not only captured the signals but could teach machines to understand them.

“In the early days, it was much more clinical,” noted Kraus. “It was used in the healthcare industry to help understand human behavior in such conditions as bipolar disorder, which gave it a strong foundation. Cogito was able to build up millions of data points on human behavior through research that we were able to do partially with funding from DARPA (Defense Advanced Research Projects Agency). Then, after about five years, we were able to transition to doing work with clinicians and case workers who were on the phone with patients, which led us to the idea that the platform would be a great tool for phone-based work. We were able to train the models on phone conversations.” The company expanded from handling healthcare calls to health insurance and has since moved into financial services, telco, retail and more. “You have to have great functionality, a solid business case and the right timing…now people have come around to the belief that AI can be the next wave and help companies do things they never did before.”
How does the technology work? According to CallMiner’s Gallino, there isn’t a one-size-fits-all approach to measuring emotion acoustically. “There are basic acoustic measurements we’ve used in our business for 15 years and we’re confident they work: amount of silence, energy, overtalk, syllabic rate. These are the characteristics of speech but not necessarily the content of speech. In advanced acoustics, you can measure elements such as micro-tremor analysis, which shows stress, agitation and baseline changes. One of the things we’ve discovered in our research is that some people always sound upset and others always sound happy. What we’ve found is that it’s not as important to measure the raw numbers as the changing numbers. When we measure an unhappy customer’s call and compare their baseline from the first half to the back half, and we see the numbers decrease as well as the content (the words the customer is using) moving toward more satisfactory behavior, we can determine that the conversation got better as it went along. Of course, it can go the other way: when someone starts out calm and the baseline numbers double or triple, the opposite conclusion can be reached. In these cases, the words themselves become extremely important.”

Gallino believes he can fool prosodic engines (another measurement technique that relies on mathematical analysis of tonality) by making negative comments with a smile on his face, which is why the words do matter. That’s why he feels this decades-old measurement, which he considers a rudimentary form of machine learning, has largely been debunked. “Mathematically, it’s a coin flip whether prosodics is right.” He believes that a great number of examples are needed, along with training the machine, to accurately determine what it sounds like when someone is upset, hesitant or happy. “It’s teaching the machine rather than trying to program it. We’ve found all of the techniques have value and it is not beneficial to use just one method, as some companies do in isolating their solutions to strictly the machine learning aspect.”
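The baseline-shift idea Gallino describes can be illustrated with a short sketch: compute simple acoustic measurements for the first and second halves of a call and compare them. The feature choices here (frame energy and a silence ratio) and the thresholds are assumptions made for illustration, not CallMiner’s production metrics.

```python
# Sketch: compare simple acoustic baselines between the first and second half of a call.
import numpy as np

def frame_energy(samples: np.ndarray, frame_len: int = 400) -> np.ndarray:
    """Root-mean-square energy per fixed-length frame."""
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt((frames ** 2).mean(axis=1))

def half_call_baselines(samples: np.ndarray, silence_thresh: float = 0.01):
    """Mean energy and silence ratio for the first vs. second half of the audio."""
    mid = len(samples) // 2
    stats = []
    for half in (samples[:mid], samples[mid:]):
        energy = frame_energy(half)
        stats.append({
            "mean_energy": float(energy.mean()),
            "silence_ratio": float((energy < silence_thresh).mean()),
        })
    return stats

# Stand-in waveform; in practice this would be the decoded call audio at 16 kHz.
audio = np.random.randn(16000 * 60) * 0.05
first_half, second_half = half_call_baselines(audio)
print(first_half, second_half)
# A drop in mean_energy (or a rise in silence_ratio) from the first half to the second
# can suggest the caller is calming down; a sharp rise suggests escalation.
```

As Gallino notes, the raw numbers matter less than the change against each caller’s own baseline, and any acoustic reading still needs to be cross-checked against the words being said.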
Behavioral Signals’ Potamianos feels that a lot of the progress in emotional analytics is being fueled by improvements in Natural Language Processing (NLP). “The main challenge is being able to understand in context: Google and Amazon Alexa are at the forefront of this arena, competing to offer more meaningful conversations with personal assistants as well as understanding other information about the world we live in.” He believes the next breakthrough in this area will come sometime over the next three to five years, providing “the ability to understand emotions, personality, social roles…comprehending the fine nuances of conversation. Ultimately, we will have machines that have emotional and social intelligence.” Potamianos believes that it’s not enough to have just the emotion detection part of the technology; the machine must also model the dynamics so it can respond in an emotionally and socially appropriate fashion, though he admits that is further up the road. He sees today’s bot technology as very similar to conversations with Alexa or Google: very limited and very specific to solving common problems. “We’re going to see bots that can respond to more than the low-end contact center tasks they are currently handling, evolving to more complex conversations and developing some emotional AI…an ability to understand or actually empathize with the user. We will see capabilities…sometimes superhuman capabilities…in detecting intent or disposition or satisfaction.” He feels it is important for this technology to be thoroughly perfected before being rolled out to avoid the risk of creating negative user experiences.
In many respects, Behavioral Signals’ process is not unlike that of a traditional speech analytics product: it records audio, breaks it up into chunks to see what is being said when, passes it through a voice activity detector to determine where the voices of the agent and the customer fall, and then feeds the chunks of audio to an emotional learning platform that provides information about the speaking style of both parties as well as emotional behaviors. Is the customer angry? Is the agent responding appropriately? Is someone being impolite? Are the customer and agent engaged? According to Potamianos, the outputs will be not just verbatim words but emotions and behaviors that can be used to shape KPIs that are relevant for contact centers, improve performance and determine how well the agent and the customer are matching. “A big part of it is how you train the model,” he said. “And these models have been trained to be as universal as possible. The big trick is selecting and collecting the most appropriate data, labelling it and using it to train the models.” Behavioral Signals’ main thrust at this point is to integrate with speech analytics solutions. Most of its customer base is what Potamianos describes as “major players” in that space, using its API to build alerts that help agents determine how they are doing on a call.
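As a rough illustration of that pipeline, the sketch below chains hypothetical stubs for voice activity detection, speaker attribution, and per-chunk emotion scoring into call-level KPIs. Every function name, score, and data structure here is invented for the example; Behavioral Signals’ actual models and API will differ.

```python
# Sketch: audio chunks -> speaker-attributed segments -> emotion scores -> call KPIs.
from dataclasses import dataclass

@dataclass
class Chunk:
    start: float   # seconds from call start
    end: float
    speaker: str   # "agent" or "customer"

def detect_speech(audio) -> list:
    """Hypothetical VAD + speaker attribution: return speech chunks with speaker labels."""
    return [Chunk(0.0, 4.2, "customer"), Chunk(4.5, 9.0, "agent")]

def score_emotions(chunk: Chunk, audio) -> dict:
    """Hypothetical emotion model: return behavior scores for one chunk."""
    return {"anger": 0.1, "engagement": 0.8, "politeness": 0.9}

def call_kpis(audio) -> dict:
    """Aggregate per-chunk scores into example call-level KPIs."""
    chunks = detect_speech(audio)
    scores = [score_emotions(c, audio) for c in chunks]
    customer_anger = [s["anger"] for c, s in zip(chunks, scores) if c.speaker == "customer"]
    agent_engagement = [s["engagement"] for c, s in zip(chunks, scores) if c.speaker == "agent"]
    return {
        "peak_customer_anger": max(customer_anger, default=0.0),
        "mean_agent_engagement": sum(agent_engagement) / max(1, len(agent_engagement)),
    }

print(call_kpis(audio=None))
```

In a real deployment the stubs would be replaced by trained models, and, as Potamianos stresses, the hard part is collecting and labelling the right data to train them.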
Cogito’s Kraus stresses that the ultimate benefit the company’s solution delivers is using AI to augment contact center personnel. “We have a closed-loop system that enables us to measure how well the conversation is going in real time and what the deep behaviors are that are happening in it,” he said. “And then in the call, we’ll give a nudge to the agent at the appropriate time: tell them they are perhaps speaking too quickly or lack energy, or that the customer is getting emotional. This is intended to change the behavior of the agent within the call. We can measure within our closed-loop system to see if it is having the desired effect and use the data to continue to train the model. The more we deploy, the better we can get.” He also noted that the results generated by the machines are reviewed by an internal team of human annotators backed by behavioral scientists, including PhDs and psychologists. The company can track results within the contact center to correlate outcomes to behaviors and see if they are positively impacting handle times, first contact resolution or close rates in sales environments. While benefits multiply over time as more is learned, Kraus maintains the solution’s innate intelligence makes it valuable right out of the box. He also acknowledged that emotional analytics is far more helpful to a business if the right contextual information is accessible to the agent.
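The real-time nudge loop Kraus describes can be boiled down to something like the following sketch, where rolling conversational measurements are checked against simple thresholds. The signal names and cutoffs are illustrative assumptions, not Cogito’s actual rules or product logic.

```python
# Sketch: map rolling conversational measurements to a coaching "nudge" for the agent.
from typing import Optional

def nudge(agent_words_per_min: float, agent_energy: float, customer_arousal: float) -> Optional[str]:
    """Return a short coaching hint, or None if the conversation looks fine.
    All thresholds below are hypothetical."""
    if agent_words_per_min > 180:
        return "Speaking too quickly; slow down."
    if agent_energy < 0.2:
        return "Low energy; bring more warmth to the call."
    if customer_arousal > 0.8:
        return "Customer is getting emotional; acknowledge and show empathy."
    return None

# Example: an agent racing through the script triggers the first nudge.
print(nudge(agent_words_per_min=195, agent_energy=0.5, customer_arousal=0.3))
```

The closed loop comes from measuring whether the agent’s behavior changes after the nudge and feeding that outcome back into model training, which is where the thresholds in a real system would come from rather than being hand-picked.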
Potamianos sees emotional analytics as a “somewhat disruptive” technology that still might not appeal to more traditional companies but is of interest to younger, more agile businesses. “It’s still not at the top of the list for older companies; they have their own solutions.” He sees the technology as a work in progress; emotions are hard to detect. He estimates the technology is still 10 to 20% less accurate than a human in detecting emotions, with a goal of matching human performance within nine months. Kraus questions whether machines will ever be able to do as good a job as humans in that area. “Humans can’t agree on emotions: one person might not see what I view as happiness as actually being happiness. Machines might get better at detecting the proxies of emotion, but I don’t think they will ever get smart enough to determine emotions as well as we can. While solutions are getting better, we still rely on the human in the loop to take the information and make a final judgment. We deliver the guidance to avoid negative behavior by agents, but they still have to take the right actions on their own.”
Thank you to Behavioral Signals, CallMiner, and Cogito for their responses.
Guest blog by Sheri Greenhaus, CrmXchange, https://www.crmxchange.com