Automation has progressed quickly from a hot new QA topic to a must-have for ambitious teams. Customer expectations are evolving as fast as technology, and advances in AI and automation help your service team keep up.
So, let’s talk AutoQA and how you can be doing more with less.
What is AutoQA?
Manually reviewing conversations to find problem areas is like sweeping a beach with a metal detector for treasure: slow and tiring, with no guarantee of a result. With AutoQA, you already have a map showing where the treasure (or the problem areas) is buried. When you can jump to it directly, you’ll have far more energy left to strategize for improvement.
💡 Pay attention to how your QA software provider defines AutoQA, though. These days, most quality assurance tools offer automation, but not many can actually score your support conversations automatically.
Automated vs. manual QA
Let’s talk about the division of labor and define the roles of automation, AI, and the human (hello, you) in customer service quality assurance.
AutoQA reduces labor and helps you improve your customer service quality faster than you’ve ever been able to before.
However, manual reviews are still vital in the customer service QA process. The human in the loop (your QA specialist) is what turns the enhancements of automation and AI into real improvements.
We can focus on quickly reaching a broad and international audience by using multilingual pre-trained large language models. This allows us to, for example, conduct sentiment analysis and named entity recognition. Through this, we added support for two new languages (Polish and Portuguese) in the space of just a few weeks.
Relying on language models enables AI to achieve things at a speed that seemed impossible just 5 years ago.
In addition to ML models, we focus on AI knowledge interpretability. Not everyone is trained in data interpretation – we know that! It’s our job to help our customers make sense of vast amounts of data, so we distill complex AI insights into informative, concise dashboards and graphs.
The ‘human’ in the feedback loop is irreplaceable — digging into nuanced situations, constructing feedback sessions, in-depth analysis, etc. But the hardest part of the feedback loop is also human behavior.
For this reason, relying on manual reviews alone carries real risks:
- Limited scope
On average, only 2% of conversations are reviewed manually.
- Reviewer bias
Reviewers may have unconscious biases that can affect their evaluations, leading to unfair assessments and decisions.
- Lack of scalability
Manual reviews are not easily scalable, especially as the volume of customer interactions grows, which can lead to delays in providing feedback and improving customer service.
- Human error
This is inevitable, and leads to inaccurate assessments and bad data.
- Inconsistent standards
Different reviewers may have different standards or interpretations of what constitutes good customer service.
- Time and cost
Manual reviews require a lot of time and effort, more so as your ticket volume swells. This, of course, has a knock-on effect on your bottom line.
There is no absolute binary between automated and manual QA. With the right software, you can master both.
The power of the bigger picture
Support teams often handle hundreds of conversations daily. A large proportion of your conversations are very straightforward — they just don’t contain enough nuance to dive into manually.
For example, one in which there is a quick back-and-forth between agent and customer, solving a common problem, tells you nothing new about performance and processes. Highlighting a conversation like this for review, and rating it on a scorecard with ten categories, is a waste of time.
At Klaus, we understand that businesses may have specific requirements and nuances that they want to measure within each category. While our solution may not capture every single detail of what a company wants to measure in each category, we’ve designed it to be low-risk and non-penalizing.
We focus on accurately capturing what is present – what we know how to catch. We do not penalize agents or reduce their scores for something our system may not have accurately captured.
AutoQA means full coverage for analyzing trends. It also means that you can increase your reviewing capacity by 50x and cover all your QA bases for every single one of your conversations.
Without having to open a single interaction, you can understand the breadth of both overall customer sentiment and team performance — regardless of ticket volume.
- Every support interaction can be processed by Klaus’ proprietary ML engine for instant understanding of your support landscape.
- Achieve 100% coverage by automatically scoring every agent and support interaction across multiple categories and languages.
- Klaus acts less like an assistant and more like a coworker. There is no model training required with our plug & play solution. Simply step in and get to work. Read more about how Klaus AutoQA works.
Finding the signal among the noise
Getting every single item into the system is step one.
There are still golden tickets of caramel goodness for quality improvement that require human eyes, so it’s important not to let them sit buried in the pile.
It is crucial that QA specialists are at hand to scrutinize certain conversations for several reasons:
- They help identify specific areas of improvement and facilitate targeted training or support to help agents improve performance.
- They can better analyze customer sentiment and address issues that are negatively impacting customer satisfaction – whether those problems lie in customer service or need to be communicated to other departments.
- QA specialists can also identify potential risks or compliance issues in customer interactions.
- They also deliver constructive feedback using AutoQA as evidence of performance.
Spotlight is a unique conversation discovery feature that automatically samples the conversations critical for review. These are selected through a multilevel statistical analysis of your own communications metadata. In other words, based on criteria AI has customized for your support team.
By reviewing conversations highlighted by Spotlight, you can ensure your QA efforts are focused on conversations that are critical to review and contain the most influential learning moments.
You definitely want to see context when, for example, emotions run high and the conversation has strayed from the original problem. A machine can tell you that this conversation went longer than average. But that could mean the agent did not control the conversation, or, on the contrary, that they managed to calm down an upset customer and bring them back onto a path to resolution.
Let the machine spot the conversations that need human review, and then let a human reviewer give feedback that makes sense.
The scalability of combining automated and manual reviews
While manual reviews can dive deep into nuance, they are also time-consuming and not easily scalable.
On the other hand, automated reviews can analyze a large volume of interactions quickly, but may not be able to capture the intricacies and subtleties of human communication.
We have officially entered the era in which humans work best alongside machines (and, really, vice versa). There are five key principles that, when followed, aid optimal human-machine collaboration:
- Reimagining business processes
- Embracing experimentation and employee involvement
- Actively directing an AI strategy
- Responsibly collecting data
- Redesigning work to incorporate AI and cultivate employee skills.
AutoQA taps into each and every one of these principles when paired with a QA specialist who has the eye to pluck out critical conversations and the knowledge to put the resulting analytical data into action.
By combining both methods, companies can leverage the benefits of both approaches. Automated reviews can quickly identify issues, while manual reviews can provide a more detailed analysis of interactions and identify areas for improvement. This approach provides an efficient and scalable way to ensure high-quality customer service across a large volume of interactions.
And ultimately, an easier, smarter way to keep customers happy as you scale.
Auto-scoring: What AutoQA categories are there?
It’s time to educate you on the nuances of AutoQA. Remember — not every QA tool can score support conversations automatically. In some cases, AutoQA automatically generates conversation snippets to give you an overview of what’s happened or comes down to customer sentiment analysis.
Broken down by category, let’s get into what an automated scorecard looks like in Klaus, and how each score is reached.
Solution
In customer support, providing a satisfactory solution is the ultimate goal—a definitive answer to the customer’s query or concern.
It is, therefore, one of the most popular scorecard categories.
💡 60% of customer service teams score conversations by “Solution”.
How is ‘Solution’ scored by AutoQA:
- In our AutoQA, the ‘Solution’ category assesses whether an agent offered a solution during the conversation (surprise, surprise).
- This employs a binary scale. It boils down to a straightforward question:
Did the agent propose a solution to address the customer’s issue?
Yes = 👍
Unclear or inapplicable = N/A
- It’s crucial to understand that the AutoQA ‘Solution’ category identifies whether a solution was offered by the agent, but does not discern whether it was the correct solution. (However, support agents usually do give correct advice, so it’s a very worthwhile indicator!)
- You always have the option to override AutoQA scores manually. If, for example, a solution was offered but it didn’t align with internal standards.
Tone
Communication is not just about words; it’s also about how those words are delivered. Tone plays a pivotal role in shaping the emotional texture of a conversation.
💡 60% of customer service teams score conversations by “Tone”.
How is ‘Tone’ scored by AutoQA:
- Our evaluation of ‘Tone’ understands that a conversation can be a tapestry of emotions, with various tones interwoven throughout. Each tone is assigned a different weight, where positive tones contribute positively and negative tones affect the score accordingly. For example, ‘joyful’ carries a weight of 2.9, while ‘frustrated’ bears a weight of -3.
- The ‘Tone’ category currently recognizes 27 distinct tones, including concerned, optimistic, and apologetic, among others. Each tone contributes its unique emotional flavor to the conversation.
- The scoring system falls on a 5-point scale. This scale allows us to gauge the emotional resonance of the conversation, providing valuable insights into the customer-agent interaction.
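A weighted-tone calculation of this kind could look like the sketch below. Only the two example weights (joyful = 2.9, frustrated = −3) come from the text; the averaging step, the neutral default, and the mapping onto the 5-point scale are assumptions for illustration.

```python
# Hypothetical sketch of weighted tone scoring on a 5-point scale.
# Only the two example weights come from the article; the rest is assumed.

TONE_WEIGHTS = {"joyful": 2.9, "frustrated": -3.0}  # 25 more tones in the real set

def tone_score(detected_tones: list[str]) -> int:
    """Average the weights of detected tones, then map onto a 1-5 scale."""
    if not detected_tones:
        return 3  # assume a neutral midpoint when no tone is detected
    avg = sum(TONE_WEIGHTS.get(t, 0.0) for t in detected_tones) / len(detected_tones)
    # Linearly map the assumed weight range [-3, 3] onto [1, 5].
    scaled = 3 + avg * (2 / 3)
    return max(1, min(5, round(scaled)))
```

So a conversation tagged only ‘joyful’ would land near the top of the scale, while one dominated by ‘frustrated’ would bottom out at 1.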
Empathy
Empathy in customer support goes beyond just resolving issues—it’s about understanding and showing genuine concern for customers’ feelings and problems.
💡 47% of customer service teams score conversations by ‘Empathy’.
How is ‘Empathy’ scored by AutoQA:
- Our ‘Empathy’ category is designed to assess a fundamental question: “Was the Agent empathetic towards the customer and their problems?”
- Empathy assessment using the Klaus method is translated into a rating scale:
Empathetic behavior found = 👍
Empathetic behavior not found = N/A
- Empathy assessment using the ChatGPT method is translated as a binary:
Empathetic behavior found = 👍
Empathetic behavior not found = 👎
Undetected = N/A
Spelling and grammar
In customer support, clear communication is paramount. Our ‘Spelling & Grammar’ category is designed to ensure that all written interactions meet high language standards. This category detects various types of language errors, which are grouped into three categories: grammar mistakes, misspelling mistakes, and style mistakes.
💡 60% of customer service teams score conversations according to ‘Spelling and Grammar.’
How is ‘Spelling and Grammar’ scored by AutoQA:
- For each conversation, we aggregate scores for the entire conversation and the agents involved. A weighted score for mistakes is calculated, considering various types of grammar mistakes.
- The final score is calculated on a scale of 1-5 for both the conversation and individual agents. While the scores are saved in the database as 1-5, they can be easily converted to different rating scales, such as binary or 3/4 scales, as needed.
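The aggregation described above might be sketched like this. The weights per mistake type, the per-100-words normalization, and the score thresholds are all assumptions, not Klaus’s internals; the sketch only mirrors the stated structure (three mistake types, a weighted score, a 1–5 result convertible to other scales).

```python
# Hypothetical sketch: weight mistakes by type, normalize, map to 1-5,
# and collapse to binary. Weights and thresholds are invented for illustration.

MISTAKE_WEIGHTS = {"grammar": 1.0, "misspelling": 0.7, "style": 0.3}

def grammar_score(mistakes: dict[str, int], word_count: int) -> int:
    """Weighted mistakes per 100 words, mapped onto a 1-5 scale."""
    weighted = sum(MISTAKE_WEIGHTS[kind] * n for kind, n in mistakes.items())
    rate = 100 * weighted / max(word_count, 1)
    if rate == 0:
        return 5
    if rate < 1:
        return 4
    if rate < 3:
        return 3
    if rate < 6:
        return 2
    return 1

def to_binary(score: int, passing: int = 4) -> str:
    """Collapse the 1-5 score into a pass/fail rating."""
    return "👍" if score >= passing else "👎"
```

Storing the raw 1–5 value and converting at display time, as the text describes, is what lets one stored score serve binary, 3-point, and 4-point scorecards alike.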
Greeting
The initial moments of a customer interaction often set the stage for the entire experience. A warm and welcoming greeting can make customers feel valued and understood right from the start.
💡 7% of customer service teams score conversations according to ‘Greeting’.
How is ‘Greeting’ scored by AutoQA:
- Our Greeting category is designed to assess whether the agent successfully initiated the conversation with a friendly and professional opening.
- The Greeting category utilizes a binary scale, where the focus lies on whether the agent greeted the customer:
Greeting found = 👍
No greeting found = 👎
Closing
How you end a conversation matters. Our ‘Closing’ category is your tool to ensure that your agents conclude customer interactions on a high note.
💡 7% of customer service teams rate conversations according to ‘Closing’.
How is ‘Closing’ scored by AutoQA:
- AutoQA takes into account the two elements of an effective close:
- Goodbye: The way agents bid farewell impacts how customers perceive the interaction’s conclusion. Using appropriate phrases like ‘bye’ or ‘farewell’ leaves a positive impression.
- Future Contact Encouragement: Encouraging customers to reach out again is essential for building lasting relationships. Phrases like ‘Feel free to reach out’ can be powerful in this context.
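The two-part check above can be sketched as a simple pass over the agent’s final message. The phrase lists are hypothetical examples (only ‘bye’, ‘farewell’, and ‘Feel free to reach out’ appear in the text), and a production system would detect these semantically rather than by substring.

```python
# Hypothetical sketch of the two-element 'Closing' check.
# Phrase lists are illustrative; real detection would be ML-based.

GOODBYES = ["bye", "farewell", "have a great day", "take care"]
FUTURE_CONTACT = ["feel free to reach out", "don't hesitate to contact us",
                  "we're here if you need anything"]

def score_closing(last_agent_message: str) -> dict[str, bool]:
    """Check the final agent message for both elements of a good close."""
    text = last_agent_message.lower()
    return {
        "goodbye": any(phrase in text for phrase in GOODBYES),
        "future_contact": any(phrase in text for phrase in FUTURE_CONTACT),
    }
```

Scoring the two elements separately shows agents exactly which half of the close they tend to skip.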
Our goal is to continually expand our repertoire of automatic categories to empower your customer support teams further. Auto QA is here to bridge the gap between technology and human-centric service, ensuring that your customers always receive the best possible support.
As I type, our lab-cats are working on adding many, many more categories to offer you an even fuller automated scorecard.