Can You Hear Me Now? How Automatic Speech Recognition Can Transform the IVR Experience.

This article was published on June 9, 2020

Voice-enabled smart devices have fundamentally changed the way we interact with the world around us. Instead of flipping a light switch, we can now tell Alexa (or Google, or Siri, or some other AI-enabled virtual assistant) that it’s time to wake up. Rather than fiddling with a thermostat, we can simply tell a device what temperature we want our room to be. We can even ask a smart device to scour the internet for an answer to whatever question we may have, all without lifting a finger. These days we spend more time talking to our phones than we do talking on them.

The notable exception here is when we need to reach out to a business. A company's website might be the first stop for many of us when we have a general question, but what happens when we need information about our bill, want to check a package status, or have an issue with our service? Our first inclination is not to rely on a smart device, but to reach out personally for answers. We pick up our phones, call that generic toll-free number, and are seemingly always greeted by the same familiar sound: an automated voice rattling off a series of potential routes to service and asking us to use our keypad to get to the person we are hoping to speak with. Press 1 for this, select 2 for that, and on and on it goes. Considering this is one of the few times we’re actually using our phones to call someone, why shouldn’t we be able to just use our voice?

Well, the short answer is that you should be able to, and that is why Vonage has added Automatic Speech Recognition (ASR) to its Voice API.

Automatic Speech Recognition enables you to recognize and analyze everyday speech and utterances, so you can process and act upon them just by using a voice bot. It helps facilitate two-way conversations in a multitude of languages. While traditional IVR systems are the most common use case for ASR, it can also be leveraged for delivery services (think ordering a pizza) and act as a voice assistant for low-complexity tasks and inquiries (think voice-enabled FAQs).

Using Automatic Speech Recognition, you can now enable IVR (interactive voice response) experiences that let customers choose between DTMF (good old-fashioned keypad inputs) and their natural voice. Talking to support no longer has to mean pressing a series of number keys; you can get there by simply stating that you want to talk to support. And the best part is that, depending on the nature of the call, a customer using ASR can have a low-complexity inquiry resolved without ever bringing a human into the loop. From placing an order to checking the status of a package, ASR is built to handle these requests, so there is no need to wait to talk to a person. You can even use it to help a customer change an appointment time without a moment's delay; guess you won’t need to pipe in generic hold music anymore.

But what if you just need to talk to a representative? Well, that handoff can be facilitated as well. The beauty of adding ASR to your IVR experience is the flexibility it affords your customers. Customers can use their voice or their keypad, managing tasks on their own if they prefer not to wait. And when they do need to talk to a live person, they can experience shorter wait times, because agents no longer have to handle every incoming call regardless of its complexity. Your voice-enabled IVR can serve as the front line, allowing agents to focus on what they do best: delivering a great customer experience that separates your brand from the competition.
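
The idea that a caller can reach the same destination by keypad or by voice, with a live agent as the fallback, can be sketched as a small routing function. This is only an illustration in plain Python: the menu entries, the destination names, and the `route_caller` function are hypothetical and are not part of the Vonage Voice API.

```python
from typing import Optional

# Hypothetical keypad menu: digit pressed -> destination queue.
DTMF_MENU = {"1": "billing", "2": "package_status", "3": "support"}

# Hypothetical spoken keywords -> destination queue.
SPEECH_MENU = {
    "bill": "billing",
    "package": "package_status",
    "support": "support",
    "representative": "agent",
}


def route_caller(digits: Optional[str] = None,
                 transcript: Optional[str] = None) -> str:
    """Pick a destination from keypad digits or a speech transcript."""
    # Keypad input is unambiguous, so it is checked first.
    if digits and digits in DTMF_MENU:
        return DTMF_MENU[digits]
    # Otherwise look for a known keyword in what the caller said.
    if transcript:
        spoken = transcript.lower()
        for keyword, destination in SPEECH_MENU.items():
            if keyword in spoken:
                return destination
    # Nothing recognized: hand the call off to a live agent.
    return "agent"


if __name__ == "__main__":
    print(route_caller(digits="2"))
    print(route_caller(transcript="I have a question about my bill"))
    print(route_caller(transcript="mumble mumble"))
```

A real deployment would feed this kind of function with the digits or transcript delivered by the speech engine rather than hard-coded strings.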

And speaking of customer experience, you might be thinking, “I’ve used these voice IVRs before, they’re not great and don’t recognize what I say.” That’s true: not all speech recognition engines can handle the many variations in how people speak, which is why we built one that comes with a little help in the form of context. Context is a parameter in our code that lets you seed the system with possible answers and variations, making it easier for the speech engine to recognize what the customer says and respond in turn.
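
To make the context parameter more concrete, here is a minimal sketch that builds an input step accepting both keypad and speech, seeded with a few expected phrases. The field names (`action`, `type`, `dtmf`, `speech`, `context`, `language`, `eventUrl`) reflect our understanding of the Vonage call control object format and should be checked against the current Voice API reference; the `build_speech_input` helper, the webhook URL, and the phrase list are assumptions made for illustration.

```python
import json
from typing import Dict, List


def build_speech_input(event_url: str, context: List[str],
                       language: str = "en-US") -> Dict:
    """Build an input step that accepts keypad digits or speech.

    Field names are assumed from the Vonage NCCO format; verify them
    against the official reference before relying on this.
    """
    return {
        "action": "input",
        # Let the caller answer with either the keypad or their voice.
        "type": ["dtmf", "speech"],
        "dtmf": {"maxDigits": 1},
        "speech": {
            "language": language,
            # Seed the engine with the answers we expect to hear.
            "context": context,
        },
        # Hypothetical webhook that would receive the recognition result.
        "eventUrl": [event_url],
    }


if __name__ == "__main__":
    step = build_speech_input(
        "https://example.com/webhooks/asr",
        ["support", "billing", "package status", "representative"],
    )
    print(json.dumps(step, indent=2))
```

The more closely the context list matches what callers actually say, the more help it gives the speech engine, so it is worth reviewing the list as real call data comes in.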

And if you are worried that it takes too long for the IVR to get to the point before a customer can respond, that’s a challenge we made sure to address. With our Barge-In feature, the bot audio is cut off as soon as the customer starts speaking, so they can get to what they need faster.
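
As a rough sketch of how Barge-In might look in practice, the snippet below pairs a spoken prompt that can be interrupted with an input step that listens for the caller's answer. The `bargeIn` flag and the other field names are based on our understanding of the Vonage call control object format and should be confirmed in the Voice API reference; the `build_menu` helper, the prompt text, and the webhook URL are hypothetical.

```python
import json
from typing import Dict, List


def build_menu(prompt: str, event_url: str) -> List[Dict]:
    """Build a two-step call flow: an interruptible prompt, then input.

    Field names are assumed from the Vonage NCCO format; verify them
    against the official reference before relying on this.
    """
    return [
        {
            "action": "talk",
            "text": prompt,
            # Stop the prompt as soon as the caller speaks or presses a key.
            "bargeIn": True,
        },
        {
            "action": "input",
            "type": ["dtmf", "speech"],
            "dtmf": {"maxDigits": 1},
            "speech": {"language": "en-US"},
            # Hypothetical webhook that would receive the caller's answer.
            "eventUrl": [event_url],
        },
    ]


if __name__ == "__main__":
    flow = build_menu(
        "Thanks for calling. Tell us what you need, or press 1 for billing.",
        "https://example.com/webhooks/asr",
    )
    print(json.dumps(flow, indent=2))
```

The input step comes directly after the prompt so that whatever the caller says while interrupting is what gets captured.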

Lastly, when considering whether adding speech recognition to your IVR is worthwhile, it's important to note that this feature isn’t just about navigation. It allows you to capture richer data and information than can be transferred using only DTMF. Confirming addresses, taking complex orders, collecting email addresses and other contact information: all of this and more can be captured using ASR.

To learn more about everything from ASR to IVRs, reach out to us or jump in and start building today.


Written by Vonage Staff