Cloud Speech API Demo

November 18, 2019 By Peter Engel


The best way to experience the Speech API is with a demo. Before I get into it, I want to explain a bit about how it works. So we’re going to make a recording, and I wrote a Bash script. We’re gonna use SoX to do that. It’s a command-line utility for audio. So what we’ll do is we’ll record our audio, we’ll create an API request in a JSON file, we’ll send it to the Speech API, and then we’ll see the JSON response. So if we could go ahead and switch to the demo…

Okay, let me make the font a little bigger. So I’m gonna call my file with “bash requests.sh.” And it says, “Press enter when you’re ready to record.” It’s gonna ask me to record a five-second audio file. So here we go. “I built a Cloud Speech API demo using SoX.”

Okay, so this is the JSON request file that it just created. We need to tell the Speech API the encoding type. In this case, we’re using FLAC encoding. The sample rate in hertz. The language code is optional. If you leave it out, it’ll default to English. Otherwise, you need to tell it what language your audio is in. And then the “speech context,” I’m gonna talk about that in a little bit.

So I’m gonna call the Speech API. It’s making a cURL request. And let’s see how it did. Okay, so it did pretty good.
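
The request file and the cURL call described above could be sketched roughly as follows. This is a Python illustration, not the Bash script from the demo, which is not shown in the talk. The field names (`encoding`, `sampleRateHertz`, `languageCode`, `audio.content`) and the endpoint URL are assumed from the v1 REST interface of the Speech API; the 16000 Hz sample rate, the `en-US` default, and the helper names `build_request` and `send_request` are hypothetical choices made for this example. The language code is set explicitly here because newer API versions may require it.

```python
import base64
import json
import urllib.request

# Assumed v1 REST endpoint; the demo's script may have used a different version.
ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_request(audio_bytes, sample_rate_hertz=16000, language_code="en-US"):
    """Build the JSON request body: a config section plus base64-encoded audio."""
    return {
        "config": {
            "encoding": "FLAC",                     # encoding type of the recording
            "sampleRateHertz": sample_rate_hertz,   # assumed value, match your recording
            "languageCode": language_code,          # language of the audio
        },
        "audio": {
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


def send_request(body, api_key):
    """POST the request body to the API and return the parsed JSON response.

    Not called in this sketch because it needs a real API key and network access.
    """
    req = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Placeholder bytes stand in for the five-second FLAC recording.
request = build_request(b"fake-flac-bytes")
print(json.dumps(request, indent=2))
```

In a real run, the bytes would come from the recorded FLAC file, and `send_request(request, api_key)` would play the role of the cURL call.
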
It said, “I built a Cloud Speech API demo using socks,” but you’ll notice it got the wrong “socks,” because SoX is a proper noun. It was 89% confident that it got this correct. It was even able to get “API” as an acronym.

So I mentioned a speech context parameter before. And what this actually lets you do is, let’s say you have a proper noun or a word that you’re expecting in your application that’s unique, that you wouldn’t expect the API to recognize normally. You can actually pass it as a parameter and it’ll look out for that word. So I’m gonna hop on over to Sublime, and I’m gonna add “SoX” as a phrase to look out for. And let’s see if it’s able to identify it. I’m gonna say the same thing again. I’m gonna record. “I built a Cloud Speech API demo using SoX.” And we can see it’s now got that phrase in there. And we will call the Speech API. And it was able to get it correctly using the phrases parameter, which is pretty cool.

Just one REST API request, and we are easily transcribing an audio file even with a unique entity. And you can also pass the API audio files in over 80 different languages. You just need to tell it, again,
the language code that you’d like it to transcribe. So that is the Speech API.
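
The phrase hint and the confidence value mentioned in the demo can be illustrated with a small Python sketch. It assumes the v1 REST field names (`speechContexts`, `phrases`, `results`, `alternatives`, `transcript`, `confidence`); the helpers `with_phrases` and `best_alternative` are hypothetical names for this example. The sample response is hand-written from the numbers quoted in the talk (the “socks” transcript at 89% confidence), not captured API output.

```python
import copy


def with_phrases(request, phrases):
    """Return a copy of the request with phrase hints added to its config."""
    updated = copy.deepcopy(request)
    updated["config"]["speechContexts"] = [{"phrases": list(phrases)}]
    return updated


def best_alternative(response):
    """Return (transcript, confidence) of the first alternative in the first result."""
    alternative = response["results"][0]["alternatives"][0]
    return alternative["transcript"], alternative.get("confidence")


# Base request without hints; the audio content is a placeholder string.
base_request = {
    "config": {"encoding": "FLAC", "sampleRateHertz": 16000, "languageCode": "en-US"},
    "audio": {"content": "<base64 audio>"},
}

# Add "SoX" as a phrase for the API to look out for.
hinted_request = with_phrases(base_request, ["SoX"])

# Illustrative response shaped like the first call in the demo (assumed layout).
sample_response = {
    "results": [
        {
            "alternatives": [
                {
                    "transcript": "I built a Cloud Speech API demo using socks",
                    "confidence": 0.89,
                }
            ]
        }
    ]
}

transcript, confidence = best_alternative(sample_response)
print(hinted_request["config"]["speechContexts"])
print(transcript, confidence)
```

To transcribe another language, the same request would carry a different `languageCode` value, for example a BCP-47 tag such as `es-ES`.
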