google-speech | node.js module for Google speech systems | Speech library
kandi X-RAY | google-speech Summary
node.js module for Google speech systems (ASR & TTS)
Community Discussions
Trending Discussions on google-speech
QUESTION
I am using Python 3 to transcribe an audio file with Google Speech-to-Text via the provided Python packages (google-speech).
There is an option to define custom phrases which should be used for transcription as stated in the docs: https://cloud.google.com/speech-to-text/docs/speech-adaptation
For testing purposes I am using a small audio file with the contained text:
[..] in this lecture we'll talk about the Burrows wheeler transform and the FM index [..]
I am passing the following phrases to see their effect, for example when I want a specific name to be recognized with a particular spelling. In this example I want "burrows" to be transcribed as "barrows":
...ANSWER
Answered 2021-Nov-29 at 16:18
I created an audio file to recreate your scenario and was able to improve the recognition using model adaptation. I would suggest taking a look at this example and this post to better understand model adaptation.
Now, to improve the recognition of your phrase, I performed the following:
- I created a new audio file using the following page with the mentioned phrase.
in this lecture we'll talk about the Burrows wheeler transform and the FM index
- My tests were based on this code sample. This code creates a PhraseSet and CustomClass that include the word you would like to improve, in this case the word "barrows". You can also create, update, or delete the phrase set and custom class using the Speech-to-Text GUI. Below is the code I used for the improvement.
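The answer's own code is not reproduced in this excerpt. As a rough illustration of the idea only, the sketch below builds a JSON request body for the REST `speech:recognize` method with an inline phrase set. The field names (`adaptation`, `phraseSets`, `phrases`, `value`, `boost`) follow my reading of the Speech-to-Text REST reference and should be checked against the API version you use; the helper name, bucket URI, and boost value are hypothetical, and the CustomClass part of the answer is left out.

```python
import json


def build_recognize_request(gcs_uri, phrase, boost=10.0, language_code="en-US"):
    """Return a JSON-serializable body for speech:recognize (assumed field names)."""
    return {
        "config": {
            "languageCode": language_code,
            # Inline phrase set: bias recognition toward the given phrase.
            "adaptation": {
                "phraseSets": [
                    {"phrases": [{"value": phrase, "boost": boost}]}
                ]
            },
        },
        "audio": {"uri": gcs_uri},
    }


# Hypothetical bucket and file name, for illustration only.
body = build_recognize_request("gs://example-bucket/lecture.wav", "barrows")
print(json.dumps(body, indent=2))
```

The printed body could be saved to a file and posted with an authenticated HTTP client; a higher boost biases recognition more strongly toward the phrase, at the risk of false positives.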
QUESTION
In my project I am developing a websocket server in Go that controls an Asterisk channel via ARI and performs live audio transcription on the same channel with google-speech-api. On connection I want to save audio from an Asterisk channel to a file while simultaneously sending the audio to Google and getting a transcript. The audio is sent by the Asterisk audiofork app so that I can manipulate the channel with ARI while audio is streamed on another thread.
The problem is that when I send frames to Google I get an EOF error on the first and every subsequent frame I send from my server. However, when I convert the binary file of saved frames to WAV using sox, I get a recording of the audio on the channel, so the frames sent by audiofork aren't corrupted. Can anyone give me any advice on how to make google-speech-api cooperate with me?
...ANSWER
Answered 2021-Oct-07 at 08:05
This is pretty embarrassing, but it was just a dumb mistake on my side. In the function that creates the Google client, I deferred closing the client, so by the time the function returned the speech client variable, the client had already been closed by that same function. After fixing that, everything works as intended.
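The asker's Go code is not shown, so the following is only a Python analogy of the mistake described: a `try`/`finally` inside the factory function plays the role of Go's `defer`, closing the client as the function returns. `FakeSpeechClient` and both factory functions are hypothetical stand-ins, not part of any Google library.

```python
class FakeSpeechClient:
    """Hypothetical stand-in for a streaming speech client."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def send(self, frame: bytes) -> int:
        # A closed client rejects every frame, like the EOF errors in the question.
        if self.closed:
            raise EOFError("client already closed")
        return len(frame)


def new_client_buggy():
    client = FakeSpeechClient()
    try:
        return client
    finally:
        # Runs as the function returns, much like a deferred Close() in Go.
        client.close()


def new_client_fixed():
    # Hand ownership to the caller, who closes the client when streaming ends.
    return FakeSpeechClient()


client = new_client_fixed()
try:
    client.send(b"\x00\x01")
finally:
    client.close()
```

With `new_client_buggy`, every `send` call fails because the client is closed before the caller ever sees it; the fix is to close it where the stream actually ends.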
QUESTION
I'm using Google Cloud Text-to-speech to synthesize speech from text. How can I specify the region for the API calls? This is similar to this question Specify Region for Google Speech API? but my question is for text-to-speech, not speech-to-text.
For speech-to-text there wasn't an available endpoint in Europe but there is one now: https://cloud.google.com/speech-to-text/docs/endpoints
I can't find the same type of endpoint documentation for text-to-speech, the closest I find is this page: https://cloud.google.com/text-to-speech/docs/reference/rest that specifies a single endpoint: https://texttospeech.googleapis.com
Does this mean that I cannot keep the text-to-speech requests within Europe? It could also be that the region is fetched from the Google Cloud project region or something like that but I cannot find such an option.
...ANSWER
Answered 2021-May-07 at 07:01
Specifying a region for the Text-to-Speech API, as you are asking, is currently a requested feature in the GCP public issue tracker. You can track its progress with this link.
QUESTION
I'm trying to transcribe a German podcast which I have both on my pc and my Google Storage bucket. I'm using this tutorial as a reference.
Here's my code:
...ANSWER
Answered 2020-Nov-04 at 09:49
Have you tried this:
QUESTION
I've been pretty stuck on this problem for a few days now, praying that someone will be able to point me in the right direction.
I have a stream of Opus buffers as encoded by https://github.com/discordjs/opus
I want to send these to the Google Speech-to-Text API, which requires them to be encapsulated in Ogg containers: https://cloud.google.com/speech-to-text/docs/reference/rpc/google.cloud.speech.v1#audioencoding
I'm trying to use this library: https://github.com/TooTallNate/node-ogg
Here is what I'm trying:
...ANSWER
Answered 2020-Oct-02 at 16:45
As noted above, the answer was that the ogg package expects ogg_packets, and @discordjs/opus does not produce them.
QUESTION
I'm trying to execute google-speech-to-text from apps script. Unfortunately, I cannot find any examples for apps script or pure HTTP, so I can run it using simple UrlFetchApp.
I created a service account, set up a project with the Speech-to-Text API enabled, and was able to run recognition successfully using the command-line example
curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1/speech:recognize \
    -d @sync-request.json
which I can easily translate to a UrlFetchApp call, but I don't know how to generate the access token that is created by
gcloud auth application-default print-access-token
Is there a way to get it from apps script using service account credentials?
Or is there any other way to auth and access speech-to-text from apps script?
...ANSWER
Answered 2020-Apr-27 at 19:41
In Apps Script, the equivalent way to retrieve access tokens for a service account is the Apps Script OAuth library. The library handles creation of the JWT.
Sample here
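To make concrete what such a library does under the hood, here is a minimal Python sketch that builds the unsigned part of a service-account JWT for Google's OAuth 2.0 token endpoint. The helper names and the example email are hypothetical. The sketch stops before signing: a real flow signs this input with the service account's private key using RS256 and posts the result to the token endpoint, a step the OAuth library performs for you.

```python
import base64
import json
import time

TOKEN_URI = "https://oauth2.googleapis.com/token"
SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwt_signing_input(client_email, now=None, lifetime=3600):
    """Return 'header.claims', the string that would be signed with RS256."""
    now = int(time.time()) if now is None else int(now)
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": client_email,      # service account email
        "scope": SCOPE,           # requested API scope
        "aud": TOKEN_URI,         # token endpoint that will accept the JWT
        "iat": now,
        "exp": now + lifetime,    # Google allows at most one hour
    }
    return ".".join(b64url(json.dumps(part).encode("utf-8")) for part in (header, claims))


# Hypothetical service account email, for illustration only.
print(jwt_signing_input("example@example-project.iam.gserviceaccount.com"))
```

Once signed and exchanged, the returned access token goes into the same `Authorization: Bearer` header that the curl example uses.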
QUESTION
I need to do real-time transcription of Twilio phone calls using the Google Speech-to-Text API, and I've followed a few demo apps showing how to set this up. My application is in .NET Core 3.1 and I am using webhooks with a Twilio-defined callback method. When the media is retrieved from Twilio through the callback, it is passed as raw audio encoded in base64, as you can see here.
https://www.twilio.com/docs/voice/twiml/stream
I've referenced this demo on Live Transcribing as well and am trying to mimic its case statement in C#. Everything connects correctly, and the media and payload are passed into my app from Twilio just fine.
The audio string is then converted to a byte[] to pass to the Task that needs to transcribe the audio
...ANSWER
Answered 2020-Apr-24 at 21:57
After all this, I discovered that this code works fine; it just needs to be broken up and called in different events of the Twilio stream lifecycle. The config section needs to be placed in the connected event. The print messages task needs to be placed in the media event. Then WriteCompleteAsync needs to be placed in the stop event, when the websocket is closed from Twilio.
One other important item to consider is the number of requests being sent to Google STT, to ensure that too many requests don't exceed the quota, which seems to be (for now) 300 requests / minute.
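The asker's C# code is not shown. The split described above can be sketched in Python as a small dispatcher over Twilio's stream messages, which arrive as JSON with an `event` field and, for media, a base64 `media.payload`. `StreamSession` and its attributes are hypothetical; where this sketch only records state, a real handler would send the streaming config, forward the audio, and complete the write.

```python
import base64
import json


class StreamSession:
    """Hypothetical per-call state, driven by Twilio stream events."""

    def __init__(self):
        self.configured = False
        self.audio = bytearray()
        self.finished = False

    def handle(self, raw_message: str) -> None:
        msg = json.loads(raw_message)
        event = msg.get("event")
        if event == "connected":
            # Where the recognition config would be sent, once per stream.
            self.configured = True
        elif event == "media":
            # Decode the base64 payload; a real handler would forward it
            # to the recognizer and read back any transcripts.
            self.audio.extend(base64.b64decode(msg["media"]["payload"]))
        elif event == "stop":
            # Where the request stream would be completed.
            self.finished = True
        # Other events (for example "start") are ignored in this sketch.


session = StreamSession()
session.handle(json.dumps({"event": "connected"}))
payload = base64.b64encode(b"\xff\x7f").decode("ascii")
session.handle(json.dumps({"event": "media", "media": {"payload": payload}}))
session.handle(json.dumps({"event": "stop"}))
```

Keeping one session object per call also gives a natural place to count outgoing requests if the quota becomes a concern.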
QUESTION
I have a problem when trying to generate a release version of my app. It gives a strange error
...ANSWER
Answered 2020-Apr-22 at 20:58
Answering my own question because it turned out to be an R8 bug, and after I reported it, they solved the issue. Which is great.
The full bug report and how to apply the fix are here
Short version: change the gradle configuration to this
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported