google-speech | node.js module for Google speech systems | Speech library

by antirek JavaScript Version: 0.0.5 License: MIT


kandi X-RAY | google-speech Summary

google-speech is a JavaScript library typically used in Artificial Intelligence, Speech applications. google-speech has no bugs, it has no vulnerabilities, it has a Permissive License and it has low support. You can install using 'npm i google-speech' or download it from GitHub, npm.

node.js module for Google speech systems (ASR & TTS)


Support

google-speech has a low active ecosystem.

It has 23 stars and 4 forks. There are 8 watchers for this library.

It had no major release in the last 12 months.

There is 1 open issue and 2 have been closed. On average, issues are closed in 0 days. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of google-speech is 0.0.5

Quality

google-speech has 0 bugs and 0 code smells.

Security

google-speech has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

google-speech code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

google-speech is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

google-speech releases are not available. You will need to build from source code and install.

Deployable package is available in npm.

Installation instructions are not available. Examples and code snippets are available.


google-speech Key Features

No Key Features are available at this moment for google-speech.

google-speech Examples and Code Snippets

No Code Snippets are available at this moment for google-speech.

Community Discussions

Trending Discussions on google-speech

Custom phrases/words are ignored by Google Speech-To-Text

Google-speech-api throws EOF error instead of performing audio transcription

Google Text-to-speech - How to specify region EU?

Problems with Google Cloud Speech-to-Text API

Creating an Ogg packet from Opus buffers in nodejs

How can I authorize Google Speech-to-text from Google Apps script?

Twilio Base64 Media Payload for Google Speech To Text API not Responding

R8: NullPointerException during IR Conversion

QUESTION

Custom phrases/words are ignored by Google Speech-To-Text

Asked 2021-Nov-29 at 16:18

I am using python3 to transcribe an audio file with Google speech-to-text via the provided python packages (google-speech).

There is an option to define custom phrases which should be used for transcription as stated in the docs: https://cloud.google.com/speech-to-text/docs/speech-adaptation
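As a rough illustration of what such phrase hints look like on the wire, here is a minimal sketch that builds the JSON body for a REST speech:recognize call with a speechContexts entry. The bucket URI, the phrase, and the boost value are made-up placeholders, and buildRecognizeRequest is a hypothetical helper, not part of any Google package.

```javascript
// Hypothetical helper: builds a request body for the REST speech:recognize method.
// The field names follow the v1 RecognitionConfig JSON shape; the values are placeholders.
function buildRecognizeRequest(gcsUri, phrases, boost) {
  return {
    config: {
      languageCode: "en-US",
      // Phrase hints: words or phrases the recognizer should favor.
      speechContexts: [{ phrases, boost }],
    },
    audio: { uri: gcsUri },
  };
}

// Assumed example values, not taken from the original question.
const body = buildRecognizeRequest(
  "gs://my-bucket/lecture.wav",
  ["barrows wheeler transform"],
  10
);
console.log(JSON.stringify(body, null, 2));
```

Whether a hint actually changes the transcript depends on the model and the audio, which is what the question is about.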

For testing purposes I am using a small audio file with the contained text:

[..] in this lecture we'll talk about the Burrows wheeler transform and the FM index [..]

And I am giving the following phrases to see the effects if for example I want a specific name to be recognized with the correct notation. In this example I want to change burrows to barrows:

...

ANSWER

Answered 2021-Nov-29 at 16:18

I have created an audio file to recreate your scenario and I was able to improve the recognition using the model adaptation. To achieve this with this feature, I would suggest taking a look at this example and this post to better understand the adaptation model.

Now, to improve the recognition of your phrase, I performed the following:

I created a new audio file using the following page with the mentioned phrase.

in this lecture we'll talk about the Burrows wheeler transform and the FM index

My tests were based on this code sample. This code creates a PhraseSet and CustomClass that includes the word you would like to improve, in this case the word "barrows". You can also create/update/delete the phrase set and custom class using the Speech-To-Text GUI. Below is the code I used for the improvement.

Source https://stackoverflow.com/questions/70048973

QUESTION

Google-speech-api throws EOF error instead of performing audio transcription

Asked 2021-Oct-07 at 08:05

In my project I am developing a websocket server in Go which controls an Asterisk channel via ARI and performs live audio transcription on the same channel with google-speech-api. On connection I want to save audio from the Asterisk channel to a file while simultaneously sending the audio to Google and getting a transcript. Audio is sent by the Asterisk audiofork app so that I can manipulate the channel with ARI while audio is streamed on another thread.

The problem is that when I send frames to Google I get an EOF error on the first and every subsequent frame I send from my server. However, when I convert the binary file with the saved frames to WAV using sox, I get a recording of the audio on the channel, so the frames sent by audiofork aren't corrupted. Can anyone give me advice on how to make google-speech-api cooperate with me?

...

ANSWER

Answered 2021-Oct-07 at 08:05

This is pretty embarrassing, but it was just a dumb mistake on my side: in the function that creates the Google client I deferred closing the client, so by the time the function returned the speech client variable it had already been closed by that same function. After fixing that, everything works as intended.
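The same mistake can be reproduced in other languages. Below is a minimal JavaScript sketch of the pattern, where try/finally plays the role of Go's defer. FakeSpeechClient is a hypothetical stand-in written for this illustration, not the real Google client, and the error text is invented.

```javascript
// Hypothetical stand-in for a speech client; not the real Google API.
class FakeSpeechClient {
  constructor() {
    this.closed = false;
  }
  // Pretend to send one audio frame; fails once the client is closed.
  send(frame) {
    if (this.closed) {
      throw new Error("EOF: client already closed");
    }
    return frame.length;
  }
  close() {
    this.closed = true;
  }
}

// Buggy factory: the cleanup runs as the factory returns,
// so the caller receives a client that is already closed.
function newClientBuggy() {
  const client = new FakeSpeechClient();
  try {
    return client;
  } finally {
    client.close();
  }
}

// Fixed factory: the caller owns the client and closes it when done.
function newClientFixed() {
  return new FakeSpeechClient();
}

const client = newClientFixed();
console.log(client.send(Buffer.from([1, 2, 3]))); // 3
client.close();
```

The fix is a matter of ownership: whoever uses the client for streaming should also be the one to close it.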

Source https://stackoverflow.com/questions/69477052

QUESTION

Google Text-to-speech - How to specify region EU?

Asked 2021-May-07 at 07:01

I'm using Google Cloud Text-to-speech to synthesize speech from text. How can I specify the region for the API calls? This is similar to this question Specify Region for Google Speech API? but my question is for text-to-speech, not speech-to-text.

For speech-to-text there wasn't an available endpoint in Europe but there is one now: https://cloud.google.com/speech-to-text/docs/endpoints

I can't find the same type of endpoint documentation for text-to-speech, the closest I find is this page: https://cloud.google.com/text-to-speech/docs/reference/rest that specifies a single endpoint: https://texttospeech.googleapis.com

Does this mean that I cannot keep the text-to-speech requests within Europe? It could also be that the region is fetched from the Google Cloud project region or something like that but I cannot find such an option.

...

ANSWER

Answered 2021-May-07 at 07:01

Specifying a region for the Text-to-Speech API is currently a requested feature in the GCP public issue tracker. You can track the progress with this link.

Source https://stackoverflow.com/questions/67397956

QUESTION

Problems with Google Cloud Speech-to-Text API

Asked 2020-Nov-04 at 09:59

I'm trying to transcribe a German podcast which I have both on my pc and my Google Storage bucket. I'm using this tutorial as a reference.

Here's my code:

...

ANSWER

Answered 2020-Nov-04 at 09:49

Have you tried this:

Source https://stackoverflow.com/questions/64677600

QUESTION

Creating an Ogg packet from Opus buffers in nodejs

Asked 2020-Oct-02 at 16:45

I've been pretty stuck on this problem for a few days now, praying that someone will be able to point me in the right direction.

I have a stream of Opus buffers as encoded by https://github.com/discordjs/opus

I want to send these to the Google Speech-to-Text API, which requires them to be encapsulated in Ogg containers: https://cloud.google.com/speech-to-text/docs/reference/rpc/google.cloud.speech.v1#audioencoding

I'm trying to use this library: https://github.com/TooTallNate/node-ogg

Here is what I'm trying:

...

ANSWER

Answered 2020-Oct-02 at 16:45

As above the answer was that the ogg package is expecting ogg_packets and @discordjs/opus does not give that.

Source https://stackoverflow.com/questions/64105686

QUESTION

How can I authorize Google Speech-to-text from Google Apps script?

Asked 2020-May-06 at 09:35

I'm trying to call Google Speech-to-Text from Apps Script. Unfortunately, I cannot find any examples for Apps Script or plain HTTP that I could run with a simple UrlFetchApp call.

I created a service account and set up a project with the Speech-to-Text API enabled, and was able to run recognition successfully using the command-line example

curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    https://speech.googleapis.com/v1/speech:recognize \
    -d @sync-request.json

which I can easily translate to a UrlFetchApp call, but I have no idea how to generate the access token that is created by

gcloud auth application-default print-access-token

Is there a way to get it from apps script using service account credentials?

Or is there any other way to auth and access speech-to-text from apps script?

...

ANSWER

Answered 2020-Apr-27 at 19:41

The equivalent of retrieving access tokens through service accounts is through the apps script oauth library. The library handles creation of the JWT token.
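To make the JWT step less opaque, here is a minimal Node.js sketch of what such a library does before requesting a token: it builds an RS256-signed assertion from the service account's email and private key. This is an illustration under assumptions, not the Apps Script library's code. buildAssertion and b64url are hypothetical helpers, the email is a placeholder, and a throwaway key pair stands in for the key from the service account JSON file.

```javascript
const crypto = require("crypto");

// Base64url-encode a string or Buffer, as JWT segments require.
// Assumption: the runtime supports the "base64url" encoding (recent Node.js).
function b64url(input) {
  return Buffer.from(input).toString("base64url");
}

// Build and sign the JWT assertion that a service account exchanges for an access token.
function buildAssertion(clientEmail, privateKeyPem, nowSeconds) {
  const header = { alg: "RS256", typ: "JWT" };
  const claimSet = {
    iss: clientEmail,
    scope: "https://www.googleapis.com/auth/cloud-platform",
    aud: "https://oauth2.googleapis.com/token",
    iat: nowSeconds,
    exp: nowSeconds + 3600, // one hour is the maximum lifetime
  };
  const signingInput =
    b64url(JSON.stringify(header)) + "." + b64url(JSON.stringify(claimSet));
  const signature = crypto.sign("RSA-SHA256", Buffer.from(signingInput), privateKeyPem);
  return signingInput + "." + b64url(signature);
}

// Throwaway key pair for the demo; a real key comes from the JSON key file.
const { privateKey, publicKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
  publicKeyEncoding: { type: "spki", format: "pem" },
});

const assertion = buildAssertion(
  "demo@example-project.iam.gserviceaccount.com", // placeholder email
  privateKey,
  Math.floor(Date.now() / 1000)
);
console.log(assertion.split(".").length); // 3
```

The assertion would then be POSTed to https://oauth2.googleapis.com/token with grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer, and the access token in the response takes the place of the gcloud output in the Authorization: Bearer header.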

Sample here

Source https://stackoverflow.com/questions/61466912

QUESTION

Twilio Base64 Media Payload for Google Speech To Text API not Responding

Asked 2020-Apr-24 at 21:57

I need to do some real-time transcription of Twilio phone calls using the Google Speech-to-Text API, and I've followed a few demo apps showing how to set this up. My application is in .NET Core 3.1 and I am using webhooks with a Twilio-defined callback method. When the media is retrieved from Twilio through the callback, it is passed as raw audio encoded in base64, as you can see here.

https://www.twilio.com/docs/voice/twiml/stream

I've referenced this demo on Live Transcribing as well and am trying to mimic its case statement in C#. Everything connects correctly and the media and payload are passed into my app just fine from Twilio.

The audio string is then converted to a byte[] to pass to the Task that needs to transcribe the audio.

...

ANSWER

Answered 2020-Apr-24 at 21:57

After all this, I discovered that this code works fine, just needs to be broken up and called in different events in the Twilio stream lifecycle. The config section needs to be placed during the connected event. The print messages task needs to be placed in the media event. Then, the WriteCompleteAsync needs to be placed in the stop event when the websocket is closed from Twilio.
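The event split described above can be sketched as a small dispatcher. This is a JavaScript illustration rather than the original .NET code: FakeSttStream and its method names are hypothetical stand-ins for a streaming recognition session, while the message shape (an event field, plus media.payload in base64) follows Twilio's Media Streams messages.

```javascript
// Hypothetical stand-in for a streaming speech-to-text session; not a Google API.
class FakeSttStream {
  constructor() {
    this.log = [];
  }
  writeConfig() {
    this.log.push("config");
  }
  writeAudio(bytes) {
    this.log.push("audio:" + bytes.length);
  }
  complete() {
    this.log.push("complete");
  }
}

// Route each Twilio Media Streams message to the matching step of the session.
function handleTwilioMessage(raw, stt) {
  const msg = JSON.parse(raw);
  switch (msg.event) {
    case "connected":
      // Send the recognition config once, when the stream connects.
      stt.writeConfig();
      break;
    case "media":
      // Decode the base64 payload and forward the raw audio bytes.
      stt.writeAudio(Buffer.from(msg.media.payload, "base64"));
      break;
    case "stop":
      // Signal that no more audio will follow.
      stt.complete();
      break;
    default:
      // Other events (for example "start") carry metadata only.
      break;
  }
}

const stt = new FakeSttStream();
handleTwilioMessage(JSON.stringify({ event: "connected" }), stt);
handleTwilioMessage(
  JSON.stringify({ event: "media", media: { payload: "AAEC" } }),
  stt
);
handleTwilioMessage(JSON.stringify({ event: "stop" }), stt);
console.log(stt.log.join(", ")); // config, audio:3, complete
```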

One other important item to consider is the number of requests being sent to Google STT, to ensure that too many requests aren't overloading the quota, which seems to be (for now) 300 requests / minute.
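One way to stay under such a quota is a client-side limiter. The sketch below is a generic sliding-window limiter in JavaScript, not something from the original answer. RateLimiter is a hypothetical class, and 300 per minute is simply the figure quoted above, which may change.

```javascript
// Sliding-window limiter: allow at most `limit` calls within any `windowMs` span.
class RateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // send times of recent requests, oldest first
  }

  // Returns true and records the request if it may be sent at time `now` (ms).
  tryAcquire(now = Date.now()) {
    // Drop timestamps that have fallen out of the window.
    while (
      this.timestamps.length > 0 &&
      now - this.timestamps[0] >= this.windowMs
    ) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.limit) {
      return false;
    }
    this.timestamps.push(now);
    return true;
  }
}

// 300 requests per minute, per the quota mentioned above.
const limiter = new RateLimiter(300, 60 * 1000);
if (limiter.tryAcquire()) {
  console.log("ok to send a request");
}
```

When tryAcquire returns false, the caller can buffer the audio and retry shortly after instead of sending a request that would be rejected.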

Source https://stackoverflow.com/questions/61217114

QUESTION

R8: NullPointerException during IR Conversion

Asked 2020-Apr-22 at 20:58

I have a problem when trying to generate a release version of my app. It gives a strange error

...

ANSWER

Answered 2020-Apr-22 at 20:58

Answering my own question because it turned out to be an R8 bug and after me reporting it, they solved the issue. Which is great.

Full bug report and how to apply fix is here

Short version:

change gradle configuration to this

Source https://stackoverflow.com/questions/61019845

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install google-speech

You can install using 'npm i google-speech' or download it from GitHub, npm.

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for answers or ask on the Stack Overflow community page.
