speech-javascript-sdk | IBM Watson Speech to Text and Text to Speech services | Speech library
kandi X-RAY | speech-javascript-sdk Summary
IBM Watson Speech Services for Web Browsers. Allows you to easily add voice recognition and synthesis to any web app with minimal code.
Community Discussions
Trending Discussions on speech-javascript-sdk
QUESTION
I'm learning to use the Watson Speech JS SDK. In particular I like Transcribe from Microphone, with Alternatives. I'm generating my token with a Firebase Cloud Function. I'm using AngularJS, not JQuery. The first problem I'm running into is
...ANSWER
Answered 2020-Apr-03 at 15:51
Version 0.37.0 of the SDK introduced breaking changes:
QUESTION
I am using following javascript to record audio and send it to a websocket server:
...ANSWER
Answered 2018-Oct-18 at 14:25
To do realtime downsampling, follow these steps.
First, get a stream instance using this:
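The code that followed was truncated. A minimal sketch of the approach, with illustrative names (a `ScriptProcessorNode` feeding `Float32Array` buffers from `getUserMedia`, and a naive decimating `downsampleBuffer` helper that is an assumption, not SDK code), might look like:

```javascript
// Pure helper: pick roughly every Nth sample to go from inRate to outRate.
// Naive decimation without a low-pass filter, so high frequencies alias;
// tolerable for speech sent to a 16 kHz recognizer, not for music.
function downsampleBuffer(buffer, inRate, outRate) {
  if (outRate >= inRate) return buffer;
  const ratio = inRate / outRate;
  const outLength = Math.floor(buffer.length / ratio);
  const result = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    result[i] = buffer[Math.floor(i * ratio)];
  }
  return result;
}

// Browser glue (illustrative; only runs in a browser):
// const ctx = new AudioContext();
// const source = ctx.createMediaStreamSource(stream); // stream from getUserMedia
// const processor = ctx.createScriptProcessor(4096, 1, 1);
// processor.onaudioprocess = (e) => {
//   const input = e.inputBuffer.getChannelData(0);
//   const down = downsampleBuffer(input, ctx.sampleRate, 16000);
//   websocket.send(down.buffer); // still 32-bit floats; convert to l16 if required
// };
// source.connect(processor);
// processor.connect(ctx.destination);
```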
QUESTION
It seems that the WebSocket API in Safari 10.1 has a maximum amount of binary data that it can buffer and then the next message sent gets the error "WebSocket connection to ... failed: Failed to send WebSocket frame."
Safari then closes the connection with code 1006 (CLOSE_ABNORMAL).
WebSockets are supposed to report the bufferedAmount, but Safari always reports 0 until after the error occurs and the connection is closed.
I tried just adding a 100ms setTimeout between each message, and that seems to work for small chunks of data, but it feels brittle, and large chunks still get errors when I send my closing JSON message, even with a longer delay.
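The throttling workaround described above can be sketched as follows (the helper names are illustrative, not from the SDK): split large buffers into small slices and send one slice per timer tick so no single frame exceeds whatever Safari can buffer.

```javascript
// Pure helper: split an ArrayBuffer into slices of at most chunkSize bytes.
function chunkBuffer(buffer, chunkSize) {
  const chunks = [];
  for (let offset = 0; offset < buffer.byteLength; offset += chunkSize) {
    chunks.push(buffer.slice(offset, offset + chunkSize));
  }
  return chunks;
}

// Browser glue: send one chunk every delayMs milliseconds, then the
// closing JSON message once the queue is drained.
function sendPaced(ws, buffer, chunkSize, delayMs) {
  const queue = chunkBuffer(buffer, chunkSize);
  (function sendNext() {
    if (queue.length === 0) {
      ws.send(JSON.stringify({ action: 'stop' }));
      return;
    }
    ws.send(queue.shift());
    setTimeout(sendNext, delayMs);
  })();
}
```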
You can see the bug in action here - the "Play Sample" buttons work in Safari 10.0.3 but error in 10.1. (Code that handles the WebSocket connection.)
Any ideas on how to work around this? Or what the limit even is? I know that Safari is Open Source, but I'm not sure where to look.
Update: here's a simpler example:
...ANSWER
Answered 2017-Aug-19 at 03:36
I tried your real-world link on Safari 10.1.2 and did not see the problem. It seems it has been fixed.
QUESTION
I'm seeing the word transcriptions, either in the browser or in the console, but I'm not seeing messages such as {'state': 'listening'}. More importantly, I'm not seeing results such as {"results": [{"alternatives": [{"transcript": "name the mayflower "}],"final": true}],"result_index": 0}.
I read the RecognizeStream documentation and tried this code:
...ANSWER
Answered 2017-Aug-09 at 22:56
The recognizeMicrophone() method is a helper that chains together a number of streams. The message event is fired on one of the streams in the middle, but you can get access to that one at stream.recognizeStream - it's always attached to the last stream in the chain in order to support cases like this.
So, in your code, it should look something like this:
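Based on the SDK's documented API at the time (the exact option names may differ between SDK versions, so treat this as a sketch rather than the answerer's verbatim code):

```javascript
// Start microphone transcription; objectMode makes the stream emit
// parsed result objects instead of plain text.
const stream = WatsonSpeech.SpeechToText.recognizeMicrophone({
  token: token,      // auth token fetched from your server
  objectMode: true
});

// The 'message' event is emitted by the underlying RecognizeStream,
// which the SDK exposes as stream.recognizeStream:
stream.recognizeStream.on('message', function (frame, data) {
  // Logs every raw server message, including {"state": "listening"}
  // and the full results objects with alternatives.
  console.log('server message', data);
});
```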
QUESTION
I'm working through the tutorial for IBM Watson Speech-to-Text, using WebSocket for real time transcription. I'm using Angular.
The first 25 lines of code are copied from the API reference. This code successfully connects and initiates a recognition request. Watson sends me a message { "state": "listening" }.
I wrote a function onClose() that logs when the connection closes.
I made a button that runs the handler $scope.startSpeechRecognition. This uses getUserMedia() to stream audio from the microphone and websocket.send() to stream the data to Watson. This isn't working. Clicking this button closes the connection. I presume that I'm sending the wrong type of data and Watson is closing the connection?
I moved websocket.send(blob); from onOpen to my handler $scope.startSpeechRecognition. I changed websocket.send(blob); to websocket.send(mediaStream);. I might have this wrong: 'content-type': 'audio/l16;rate=22050'. How do I know what bit rate comes from the microphone?
Is there a tutorial for JavaScript? When I google "IBM Watson Speech-to-Text JavaScript tutorial" at the top is an 8000-line SDK. Is the SDK required or can I write a simple program to learn how the service works?
Here's my controller:
...ANSWER
Answered 2017-Aug-08 at 18:46
The SDK is not required, but as German Attanasio said, it does make your life much easier.
Onto your code, though, this line definitely won't work:
websocket.send(mediaStream);
The mediaStream object from getUserMedia() cannot be sent directly over the WebSocket - WebSockets accept only text and binary data (the blob in the original example). You have to extract the audio and then send only that.
But even that isn't sufficient in this case, because the Web Audio API provides the audio as 32-bit floats, which is not a format the Watson API natively understands. The SDK automatically extracts the audio and converts it to audio/l16;rate=16000 (16-bit ints).
How do I know what bit rate comes from the microphone?
It's available on the AudioContext and, if you add a ScriptProcessorNode, it can be passed AudioBuffers that include the audio data and the sample rate. Multiply the sample rate by the size of each sample (32 bits before conversion to l16, 16 bits after) and by the number of channels (usually 1) to get the bit rate.
BUT note that the number you put into the content-type after rate= is the sample rate, not the bit rate. So you could just copy it from the AudioContext or AudioBuffer without any multiplication. (Unless you down-sample the audio, as the SDK does; then it should be set to the target sample rate, not the input rate.)
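As a worked example of the arithmetic above (bitRate() is an illustrative helper, not part of the SDK):

```javascript
// Bit rate = sample rate × bits per sample × channel count.
function bitRate(sampleRate, bitsPerSample, channels) {
  return sampleRate * bitsPerSample * channels;
}

// A typical 44100 Hz mono AudioContext with 32-bit float samples:
//   bitRate(44100, 32, 1) === 1411200  bits/s before conversion
//   bitRate(16000, 16, 1) === 256000   bits/s after down-sampling to l16
// The content-type, however, carries only the sample rate:
//   'audio/l16;rate=16000'
```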
If you want to see how all of this works, the entire SDK is open source:
- Extracting audio from mediaStream: https://github.com/saebekassebil/microphone-stream/blob/master/microphone-stream.js
- Converting & down-sampling: https://github.com/watson-developer-cloud/speech-javascript-sdk/blob/master/speech-to-text/webaudio-l16-stream.js
- Managing the WebSocket: https://github.com/watson-developer-cloud/speech-javascript-sdk/blob/master/speech-to-text/recognize-stream.js
Familiarity with the Node.js Streams standard is helpful when reading these files.
FWIW, if you're using a bundling system like Browserify or Webpack, you can pick and choose only the parts of the SDK you need and get a much smaller file size. You can also set it up to download after the page loads and renders since the SDK won't be part of your initial render.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install speech-javascript-sdk
This library can be bundled with browserify or webpack and easily included in larger projects:
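A minimal sketch of a partial require, assuming the package is published on npm as watson-speech and that the module path below follows the repository layout (both may change between versions; check the repo's README):

```javascript
// After `npm install watson-speech`, require only the helper you need so
// browserify or webpack bundles just that part of the SDK.
const recognizeMicrophone = require('watson-speech/speech-to-text/recognize-microphone');

const stream = recognizeMicrophone({ token: token }); // token from your server
stream.on('data', function (data) {
  console.log(data); // transcription results
});
```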