cognitive-services-speech-sdk | Sample code for the Microsoft Cognitive Services Speech SDK | SDK library
kandi X-RAY | cognitive-services-speech-sdk Summary
This project hosts the samples for the Microsoft Cognitive Services Speech SDK. To find out more about the Microsoft Cognitive Services Speech SDK itself, please visit the SDK documentation site.
Community Discussions
Trending Discussions on cognitive-services-speech-sdk
QUESTION
I'm trying to use Azure Cognitive Services Speech to Text and I am hitting a roadblock in .NET Core.
WAV files are supported natively via AudioConfig.FromWavFileInput(), which is great. However, I also need to support MP3s.
I have found the documentation on compressed audio support: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-codec-compressed-audio-input-streams?tabs=debian&pivots=programming-language-csharp
However, this references push audio streams, and this is where I'm getting lost.
I have found this example for streaming codec-compressed audio: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/cpp/linux/compressed-audio-input/compressed-audio-input.cpp
However, it is C++ rather than C# .NET Core, and conversion is not really my strong suit. So I'm at a bit of a loss; any assistance would be greatly appreciated.
ANSWER
Answered 2021-Jun-30 at 14:13
This sample, https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/speech_recognition_samples.cs, has methods specific to compressed audio input, covering both push and pull streams. The pull stream sample seems pretty straightforward: just plug in your key, region, and file path.
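For reference, here is a minimal C# sketch of the pull-stream approach. It assumes the Microsoft.CognitiveServices.Speech NuGet package; the key, region, and MP3 path are placeholders, and on Linux the compressed-audio path additionally requires GStreamer to be installed.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Feeds raw MP3 bytes from a file into the SDK via a pull stream callback.
class BinaryFileReader : PullAudioInputStreamCallback
{
    private readonly Stream _stream;
    public BinaryFileReader(string fileName) => _stream = File.OpenRead(fileName);

    public override int Read(byte[] dataBuffer, uint size)
        => _stream.Read(dataBuffer, 0, (int)size);

    protected override void Dispose(bool disposing)
    {
        _stream.Dispose();
        base.Dispose(disposing);
    }
}

class Program
{
    static async Task Main()
    {
        // Placeholders: substitute your own subscription key and region.
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourRegion");

        // Tell the SDK the incoming stream is MP3 rather than raw PCM.
        var format = AudioStreamFormat.GetCompressedFormat(AudioStreamContainerFormat.MP3);
        using var audioStream = AudioInputStream.CreatePullStream(new BinaryFileReader("audio.mp3"), format);
        using var audioConfig = AudioConfig.FromStreamInput(audioStream);
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine(result.Text);
    }
}
```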
QUESTION
I want to use the Azure Speech service for speech recognition from the microphone. I have a program running smoothly in Python with recognize_once_async(), though this recognizes only the first utterance, with a 15-second audio limit. I did some research on this topic and went over the sample code from MS (https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/python/console/speech_sample.py) but couldn't find anything that enables continuous speech recognition from the microphone. Any tips?
ANSWER
Answered 2021-May-25 at 16:19
You could try the code below:
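A minimal sketch of continuous recognition from the default microphone, following the pattern in the SDK's Python samples; the key and region are placeholders.

```python
import time
import azure.cognitiveservices.speech as speechsdk

# Placeholders: substitute your own subscription key and region.
speech_config = speechsdk.SpeechConfig(subscription="YourSubscriptionKey", region="YourRegion")

# With no AudioConfig given, the recognizer listens on the default microphone.
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

done = False

def stop_cb(evt):
    """Signal that continuous recognition should stop."""
    global done
    done = True

# Print each final result; intermediate results arrive on `recognizing`.
recognizer.recognized.connect(lambda evt: print('RECOGNIZED:', evt.result.text))
recognizer.session_stopped.connect(stop_cb)
recognizer.canceled.connect(stop_cb)

recognizer.start_continuous_recognition()
while not done:
    time.sleep(0.5)
recognizer.stop_continuous_recognition()
```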
QUESTION
I am using azure-speech to recognize an audio stream, following speech_recognition_samples.cpp. From the RecognitionResult class I can only get the text and m_duration; how can I get the begin time and end time of the result within the speech? I know that e.Result->Offset() returns the offset, but I am still confused about it. My code is:
ANSWER
Answered 2021-Feb-02 at 09:26
If you look closely at your output, there are two events:
Recognizing and Recognized
Recognizing: this event signals that an intermediate recognition result has been received.
Recognized: this event signals that a final recognition result has been received.
So the offset you see is for the complete sentence (the Recognized event, usually ending before the first pause): "My voice is my passport, verify me." For all of that sentence's Recognizing (intermediate) events, the offset will be the same. If the audio contained another sentence, you would see an additional Recognized event with a sequentially larger offset, growing as you expect.
Update:
Additional note: the duration restarts from zero for every recognized event and grows until it spans the complete recognized utterance.
So, for instance:
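To make this concrete, here is a minimal C++ sketch, with placeholder key and region, that derives begin and end times from Offset() and Duration(); both are reported in ticks of 100 nanoseconds.

```cpp
#include <iostream>
#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;

int main()
{
    // Placeholders: substitute your own subscription key and region.
    auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourRegion");
    auto recognizer = SpeechRecognizer::FromConfig(config);

    recognizer->Recognized.Connect([](const SpeechRecognitionEventArgs& e)
    {
        if (e.Result->Reason == ResultReason::RecognizedSpeech)
        {
            // Offset() and Duration() are in ticks of 100 nanoseconds.
            auto beginTicks = e.Result->Offset();
            auto endTicks = e.Result->Offset() + e.Result->Duration();
            std::cout << e.Result->Text
                      << " [" << beginTicks / 10000000.0 << "s .. "
                      << endTicks / 10000000.0 << "s]" << std::endl;
        }
    });

    recognizer->StartContinuousRecognitionAsync().get();
    std::cout << "Press Enter to stop." << std::endl;
    std::cin.get();
    recognizer->StopContinuousRecognitionAsync().get();
}
```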
QUESTION
I am using the JavaScript version of the Microsoft Cognitive Services Speech SDK from https://github.com/Azure-Samples/cognitive-services-speech-sdk.
The audio is played by the browser when synthesizer.speakTextAsync is called. When the audio is too long, I want to stop playback, but I couldn't find any documentation on how to do that.
Any help is appreciated!
ANSWER
Answered 2020-Jun-23 at 02:58
Stopping audio playback is supported. You need to create a SpeechSDK.SpeakerAudioDestination() object and use it to create the audioConfig, like this:
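A minimal sketch of that pattern, assuming the browser bundle exposes the SDK as SpeechSDK and that subscriptionKey and region are defined elsewhere:

```javascript
// Placeholders: subscriptionKey and region must be defined elsewhere.
const speechConfig = SpeechSDK.SpeechConfig.fromSubscription(subscriptionKey, region);

// The player object gives you control over browser playback.
const player = new SpeechSDK.SpeakerAudioDestination();
const audioConfig = SpeechSDK.AudioConfig.fromSpeakerOutput(player);

const synthesizer = new SpeechSDK.SpeechSynthesizer(speechConfig, audioConfig);
synthesizer.speakTextAsync(
    "A long passage of text to synthesize...",
    result => { synthesizer.close(); },
    error => { console.log(error); synthesizer.close(); });

// Later, to stop the audio that is currently playing:
player.pause();
// player.resume() continues playback from where it paused.
```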
QUESTION
I am starting to use the Cognitive Services provided by Microsoft. I am particularly interested in Text-to-Speech to synthesize speech into an audio file.
To test it, I created the 30-day free trial subscription and pasted the subscription key into the Quickstart project that Microsoft provides. Nevertheless, when I run the simple code I get the following error:
Speech synthesis canceled, Internal server error. websocket error code: 1011 Did you update the subscription info?
I would like to know why my subscription info, obtained yesterday and still 29 days from expiring, is not able to connect to the server.
I have tried to contact Microsoft Support, but they sent me back the same link to the same Quickstart repository.
ANSWER
Answered 2020-Apr-23 at 07:08
I just tried with a trial subscription using this sample and it works fine. You can try with my API key.
QUESTION
I am using the quickstart template for multi-device conversations, and it seems like the participant change event handler (participantsChanged) does not get fired when a participant disconnects. I would expect to get a LeftConversation event for a participant who closes their browser window or loses their internet connection, but it seems the event is only fired when a participant chooses to disconnect.
ANSWER
Answered 2020-Mar-17 at 17:22
The SpeechSDK.ParticipantChangedReason.LeftConversation event will be fired immediately if the participant leaves the conversation cleanly by clicking the 'Leave conversation' button.
If the participant leaves the conversation by another means, such as closing the browser window or clicking the browser's back button, a 'DisconnectSession' message is immediately triggered in the underlying websocket. This is elevated to a SpeechSDK.ParticipantChangedReason.LeftConversation event within 6 minutes; the websocket 'DisconnectSession' message is not currently exposed as an SDK event in the JavaScript SDK.
As a workaround, one possibility is to update the Quickstart code to add a listener for the browser 'beforeunload' or 'unload' event, which will call the leave-conversation function on the participant's behalf:
https://developer.mozilla.org/en-US/docs/Web/API/Window/beforeunload_event
https://developer.mozilla.org/en-US/docs/Web/API/Window/unload_event
sample code:
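A minimal sketch of that workaround, assuming conversationTranslator is the object the quickstart creates when joining the conversation (adjust the name and leave call to match your quickstart version):

```javascript
// Assumption: `conversationTranslator` is the instance the quickstart
// creates when joining the conversation; adjust the name to match.
window.addEventListener('beforeunload', () => {
    // Leave cleanly so other participants receive LeftConversation
    // immediately rather than after the up-to-6-minute disconnect timeout.
    conversationTranslator.leaveConversationAsync(
        () => { /* left successfully */ },
        error => { console.log(error); });
});
```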
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install cognitive-services-speech-sdk
Note: the samples make use of the Microsoft Cognitive Services Speech SDK. By downloading the Microsoft Cognitive Services Speech SDK, you acknowledge its license; see the Speech SDK license agreement. Please see the description of each individual sample for instructions on how to build and run it.