speechbrain | A PyTorch-based Speech Toolkit | Machine Learning library

kandi X-RAY | speechbrain Summary

speechbrain is a Python library typically used in Artificial Intelligence, Machine Learning, Deep Learning, and PyTorch applications. speechbrain has no bugs and no vulnerabilities, has a build file available, has a Permissive License, and has medium support. You can download it from GitHub.
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.

Support

  • speechbrain has a medium active ecosystem.
  • It has 3933 stars and 754 forks. There are 109 watchers for this library.
  • There was 1 major release in the last 6 months.
  • There are 135 open issues and 513 have been closed. On average, issues are closed in 102 days. There are 45 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of speechbrain is v0.5.11.

Quality

  • speechbrain has 0 bugs and 0 code smells.

Security

  • speechbrain has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
  • speechbrain code analysis shows 0 unresolved vulnerabilities.
  • There are 0 security hotspots that need review.

License

  • speechbrain is licensed under the Apache-2.0 License. This license is Permissive.
  • Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

  • speechbrain releases are available to install and integrate.
  • Build file is available. You can build the component from source.
  • Installation instructions, examples and code snippets are available.
  • It has 56498 lines of code, 2479 functions and 336 files.
  • It has high code complexity. Code complexity directly impacts maintainability of the code.
Top functions reviewed by kandi - BETA

kandi has reviewed speechbrain and identified the functions below as its top functions. This is intended to give you instant insight into the functionality speechbrain implements and to help you decide whether it suits your requirements.

  • Return the split for the given split option.
  • Creates the metadata for a given dataset.
  • Create a mixture of audio files.
  • Prepare TAS data.
  • Prepares the Fisher-Callhome corpus.
  • Prepare the Syshell1mix directory.
  • Perform a forward step.
  • Transducer beam search.
  • Diarize a dataset.
  • Prepare GSC.

speechbrain Key Features

Various pretrained models are integrated with HuggingFace in our official organization account. These models come with an interface to easily run inference, facilitating integration. If a HuggingFace model isn't available, we usually provide at least a Google Drive folder containing all the corresponding experimental results.

The Brain class, a fully-customizable tool for managing training and evaluation loops over data. The annoying details of training loops are handled for you while retaining complete flexibility to override any part of the process when needed.
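To illustrate the override pattern described above, here is a minimal pure-Python sketch. `MiniBrain` and `AbsErrorBrain` are hypothetical toy classes, not SpeechBrain's `Brain`. Only the hook names `compute_forward` and `compute_objectives` are borrowed from it; the real hooks take additional arguments and operate on PyTorch tensors, and the finite-difference update here merely stands in for autograd.

```python
class MiniBrain:
    """Toy training loop with overridable hooks (illustration only)."""

    def __init__(self, weight=0.0, lr=0.05):
        self.weight = weight  # the single trainable parameter
        self.lr = lr

    def compute_forward(self, batch):
        # Hook 1: map inputs to predictions.
        x, _ = batch
        return self.weight * x

    def compute_objectives(self, prediction, batch):
        # Hook 2: turn predictions into a scalar loss.
        _, target = batch
        return (prediction - target) ** 2

    def _loss(self, batch):
        return self.compute_objectives(self.compute_forward(batch), batch)

    def fit_batch(self, batch, eps=1e-6):
        loss = self._loss(batch)
        # A central finite difference stands in for autograd here.
        original = self.weight
        self.weight = original + eps
        loss_up = self._loss(batch)
        self.weight = original - eps
        loss_down = self._loss(batch)
        grad = (loss_up - loss_down) / (2 * eps)
        self.weight = original - self.lr * grad
        return loss

    def fit(self, epochs, data):
        # The loop bookkeeping lives here, so subclasses never rewrite it.
        history = []
        for _ in range(epochs):
            losses = [self.fit_batch(batch) for batch in data]
            history.append(sum(losses) / len(losses))
        return history


class AbsErrorBrain(MiniBrain):
    """Overrides one hook; the training loop is inherited unchanged."""

    def compute_objectives(self, prediction, batch):
        _, target = batch
        return abs(prediction - target)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
brain = MiniBrain()
history = brain.fit(epochs=20, data=data)
print(round(brain.weight, 3))
```

Swapping the loss only requires subclassing and overriding `compute_objectives`, as `AbsErrorBrain` does; `fit` and `fit_batch` are reused as they are.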

A YAML-based hyperparameter specification language that describes all types of hyperparameters, from individual numbers (e.g. learning rate) to complete objects (e.g. custom models). This dramatically simplifies recipe code by distilling basic algorithmic components.

Multi-GPU training and inference with PyTorch Data-Parallel or Distributed Data-Parallel.

Mixed-precision for faster training.

A transparent and entirely customizable data input and output pipeline. SpeechBrain follows the PyTorch data loader and dataset style and enables users to customize the I/O pipelines (e.g., adding on-the-fly downsampling, BPE tokenization, sorting, thresholding, ...).

An integration of sharded data with WebDataset, optimized for very large datasets on Network File Systems (NFS).

speechbrain Examples and Code Snippets

  • Install via PyPI
  • Install with GitHub
  • Test Installation
  • Running an experiment
  • Citing SpeechBrain

Install via PyPI

pip install speechbrain
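
Once the install finishes, one quick way to confirm that the package is visible to the current interpreter is a standard-library lookup. The helper name `is_importable` is hypothetical, and this is only a sanity check, not the project's own test script.

```python
import importlib.util


def is_importable(package_name):
    """Return True if the interpreter can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_importable("speechbrain"):
    print("speechbrain found")
else:
    print("speechbrain not found; try: pip install speechbrain")
```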

Community Discussions

Trending Discussions on Machine Learning
  • Using RNN Trained Model without pytorch installed
  • Flux.jl : Customizing optimizer
  • How can I check a confusion_matrix after fine-tuning with custom datasets?
  • CUDA OOM - But the numbers don't add upp?
  • How to compare baseline and GridSearchCV results fair?
  • Getting Error 524 while running jupyter lab in google cloud platform
  • TypeError: brain.NeuralNetwork is not a constructor
  • Ordinal Encoding or One-Hot-Encoding
  • How to increase dimension-vector size of BERT sentence-transformers embedding
  • How to identify what features affect predictions result?

QUESTION

Using RNN Trained Model without pytorch installed

Asked 2022-Feb-28 at 20:17

I have trained an RNN model with pytorch. I need to use the model for prediction in an environment where I'm unable to install pytorch because of some strange dependency issue with glibc. However, I can install numpy and scipy and other libraries. So, I want to use the trained model, with the network definition, without pytorch.

I have the weights of the model as I save the model with its state dict and weights in the standard way, but I can also save it using just json/pickle files or similar.

I also have the network definition, which depends on pytorch in a number of ways. This is my RNN network definition.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import random

torch.manual_seed(1)
random.seed(1)
device = torch.device('cpu')

class RNN(nn.Module):
  def __init__(self, input_size, hidden_size, output_size,num_layers, matching_in_out=False, batch_size=1):
    super(RNN, self).__init__()
    self.input_size = input_size
    self.hidden_size = hidden_size
    self.output_size = output_size
    self.num_layers = num_layers
    self.batch_size = batch_size
    self.matching_in_out = matching_in_out #length of input vector matches the length of output vector 
    self.lstm = nn.LSTM(input_size, hidden_size,num_layers)
    self.hidden2out = nn.Linear(hidden_size, output_size)
    self.hidden = self.init_hidden()
  def forward(self, feature_list):
    feature_list=torch.tensor(feature_list)
    
    if self.matching_in_out:
      lstm_out, _ = self.lstm( feature_list.view(len( feature_list), 1, -1))
      output_space = self.hidden2out(lstm_out.view(len( feature_list), -1))
      output_scores = torch.sigmoid(output_space) #we'll need to check if we need this sigmoid
      return output_scores #output_scores
    else:
      for i in range(len(feature_list)):
        cur_ft_tensor=feature_list[i]#.view([1,1,self.input_size])
        cur_ft_tensor=cur_ft_tensor.view([1,1,self.input_size])
        lstm_out, self.hidden = self.lstm(cur_ft_tensor, self.hidden)
        outs=self.hidden2out(lstm_out)
      return outs
  def init_hidden(self):
    #return torch.rand(self.num_layers, self.batch_size, self.hidden_size)
    return (torch.rand(self.num_layers, self.batch_size, self.hidden_size).to(device),
            torch.rand(self.num_layers, self.batch_size, self.hidden_size).to(device))

I am aware of this question, but I'm willing to go as low level as possible. I can work with numpy array instead of tensors, and reshape instead of view, and I don't need a device setting.

Based on the class definition above, what I can see here is that I only need the following components from torch to get an output from the forward function:

  • nn.LSTM
  • nn.Linear
  • torch.sigmoid

I think I can easily implement the sigmoid function using numpy. However, can I have some implementation for the nn.LSTM and nn.Linear using something not involving pytorch? Also, how will I use the weights from the state dict into the new class?

So, the question is, how can I "translate" this RNN definition into a class that doesn't need pytorch, and how to use the state dict weights for it? Alternatively, is there a "light" version of pytorch, that I can use just to run the model and yield a result?

EDIT

I think it might be useful to include the numpy/scipy equivalent for both nn.LSTM and nn.linear. It would help us compare the numpy output to torch output for the same code, and give us some modular code/functions to use. Specifically, a numpy equivalent for the following would be great:

rnn = nn.LSTM(10, 20, 2)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))

and also for linear:

m = nn.Linear(20, 30)
input = torch.randn(128, 20)
output = m(input)

ANSWER

Answered 2022-Feb-17 at 10:47

You should try to export the model using torch.onnx. The page gives you an example that you can start with.

An alternative is to use TorchScript, but that requires torch libraries.

Both of these can be run without Python. You can load TorchScript in a C++ application: https://pytorch.org/tutorials/advanced/cpp_export.html

ONNX is much more portable, and you can use it from languages such as C#, Java, or JavaScript (even in the browser): https://onnxruntime.ai/

A running example

I modified your example slightly to get past the errors I found.

Note that with tracing, any if/elif/else is fixed to the branch taken for the example input, and any for or while loop is unrolled.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import random

torch.manual_seed(1)
random.seed(1)
device = torch.device('cpu')

class RNN(nn.Module):
  def __init__(self, input_size, hidden_size, output_size,num_layers, matching_in_out=False, batch_size=1):
    super(RNN, self).__init__()
    self.input_size = input_size
    self.hidden_size = hidden_size
    self.output_size = output_size
    self.num_layers = num_layers
    self.batch_size = batch_size
    self.matching_in_out = matching_in_out #length of input vector matches the length of output vector 
    self.lstm = nn.LSTM(input_size, hidden_size,num_layers)
    self.hidden2out = nn.Linear(hidden_size, output_size)
  def forward(self, x, h0, c0):
    lstm_out, (hidden_a, hidden_b) = self.lstm(x, (h0, c0))
    outs=self.hidden2out(lstm_out)
    return outs, (hidden_a, hidden_b)
  def init_hidden(self):
    #return torch.rand(self.num_layers, self.batch_size, self.hidden_size)
    return (torch.rand(self.num_layers, self.batch_size, self.hidden_size).to(device).detach(),
            torch.rand(self.num_layers, self.batch_size, self.hidden_size).to(device).detach())

# convert the arguments passed during onnx.export call
class MWrapper(nn.Module):
    def __init__(self, model):
        super(MWrapper, self).__init__()
        self.model = model
    def forward(self, kwargs):
        return self.model(**kwargs)

Run an example

rnn = RNN(10, 10, 10, 3)
X = torch.randn(3,1,10)
h0,c0  = rnn.init_hidden()
print(rnn(X, h0, c0)[0])

Use the same input to trace the model and export an onnx file


torch.onnx.export(MWrapper(rnn), {'x':X,'h0':h0,'c0':c0}, 'rnn.onnx', 
                  dynamic_axes={'x':{1:'N'},
                               'c0':{1: 'N'},
                               'h0':{1: 'N'}
                               },
                  input_names=['x', 'h0', 'c0'],
                  output_names=['y', 'hn', 'cn']
                 )

Notice that you can use symbolic values for the dimensions of some axes of some inputs. Unspecified dimensions will be fixed with the values from the traced inputs. By default LSTM uses dimension 1 as batch.

Next we load the ONNX model and pass the same inputs

import onnxruntime
ort_model = onnxruntime.InferenceSession('rnn.onnx')
print(ort_model.run(['y'], {'x':X.numpy(), 'c0':c0.numpy(), 'h0':h0.numpy()}))
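
Since the question also asks about a NumPy-only route, the following is a minimal sketch of the forward pass, assuming a unidirectional `nn.LSTM` with the default `batch_first=False`, no dropout, and no projection. It relies on PyTorch's documented parameter layout: `weight_ih_l{k}` and `weight_hh_l{k}` stack the input, forget, cell, and output gates in that order, and `nn.Linear` stores its weight as `(out_features, in_features)`. The function names and the `params` list are hypothetical. In practice the arrays would come from the saved state dict (for the class above, keys such as `lstm.weight_ih_l0` and `hidden2out.weight`), converted to NumPy in an environment where PyTorch is available; random arrays are used here only to show the shapes.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def linear(x, weight, bias):
    # nn.Linear computes x @ W^T + b, with W of shape (out_features, in_features).
    return x @ weight.T + bias


def lstm_layer(x, h, c, w_ih, w_hh, b_ih, b_hh):
    # x: (seq_len, batch, input_size); h and c: (batch, hidden_size).
    hidden = h.shape[-1]
    outputs = []
    for x_t in x:
        gates = x_t @ w_ih.T + b_ih + h @ w_hh.T + b_hh
        # Gate order in the stacked weights: input, forget, cell, output.
        i = sigmoid(gates[:, 0 * hidden:1 * hidden])
        f = sigmoid(gates[:, 1 * hidden:2 * hidden])
        g = np.tanh(gates[:, 2 * hidden:3 * hidden])
        o = sigmoid(gates[:, 3 * hidden:4 * hidden])
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs), h, c


def lstm(x, h0, c0, params):
    # params: one (w_ih, w_hh, b_ih, b_hh) tuple per layer, first layer first.
    hn, cn = [], []
    for layer, (w_ih, w_hh, b_ih, b_hh) in enumerate(params):
        x, h, c = lstm_layer(x, h0[layer], c0[layer], w_ih, w_hh, b_ih, b_hh)
        hn.append(h)
        cn.append(c)
    return x, np.stack(hn), np.stack(cn)


# Shapes mirror nn.LSTM(10, 20, 2) from the question; values are random placeholders.
rng = np.random.default_rng(0)
params = [
    (rng.standard_normal((80, 10)), rng.standard_normal((80, 20)),
     rng.standard_normal(80), rng.standard_normal(80)),
    (rng.standard_normal((80, 20)), rng.standard_normal((80, 20)),
     rng.standard_normal(80), rng.standard_normal(80)),
]
x = rng.standard_normal((5, 3, 10))
h0 = rng.standard_normal((2, 3, 20))
c0 = rng.standard_normal((2, 3, 20))
output, hn, cn = lstm(x, h0, c0, params)
print(output.shape, hn.shape, cn.shape)
```

Before relying on such a port, it is worth comparing its output against the PyTorch model on a few inputs in an environment where both are available.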

Source https://stackoverflow.com/questions/71146140

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install speechbrain

SpeechBrain is constantly evolving. New features, tutorials, and documentation will appear over time. SpeechBrain can be installed via PyPI to rapidly use the standard library. Moreover, a local installation can be used by those users who want to run experiments and modify/customize the toolkit. SpeechBrain supports both CPU and GPU computations. For most recipes, however, a GPU is necessary during training. Please note that CUDA must be properly installed to use GPUs.
Once you have created your Python environment (Python 3.8+), you can install the package with pip install speechbrain.
Please run the project's test script to make sure your installation is working.

Support

SpeechBrain is designed to speed up research and development of speech technologies. Hence, our code is backed up with three different levels of documentation.
