vadnet | Real-time Voice Activity Detection in Noisy Environments | Machine Learning library

 by hcmlab | Python | Version: Current | License: LGPL-3.0

kandi X-RAY | vadnet Summary

vadnet is a Python library typically used in Artificial Intelligence, Machine Learning, Deep Learning, PyTorch, TensorFlow, and Keras applications. vadnet has no reported bugs or vulnerabilities, a Weak Copyleft license, and low support. However, no build file is available for vadnet. You can download it from GitHub.

VadNet is a real-time voice activity detector for noisy environments. It implements an end-to-end learning approach based on Deep Neural Networks. In the extended version, gender and laughter detection are added.

            Support

              vadnet has a low active ecosystem.
              It has 359 star(s) with 72 fork(s). There are 18 watchers for this library.
              It had no major release in the last 6 months.
              There are 20 open issues and 11 have been closed. On average issues are closed in 70 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of vadnet is current.

            Quality

              vadnet has 0 bugs and 0 code smells.

            Security

              vadnet has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              vadnet code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              vadnet is licensed under the LGPL-3.0 License. This license is Weak Copyleft.
              Weak Copyleft licenses have some restrictions, but you can use them in commercial projects.

            Reuse

              vadnet releases are not available. You will need to build from source code and install.
              vadnet has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions, examples and code snippets are available.
              vadnet saves you 765 person hours of effort in developing the same functionality from scratch.
              It has 1762 lines of code, 144 functions and 35 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed vadnet and discovered the below as its top functions. This is intended to give you an instant insight into vadnet implemented functionality, and help decide if they suit your requirements.
            • Sample from youtube
            • Convert a timestamp to a number of milliseconds
            • Write a voice activity file
            • Sample from an audio file
            • Download files
            • Train the model
            • Convert audio data into frames
            • Load audio from file
            • Generate annotation labels from a csv file
            • Return a list of all URLs in the given path
            • Parse a list of tables
            • Plot filter weights
            • Download filenames
            • Print checkpoint
            • Get all files in root directory
            • Get a weight from a checkpoint
            • Create an instance from a module
            • Update a variable from a checkpoint
            • Parse a list of table entries
            • Get audio
            • Performs dynamic rnn layer
            • Sample the next URL
            • Write a transcription to an annotation file
            • Initialize the audio set
            • Predict from a checkpoint
            • Argument parser
            • Generate a sample
            • Sample from file

            Community Discussions

            QUESTION

            Why does my convolutional model not learn?
            Asked 2021-Jun-02 at 12:50

            I am currently working on building a CNN for sound classification. The problem is relatively simple: I need my model to detect whether there is human speech on an audio record. I made a train / test set containing records of 3 seconds on which there is human speech (speech) or not (no_speech). From these 3 seconds fragments I get a mel-spectrogram of dimension 128 x 128 that is used to feed the model.

            Since it is a simple binary problem I thought a CNN would easily detect human speech but I may have been too cocky. However, it seems that after 1 or 2 epochs the model doesn’t learn anymore, i.e. the loss doesn’t decrease, as if the weights do not update, and the number of correct predictions stays roughly the same. I tried to play with the hyperparameters but the problem is still the same. I tried learning rates from 0.1 and 0.01 down to 1e-7. I also tried to use a more complex model but the same thing occurs.

            Then I thought it could be due to the script itself but I cannot find anything wrong: the loss is computed, the gradients are then computed with backward() and the weights should be updated. I would be glad if you could have a quick look at the script and let me know what could be going wrong! If you have other ideas about why this problem may occur I would also be glad to receive some advice on how best to train my CNN.

            I based the script on the LunaTrainingApp from “Deep Learning with PyTorch” by Stevens as I found the script to be elegant. Of course I modified it to match my problem: I added a way to compute the precision and recall and some other custom metrics such as the % of correct predictions.

            Here is the script:

            ...

            ANSWER

            Answered 2021-Jun-02 at 12:50
            You are applying 2D 3x3 convolutions to spectrograms.

            Read it once more and let it sink in.
            Do you understand now what the problem is?

            A convolution layer learns static/fixed local patterns and tries to match them everywhere in the input. This is very cool and handy for images where you want to be equivariant to translation and where all pixels have the same "meaning".
            However, in spectrograms, different locations have different meanings: pixels in the upper part of the spectrogram represent high frequencies while those in the lower part represent low frequencies. Therefore, a local pattern matched to one region of the spectrogram may mean a completely different thing depending on whether it is matched in the upper or the lower part. You need a different kind of model to process spectrograms. Maybe convert the spectrogram to a 1D signal with 128 channels (frequencies) and apply 1D convolutions to it?
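The suggestion at the end of the answer can be sketched roughly as follows. This is a minimal illustration, not the answerer's or the asker's actual code: the class name `SpectrogramConv1d`, the layer sizes, and the kernel widths are all assumptions chosen for the example. It treats the 128 mel bands of a 128 x 128 spectrogram as input channels and convolves along the time axis only, so each filter sees every frequency band at once instead of sliding across frequency.

```python
import torch
import torch.nn as nn


class SpectrogramConv1d(nn.Module):
    """Hypothetical model: mel bands are channels, convolution runs over time."""

    def __init__(self, n_mels: int = 128, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_mels, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # average over the remaining time steps
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec arrives as (batch, 1, n_mels, time), the layout used for a 2D CNN
        x = spec.squeeze(1)                    # -> (batch, n_mels, time)
        x = self.features(x)                   # -> (batch, 64, 1)
        return self.classifier(x.squeeze(-1))  # -> (batch, n_classes)


model = SpectrogramConv1d()
batch = torch.randn(4, 1, 128, 128)  # four random stand-in spectrograms
logits = model(batch)                # one (speech, no_speech) score pair per clip
```

Whether this trains better than the 2D version on the asker's data is not something the source establishes; the sketch only shows the reshaping the answer describes.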

            Source https://stackoverflow.com/questions/67804707

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install vadnet

            do_bin.cmd - Installs embedded Python and downloads the SSI interpreter. During installation the script tries to detect whether a GPU is available and, if so, installs the GPU version of TensorFlow. This requires that an NVIDIA graphics card is detected and that CUDA has been installed. Nevertheless, VadNet runs fine on a CPU.

            Support

            VadNet is implemented using the Social Signal Interpretation (SSI) framework. The processing pipeline is defined in vad[ex].pipeline and can be configured by editing vad[ex].pipeline-config. If the option send:do is turned on, an XML string with the detection results is streamed to a socket (see send:url). You can change the format of the XML string by editing vad.xml. To run SSI in the background, click on the tray icon and select 'Hide windows'. For more information about SSI pipelines, please consult the SSI documentation.
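A client that consumes the streamed XML could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the text above does not say which transport send:url uses (UDP is assumed here), the host and port are placeholders, and the sample message is hypothetical because the real layout is whatever vad.xml defines. The helper names `parse_detection` and `listen` are likewise invented for this example.

```python
import socket
import xml.etree.ElementTree as ET


def parse_detection(payload: bytes) -> dict:
    """Flatten one XML message into a dict of element text and attributes."""
    root = ET.fromstring(payload.decode("utf-8"))
    result = {}
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            result[elem.tag] = text
        for key, value in elem.attrib.items():
            result[f"{elem.tag}@{key}"] = value
    return result


def listen(host: str = "127.0.0.1", port: int = 1234) -> None:
    """Print each message received on a UDP socket (assumed transport and port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            payload, _addr = sock.recvfrom(65535)
            print(parse_detection(payload))


# Hypothetical message used only to exercise the parser.
sample = b'<vad><voice confidence="0.93">1</voice></vad>'
parsed = parse_detection(sample)
```

Calling `listen()` blocks and prints one dict per message; the host and port would need to match whatever send:url is set to in the pipeline configuration.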
            CLONE
          • HTTPS

            https://github.com/hcmlab/vadnet.git

          • CLI

            gh repo clone hcmlab/vadnet

          • sshUrl

            git@github.com:hcmlab/vadnet.git
