DTW | Dynamic Time Warping | Time Series Database library
kandi X-RAY | DTW Summary
The Dynamic Time Warping (DTW) algorithm can be used to measure the similarity between two time series. Originally designed for use in automatic speech recognition, its objective is to find the optimal global alignment between two time series by exploiting temporal distortions between them.
Support
Quality
Security
License
Reuse
DTW Key Features
DTW Examples and Code Snippets
Community Discussions
Trending Discussions on DTW
QUESTION
I am trying to use the DTW algorithm from the Similarity Measures library. However, I get hit with an error that states a 2-Dimensional Array is required. I am not sure I understand how to properly format the data, and the documentation is leaving me scratching my head.
https://github.com/cjekel/similarity_measures/blob/master/docs/similaritymeasures.html
According to the documentation the function takes two arguments (exp_data and num_data ) for the data set, which makes sense. What doesn't make sense to me is:
exp_data : array_like
Curve from your experimental data. exp_data is of (M, N) shape, where M is the number of data points, and N is the number of dimensions
This is the same for both the exp_data and num_data arguments.
So, for further clarification, let's say I am implementing the fastdtw library. It looks like this:
...ANSWER
Answered 2021-Jun-01 at 17:44: It appears the solution in my case was to include the index in the array. For example, if your data looks like this:
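A minimal sketch of that fix, with invented series values: stacking the index alongside the values turns each 1-D series into the (M, N) shape the documentation describes (the actual similaritymeasures.dtw call is only shown in a comment):

```python
import numpy as np

# Two 1-D series that would trigger the "2-Dimensional Array required" error
exp_values = np.array([1.0, 2.0, 3.0, 2.5, 2.0])
num_values = np.array([1.1, 2.2, 2.9, 2.4, 2.1])

# Stack the index (time axis) next to the values to get the (M, N) shape:
# M data points, N = 2 dimensions (index, value)
exp_data = np.column_stack((np.arange(len(exp_values)), exp_values))
num_data = np.column_stack((np.arange(len(num_values)), num_values))

print(exp_data.shape)  # (5, 2)

# With that shape, the call described in the docs becomes:
#   import similaritymeasures
#   dtw_value, cost_matrix = similaritymeasures.dtw(exp_data, num_data)
```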
QUESTION
I am doing a CNN project to estimate pitch from a spectrogram. The project is already finished and waiting to be presented to my institution, but I would like to improve a little detail to my work.
The CNN that I have built must be tested using test (validation) data. I store the data in Google Drive (I built my CNN using Google Colab), and before testing with the do_test method, I have to load the data. Paths are given in the code snippet below.
I am able to load the data. The loaded data is then tested, but my problem is that I do not know which file I have loaded. The testing result is exported to a Python DataFrame; an example output is attached here. I want to put the filename of each loaded file into the DataFrame output, so I will know which output belongs to which file. Right now, I can only see the overall data from one song without knowing which file is which (see DataFrame). What should I add to my code to get the filenames? Should I modify the image_data code in the function get_image_and_label?
Some (I hope) useful items:
- Folder containing image. Folder link is given here
- Folder containing labels can be accessed here
- DataFrame output screenshot
The method do_test is used to execute testing.
ANSWER
Answered 2021-Apr-17 at 11:14: I have found the solution. It turns out that at this loop declaration:
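Since the original snippet is not included, here is a hypothetical sketch of the general idea: keep the filename next to each result inside the loading loop, then attach that list to the output rows. All names here (load_image, do_test, the folder and filenames) are invented placeholders, not the asker's actual code:

```python
import os

def load_image(path):
    # placeholder for the real spectrogram-loading code
    return path  # pretend the "image" is just its path here

def do_test(image):
    # placeholder for the real model inference
    return {"estimated_pitch": 440.0}

data_dir = "specs"  # hypothetical folder of spectrogram images
os.makedirs(data_dir, exist_ok=True)
for name in ("a.png", "b.png"):
    open(os.path.join(data_dir, name), "w").close()

rows = []
for filename in sorted(os.listdir(data_dir)):
    image = load_image(os.path.join(data_dir, filename))
    result = do_test(image)
    result["filename"] = filename  # keep the name with its result
    rows.append(result)

# rows can be handed straight to pandas: pd.DataFrame(rows)
print([r["filename"] for r in rows])  # ['a.png', 'b.png']
```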
QUESTION
So I'm trying to learn how to reduce my lines of code, and I came across one of my "larger" functions that I wanted to look at.
...ANSWER
Answered 2021-Mar-25 at 16:39: A lot of your code can be reduced if you make use of min and max macros:
QUESTION
How to specify a monotonicity constraint (that one time series should not come before the other) when using dynamic time warping?
For example, I have cost and revenue data; one should impact the other but not vice versa. I am using the basic dtw package but I know that there are many others that could be better. Below is my current alignment.
(I would like to save the corresponding revenue point into a separate column, would that be possible?)
...ANSWER
Answered 2021-Mar-25 at 22:30: I think you can enforce this by defining your own window function. For example, take these series:
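As a library-free sketch of that idea (the cost/revenue values below are invented, and the window predicate stands in for the dtw package's window mechanism): DTW fills an accumulated-cost table, and a window function simply marks which (i, j) pairings are allowed, so disallowing cells where revenue precedes cost enforces the monotonicity constraint.

```python
import numpy as np

def dtw_with_window(x, y, window):
    """DTW where window(i, j) -> bool marks which (i, j) pairings are allowed."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if not window(i - 1, j - 1):
                continue  # disallowed cell stays at +inf
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical cost and revenue series
cost = np.array([1.0, 3.0, 2.0, 4.0])
revenue = np.array([0.5, 1.2, 3.1, 2.2])

# Only allow revenue point j to align with cost point i when j <= i,
# i.e. revenue never precedes the cost that drives it
monotone = lambda i, j: j <= i
d = dtw_with_window(cost, revenue, monotone)
print(d)
```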
QUESTION
So I can find a lot of guides on DTW for Python, and they work as they should. But I need the code translated into C, and it's been over a year since I've written C code.
So in C code I have these two arrays
...ANSWER
Answered 2021-Mar-23 at 18:48: A C implementation of dynamic time warping is at https://github.com/wannesm/dtaidistance/tree/master/dtaidistance/lib/DTAIDistanceC/DTAIDistanceC
You can always translate Python to C using Cython (https://people.duke.edu/~ccc14/sta-663/FromPythonToC.html); however, the generated code sometimes does not work, and a complete rewrite is often better.
QUESTION
I found from various online sources that the time complexity for DTW is quadratic. On the other hand, I also found that standard kNN has linear time complexity. However, when pairing them together, does kNN-DTW have quadratic or cubic time?
In essence, does the time complexity of kNN solely depend on the metric used? I have not found any clear answer for this.
...ANSWER
Answered 2021-Mar-18 at 08:59: You need to be careful here. Let's say you have n time series in your 'training' set (let's call it this, even though you are not really training with kNN), each of length l. Computing the DTW between a pair of time series has an asymptotic complexity of O(l * m), where m is your maximum warping window. As m <= l, O(l^2) also holds (although there might be more efficient implementations, I don't think they are actually faster in practice in most cases, see here). Classifying a time series using kNN requires you to compute the distance between that time series and all time series in the training set, which means n comparisons, linear with respect to n.
So your final complexity would be O(l * m * n), or O(l^2 * n). In words: the complexity is quadratic with respect to the time series length and linear with respect to the number of training examples.
QUESTION
So, basically I have tons of data in a word-based dataset. Each piece of data has a different duration.
This is my Approach :
- Label the given dataset
- Split the data using stratified k-fold into training data (80%) and testing data (20%)
- Extract the amplitude, frequency and time using MFCC
- Because the time dimension of each MFCC extraction is different, I wanted to make all of the data exactly the same length using DTW.
- Then I will use the DTW data to train a neural network.
My Question is :
- Is my approach, especially the 4th step, correct?
- If my approach is correct, how can I convert each audio clip to the same length with DTW? Basically, I can only compare two clips' MFCC data, and when I try another pair of clips the resulting length will be different.
ANSWER
Answered 2021-Feb-18 at 08:52: Ad 1) Labelling
I am not sure what you mean by "labelling" the dataset. Nowadays, all you need for ASR is an utterance and the corresponding text (search e.g. for CommonVoice to get some data). This depends on the model you're using, but neural networks do not require any segmentation or additional labeling etc for this task.
Ad 2) KFold cross-validation
Doing cross-validation never hurts. If you have the time and resources to test your model, go ahead and use cross-validation. I, in my case, just make the test set large enough to make sure I get a representative word-error-rate (WER). But that's mostly because training a model k-times is quite an effort as ASR-models usually take some time to train. There are datasets such as Librispeech (and others) which already have a train/test/dev split for you available. If you want, you can compare your results with academic results. It can be hard though if they used a lot of computational power (and data) which you cannot match so bear that in mind when comparing results.
Ad 3) MFCC Features
MFCCs work fine, but from my experience and what I found by reading through the literature, using the log-Mel spectrogram performs slightly better with neural networks. It's not a lot of work to test them both, so you might want to try log-Mel as well.
Ad 4) and 5) DTW for same length
If you use a neural network, e.g. a CTC model or a Transducer, or even a Transformer, you don't need to do that. The audio inputs do not need to have the same lengths. Just one thing to keep in mind: if you train your model, make sure your batches do not contain too much padding. You want to use some bucketing like bucket_by_sequence_length().
Just define a batch-size as "number of spectrogram frames" and then use bucketing in order to really make use of the memory you got available. This can really make a huge difference for the quality of model. I learned that the hard way.
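The idea behind bucketing (which tf.data's bucket_by_sequence_length handles for you, along with batching and padding) can be sketched in plain Python; the boundaries and frame counts below are made up:

```python
# Group sequences into length buckets so that each batch only pads up to
# its bucket boundary instead of the longest sequence in the dataset
def bucket_by_length(sequences, boundaries):
    buckets = {b: [] for b in boundaries}
    for seq in sequences:
        for b in boundaries:
            if len(seq) <= b:
                buckets[b].append(seq)
                break
    return buckets

# hypothetical "spectrograms" with varying numbers of frames
seqs = [[0] * n for n in (12, 90, 35, 60, 8, 70)]
buckets = bucket_by_length(seqs, boundaries=(16, 64, 128))
print({b: [len(s) for s in v] for b, v in buckets.items()})
# {16: [12, 8], 64: [35, 60], 128: [90, 70]}
```

Batches drawn from one bucket waste far less memory on padding, which is the difference the answer describes.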
Note: You did not specify your use-case, so I'll just mention the following: you need to know what you want to do with your model. If the model is supposed to be able to consume an audio stream s.t. a user can talk arbitrarily long, you need to know that and work towards it from the beginning.
Another approach would be: "I only need to transcribe short audio segments." e.g. 10 to 60 seconds or so. In that case you can simply train any Transformer and you'll get pretty good results thanks to its attention mechanism. I recommend to go that road if that's all you need because this is considerably easier. But keep away from this if you need to be able to stream audio content for a much longer time.
Things get a lot more complicated when it comes to streaming. Any purely encoder-decoder attention based model is going to require a lot of effort in order to make this work. You can use RNNs (e.g. RNN-T) but these models can become incredibly huge and slow and will require additional efforts to make them reliable (e.g. language model, beam-search) because they lack the encoder-decoder attention. There are other flavors that combine Transformers with Transducers but if you want to write all this on your own, alone, you're taking on quite a task.
See also: There's already a lot of code out there that you can learn from:
- TensorFlowASR (Tensorflow)
- ESPnet (PyTorch)
hth
QUESTION
In time series analysis, dynamic time warping (DTW) is one of the algorithms for measuring similarity between two temporal sequences, which may vary in speed. FastDTW is a faster method. I would like to know how to implement this method not only between 2 signals but between 3 or more.
...ANSWER
Answered 2021-Feb-17 at 14:21: You essentially need to construct a matrix, evaluating the FastDTW algorithm on all possible combinations of the series.
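A sketch of that all-pairs matrix, using a plain DTW so the example stays self-contained (in practice you would substitute the fastdtw call; the three toy series are invented):

```python
def dtw(a, b):
    # stand-in for fastdtw.fastdtw(a, b); returns only the distance
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = abs(a[i - 1] - b[j - 1]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

series = [[1, 2, 3], [1, 2, 2, 3], [5, 5, 5]]
k = len(series)

# k x k matrix of pairwise DTW distances over all combinations
matrix = [[dtw(series[i], series[j]) for j in range(k)] for i in range(k)]
for row in matrix:
    print(row)
```

The diagonal is zero and the matrix is symmetric, so for k series only the k*(k-1)/2 upper-triangle entries actually need to be computed.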
QUESTION
I have a crossdist object d which I obtained using the package proxy as
ANSWER
Answered 2020-Dec-31 at 13:48: How about this:
QUESTION
I'm trying to create two, vertically aligned, horizontal grouped bar charts. I have a huge amount of data for several Machine Learning models and their corresponding runtimes and would like to display all this data in a meaningful way. My attempt so far looks as follows:
...ANSWER
Answered 2020-Dec-17 at 19:12: Since there are so many bars in one graph, I would use sns.catplot to draw the different categories into a FacetGrid; then it is much easier to add labels, which you can do with the custom function add_labels (please note the different parameters; feel free to remove some or add others. I have adapted this from this solution).
You could also make the x-axis more variable if you pass sharex=False when creating the catplots (see the end of this solution).
Also, sns.catplot doesn't work well with adding to subplots in order to save everything as one figure. This is why I use plt.close(fig) to get rid of the blank figure we created, which also means that adding any formatting (such as a title) to that figure would be pointless, since we are discarding it at the end; however, there are hacks. One is to save as separate figures and use a solution from here: to combine them into one .pdf. I think it would be better to have the extra space of one graph per page or image. Another option is a somewhat hacky way to get everything into one figure:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install DTW
Support