sampen | Partial Python port of PhysioNet SampEn C code
kandi X-RAY | sampen Summary
kandi X-RAY | sampen Summary
Partial Python port of PhysioNet SampEn C code
Support
Quality
Security
License
Reuse
Top functions reviewed by kandi - BETA
- Calculate the amplitude of a time series .
- Normalize data .
sampen Key Features
sampen Examples and Code Snippets
Community Discussions
Trending Discussions on sampen
QUESTION
Is there a term for leveraging the fact that data is comprised of a few much-repeated values to speed computation?
As an example when trying to compute Sample Entropy on a long discrete sequence (Length=64.000.000.000, Distinct elements = 11, Length of substring=3) I was finding the running time too long (over 10 minutes). I realised that I should be able to make use of the relatively few distinct elements to speed up computation but was unable to find any literature relating to doing this (I suspect because I don't know what to Google).
The algorithm for Sample Entropy involves counting the pairs of substrings that are within a certain tolerance. This was the computationally expensive aspect of the algorithm O(n^2). By taking only the distinct substrings (of which there were at most 1331) I was able to find the pairs of distinct substrings within the tolerance, I then used the counts of each distinct substring to find the total number of pairs of (non-distinct) substrings that are within a certain tolerance. This method substantially sped up my computation.
Do algorithms that make use of the property of relatively few, much-repeated elements have a specific terminology.
...ANSWER
Answered 2020-Jun-19 at 10:00It's a broad concept with several related terms.
A common, closely related term is Memoization, wherein the results of computing a subproblem for different inputs are stored, and reused when a previously-seen input is re-encountered. That's slightly different from what you're doing here, since memoization is a form of lazy evaluation where values are recognized opportunistically rather than the code performing an up-front exhaustive enumeration of the inputs which will be processed.
Materialization is also worth mentioning. It's encountered in the context of databases, and refers to the results of a query (a.k.a. tabular processing including possible filtering and/or reduction) being stored for reuse. The active concerns with materialization are largely around long-term considerations like dynamic updates, so it's not a perfect match for a run-and-forget algorithm.
Speaking of 'dynamic', one could also maybe describe this as a form of dynamic programming, with a problem solved by exhaustively enumerating and solving a sequence of subproblems. In dynamic programming, though, one expects those subproblems to have a more regular and inductive form, so I think that one's a stretch.
I would describe the precise strategy here as a sort of "eager memoization", to contrast with the lazy-evaluation assumption normally inherent with memoization.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install sampen
You can use sampen like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
Support
Reuse Trending Solutions
Find, review, and download reusable Libraries, Code Snippets, Cloud APIs from over 650 million Knowledge Items
Find more librariesStay Updated
Subscribe to our newsletter for trending solutions and developer bootcamps
Share this Page