ukkonen | Ukkonen's Approximate String Matching algorithm
Community Discussions
Trending Discussions on ukkonen
QUESTION
I am currently using a trie implementation from this Stack Overflow post, "Getting a list of words from a Trie", to return a list of words which match a given prefix. I'm then using regex to filter out the words which don't match the entire specified pattern.
Example: if the pattern I'm searching for is CH??S? and this is a subset of the dictionary which matches my initial prefix: {CHABAD, CHACHA, CHARIOT, CHATTED, CHEATER, CHOMSKY, CHANNEL, CHAFED, CHAFER, CHAINS, CHAIRS, CHEESE, CHEESY, CHRONO, CHUTES, CHISEL}
I would search the trie with the prefix 'CH', keep only the words which match my desired pattern of CH??S? (CHEESE and CHEESY), and return those.
I am wondering if there is a faster way to do this that avoids the regex in the final step. I thought I could use a suffix tree (Ukkonen's suffix tree algorithm in plain English) or the Boyer-Moore algorithm, but neither works because they search on suffixes, not on patterns.
ANSWER
Answered 2020-Apr-22 at 19:02
Here's a nice recursive algorithm you can use that eliminates the need for a final regex pass. It works by matching a pattern P against a tree T.
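The answer's own code is not reproduced on this page. As a rough illustration of the idea, here is a minimal C++ sketch of matching a pattern directly against a trie, assuming '?' stands for exactly one arbitrary character. The names TrieNode, insert, match and find_matches are hypothetical and do not come from the original post or from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal trie node; std::map keeps children in sorted order.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool is_word = false;
};

// Insert one word, creating nodes along its path as needed.
void insert(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        std::unique_ptr<TrieNode>& child = node->children[c];
        if (!child) child = std::make_unique<TrieNode>();
        node = child.get();
    }
    node->is_word = true;
}

// Walk the trie and the pattern in lockstep. A literal character follows
// a single child; '?' branches into every child. A word is reported only
// when the pattern is fully consumed at a node that ends a word.
void match(const TrieNode& node, const std::string& pattern, std::size_t pos,
           std::string& prefix, std::vector<std::string>& out) {
    assert(pos <= pattern.size());
    if (pos == pattern.size()) {
        if (node.is_word) out.push_back(prefix);
        return;
    }
    const char p = pattern[pos];
    if (p == '?') {
        for (const auto& kv : node.children) {
            prefix.push_back(kv.first);
            match(*kv.second, pattern, pos + 1, prefix, out);
            prefix.pop_back();
        }
    } else {
        auto it = node.children.find(p);
        if (it != node.children.end()) {
            prefix.push_back(p);
            match(*it->second, pattern, pos + 1, prefix, out);
            prefix.pop_back();
        }
    }
}

// Convenience wrapper: all stored words matching the whole pattern.
std::vector<std::string> find_matches(const TrieNode& root,
                                      const std::string& pattern) {
    std::vector<std::string> out;
    std::string prefix;
    match(root, pattern, 0, prefix, out);
    return out;
}
```

With the dictionary subset from the question inserted, find_matches(root, "CH??S?") returns CHEESE and CHEESY. CHISEL is excluded because its fifth letter is E, and the seven-letter words fail on length. Because the pattern prunes branches during the walk, no separate regex pass is needed.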
QUESTION
I'm working with suffix trees. As far as I can tell, I have Ukkonen's algorithm running correctly to build a generalised suffix tree from an arbitrary number of strings. I'm now trying to implement a find_longest_common_substring() method to do exactly that. For this to work, I understand that I need to find the deepest shared edge (with depth in terms of characters, rather than edges) between all strings in the tree, and I've been struggling for a few days to get the traversal right.
Right now I have the following in C++. I'll spare you all my code, but for context, I'm keeping the edges of each node in an unordered_map called outgoing_edges, and each edge has a vector of ints recorded_strings containing integers identifying the added strings. The child field of an edge is the node it is going to, and l and r identify its left and rightmost indices, respectively. Finally, current_string_number is the current number of strings in the tree.
ANSWER
Answered 2017-Aug-19 at 20:40
Your handling of deepest_shared_edge is wrong. First, the allocation you do at the start of the function is a memory leak, since you never free the memory. Second, the result of the recursive call is ignored, so whatever deepest edge it finds is lost (although you update the depth, you don't keep track of the deepest edge).
To fix this, you should either pass deepest_shared_edge as a reference parameter (like you do for longest), or you can initialize it to nullptr, then check the return from your recursive call for nullptr and update it appropriately.
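The asker's code is not shown on this page, so the following is only a sketch of the first suggested fix (passing deepest_shared_edge by reference), not the original function. The Node and Edge structs are hypothetical stand-ins that reuse the field names from the question. The sketch assumes that l and r are inclusive indices, and that an edge is shared by all strings exactly when recorded_strings holds one entry per string in the tree.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Node;

// Hypothetical edge layout mirroring the fields named in the question.
struct Edge {
    int l = 0;                          // leftmost index of the edge label
    int r = 0;                          // rightmost index (assumed inclusive)
    std::vector<int> recorded_strings;  // ids of strings passing through
    Node* child = nullptr;              // node this edge leads to, if any
};

struct Node {
    std::unordered_map<char, Edge> outgoing_edges;
};

// Depth-first traversal. Both results are reference parameters, so an
// update made deep in the recursion is visible to every caller and no
// heap allocation is needed for the result.
void find_deepest_shared_edge(const Node& node, int depth,
                              std::size_t current_string_number,
                              int& longest,
                              const Edge*& deepest_shared_edge) {
    assert(depth >= 0);
    for (const auto& entry : node.outgoing_edges) {
        const Edge& edge = entry.second;
        // Assumption: an edge not shared by every string cannot lead to
        // a deeper edge that is, so the whole subtree is skipped.
        if (edge.recorded_strings.size() != current_string_number) continue;
        const int edge_depth = depth + (edge.r - edge.l + 1);
        if (edge_depth > longest) {
            longest = edge_depth;
            deepest_shared_edge = &edge;
        }
        if (edge.child != nullptr) {
            find_deepest_shared_edge(*edge.child, edge_depth,
                                     current_string_number, longest,
                                     deepest_shared_edge);
        }
    }
}
```

A caller would start with longest set to 0 and deepest_shared_edge set to nullptr, then call the function on the root with depth 0. If the pointer is still nullptr afterwards, the strings have no common substring.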
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported