fuzzysearch | :pig: Tiny and fast fuzzy search in Go | Search Engine library
kandi X-RAY | fuzzysearch Summary
:pig: Tiny and fast fuzzy search in Go
Community Discussions
Trending Discussions on fuzzysearch
QUESTION
Given the following QueryParser with a FuzzySearch term in the query string:
...
ANSWER
Answered 2021-Apr-14 at 20:41
This may cross the border into "not an answer" - but it is too long for a comment (or a few comments):
Why is this?
That was a design decision, it would seem. It's mentioned in the documentation here.
"The value is between 0 and 2"
There is an old article here which gives an explanation:
"Larger differences are far more expensive to compute efficiently and are not processed by Lucene."
I don't know how official that is, however.
More officially, the Javadoc for the FuzzyQuery class states:
"At most, this query will match terms up to 2 edits. Higher distances (especially with transpositions enabled), are generally not useful and will match a significant amount of the term dictionary."
How can I correctly get the fuzzy edit distance I want into the query parser?
You cannot, unless you customize the source code.
The best (least worst?) alternative, I think, is probably the one mentioned in the FuzzyQuery Javadoc referenced above:
"If you really want this, consider using an n-gram indexing technique (such as the SpellChecker in the suggest module) instead."
In this case, one price to be paid will be a potentially much larger index - and even then, n-grams are not really equivalent to edit distances. I don't know if this would meet your needs.
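To make "up to 2 edits" and the n-gram trade-off concrete, here is a minimal Python sketch. It is not Lucene code: the function names are illustrative, the distance is plain Levenshtein (no transpositions), and the n-gram score is a simple Jaccard overlap of padded character trigrams, chosen here only as an assumption about how such a comparison might look.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def char_ngrams(term, n=3):
    """Set of character n-grams, padded so word boundaries count too."""
    pad = "_" * (n - 1)
    padded = pad + term + pad
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the two n-gram sets, between 0.0 and 1.0."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return len(ga & gb) / len(ga | gb)


def within_two_edits(a, b):
    """Mimics the 'at most 2 edits' cut-off described above."""
    return edit_distance(a, b) <= 2
```

The two measures answer different questions: edit_distance counts individual character changes, while ngram_similarity rewards shared substrings, so a pair of terms can be far apart in edits and still share many n-grams (or the reverse). That is the sense in which n-grams are not equivalent to edit distances.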
QUESTION
I have a number of enterprise datasets that I must find missing links between, and one of the ways I find potential matches is by joining on first and last name. The complication is that we have a significant number of people who use their legal name in one dataset (employee records), but use either a nickname or (worse yet) their middle name in others (e.g., EAD, training, PIV card). I am looking for a way to match up these potentially disparate names across the various datasets.
Simplified Example
Here is an overly simplified example of what I am trying to do, but I think it conveys my thought process. I begin with the employee table:
Employees table
employee_id  first_name  last_name
052451       Robert      Armsden
442896       Jacob       Craxford
054149       Grant       Keeting
025747       Gabrielle   Renton
071238       Margaret    Seifenmacher
and try to find the matching data from the PIV card dataset:
Cards table
card_id     first_name  last_name
1008571527  Bobbie      Armsden
1009599982  Jake        Craxford
1004786477  Gabi        Renton
1000628540  Maggy       Seifenmacher
Desired Result
After trying to match these datasets on first name and last name, I would like to end up with the following:
Employees_Cards table
emp_employee_id  emp_first_name  emp_last_name  crd_card_id  crd_first_name  crd_last_name
052451           Robert          Armsden        1008571527   Bobbie          Armsden
442896           Jacob           Craxford       1009599982   Jake            Craxford
054149           Grant           Keeting        NULL         NULL            NULL
025747           Gabrielle       Renton         1004786477   Gabi            Renton
071238           Margaret        Seifenmacher   1000628540   Maggy           Seifenmacher
As you can see, I would like to make the following matches:
Gabrielle -> Gabi
Jacob -> Jake
Margaret -> Maggy
Robert -> Bobbie
My initial thought was to find a common names dataset along the lines of:
Name_Aliases table
name1      name2   name3   name4
Gabrielle  Gabi    NULL    NULL
Jacob      Jake    NULL    NULL
Margaret   Maggy   Maggie  Meg
Michael    Mike    Mikey   Mick
Robert     Bobbie  Bob     Rob
and use something like this for the JOIN:
...
ANSWER
Answered 2021-Mar-20 at 01:10
How to structure and query the aliases table is an interesting question. I'd suggest organizing it in pairs rather than wider rows, because you don't know in advance how many variations may eventually be needed in a group of connected names, and a two-column structure lets you add to a given group indefinitely:
name1     name2
Jacob     Jake
Margaret  Maggy
Margaret  Maggie
Margaret  Meg
Maggy     Maggie
Maggy     Meg
Maggie    Meg
Then you just check both columns in each JOIN in the query, something like this:
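The answer's original query is not reproduced on this page. As a stand-in, here is a minimal Python sketch of the same idea: a two-column alias table checked in both directions during a left join on last name. The function names are hypothetical, the data comes from the tables in the question, and the Gabrielle/Gabi and Robert/Bobbie pairs are taken from the question's Name_Aliases table rather than from the answer.

```python
# Alias pairs in the two-column layout suggested above.
ALIAS_PAIRS = {
    ("Jacob", "Jake"),
    ("Margaret", "Maggy"),
    ("Margaret", "Maggie"),
    ("Margaret", "Meg"),
    ("Maggy", "Maggie"),
    ("Maggy", "Meg"),
    ("Maggie", "Meg"),
    ("Gabrielle", "Gabi"),  # from the question's Name_Aliases table
    ("Robert", "Bobbie"),   # from the question's Name_Aliases table
}

# IDs are kept as strings so leading zeros survive.
EMPLOYEES = [
    ("052451", "Robert", "Armsden"),
    ("442896", "Jacob", "Craxford"),
    ("054149", "Grant", "Keeting"),
    ("025747", "Gabrielle", "Renton"),
    ("071238", "Margaret", "Seifenmacher"),
]
CARDS = [
    ("1008571527", "Bobbie", "Armsden"),
    ("1009599982", "Jake", "Craxford"),
    ("1004786477", "Gabi", "Renton"),
    ("1000628540", "Maggy", "Seifenmacher"),
]


def first_names_match(a, b):
    """True if the names are equal or appear as a pair in either column order."""
    return a == b or (a, b) in ALIAS_PAIRS or (b, a) in ALIAS_PAIRS


def left_join(employees, cards):
    """Left join employees to cards on last name plus alias-aware first name."""
    rows = []
    for emp_id, emp_first, emp_last in employees:
        matches = [
            card for card in cards
            if card[2] == emp_last and first_names_match(emp_first, card[1])
        ]
        if matches:
            for card in matches:
                rows.append((emp_id, emp_first, emp_last) + card)
        else:
            # No card found: keep the employee and fill the card columns with None.
            rows.append((emp_id, emp_first, emp_last, None, None, None))
    return rows
```

Checking both (a, b) and (b, a) is what "check both columns" amounts to here: a pair only has to be stored once, in whichever order it was entered.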
QUESTION
I have a drag-and-drop (dnd) selection tab with search functionality to filter the available items. Once an item is selected from the draggable section, it should be removed from that section and dropped into the droppable section, and vice versa. The link to my code is https://codesandbox.io/s/dnd-search-select-sort-xfdtn. When an item, say "Apple", is selected, it goes to the droppable section, but when I search for "Apple" again in the draggable section's search bar, it reappears and I can move it to the droppable section again, which should not be the case. Once an item is selected, it should not appear in the list again. Below is the corresponding code.
...
ANSWER
Answered 2020-May-21 at 02:55
In your state you create the Fuse instance once, with the full list of items. In your handleItemSearch you always search against that full list of items. Hence the issue.
To solve the issue, create a fresh instance of Fuse in your handleItemSearch.
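The underlying point is independent of Fuse: the search has to run over the items that are still available, not over the list captured at start-up. A minimal Python sketch of that idea follows. It is not the asker's React code, the function name is hypothetical, and a case-insensitive substring test stands in for Fuse's fuzzy scoring.

```python
def search_available(all_items, selected, query):
    """Search only the items that have not been moved to the selected side."""
    # Rebuild the searchable list on every call so it reflects the current
    # selection, instead of reusing an index built once from all_items.
    available = [item for item in all_items if item not in selected]
    needle = query.strip().lower()
    return [item for item in available if needle in item.lower()]
```

For example, with all_items set to ["Apple", "Banana", "Apricot"] and "Apple" already selected, searching for "ap" returns only "Apricot".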
QUESTION
I'm building a help page with Gatsby. It has a search bar (Searchbar.js) that is always present on the page (think of Evernote's help page). I'm trying to pass the user's input from that field to a component that conducts the search (search.js), which then passes its output to the actual results page (SearchResults.js).
When I run gatsby develop everything works as it should, but when I run gatsby build I get an error saying it can't read the property "query" because it's undefined (line 63 of search.js: var search = location.state.query.trim()). Why is this failing on build?
Searchbar.js
...
ANSWER
Answered 2020-Apr-28 at 17:03
location is short for window.location, but at build time your code is running in Node.js, which does not have a window. Instead, consider testing for the existence of window (typeof window !== "undefined") before running your location.state.query.trim call, and fall back to a default value in the case that window does not exist.
QUESTION
I have a couple of questions here.
I want to search for the term jumps.
With fuzzy search, I can do jump~
With wildcard search, I can do jump*
With a stemmer, I can do jump
My understanding is that fuzzy search also gives pump, wildcard search also gives jumping, and the stemmer also gives "jumper".
I totally agree with the results.
1. What is the performance of these three?
Wildcard search is not recommended if the wildcard is at the beginning of the term. My understanding is that it then has to be matched against all the tokens in the index, but in this case it would only be the tokens that start with jump.
Fuzzy search gives me unpredictable results. I assume it has to do something like a spellcheck.
A stemmer suits only particular scenarios; for example, it can't match pumps.
2. How should I use these to get more relevant results?
I am probably more confused about all of this because of this section. Any suggestions, please?
...
ANSWER
Answered 2020-Apr-10 at 08:08
For question 2, you can go from strict to permissive.
Option one: Give only strict search results first. If nothing is found, give stemmer results. If there is still nothing, continue with fuzzy or wildcard search.
Option two: Give all results, but rank them by level (i.e., exact matches first, then stemmer results, ...)
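Option one can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a search-engine implementation: crude_stem is a hypothetical suffix stripper standing in for a real stemmer, difflib.get_close_matches from the standard library stands in for a fuzzy query, and the 0.7 cutoff is an arbitrary choice.

```python
import difflib

# Hypothetical, deliberately crude stemmer: strips one common suffix.
SUFFIXES = ("ing", "er", "s")


def crude_stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def tiered_search(query, vocabulary):
    """Return (tier, matches), trying the strictest tier first."""
    # Tier 1: exact match only.
    exact = [word for word in vocabulary if word == query]
    if exact:
        return "exact", exact

    # Tier 2: terms that share the query's (crude) stem.
    stem = crude_stem(query)
    stemmed = [word for word in vocabulary if crude_stem(word) == stem]
    if stemmed:
        return "stem", stemmed

    # Tier 3: fuzzy matching as the most permissive fallback.
    fuzzy = difflib.get_close_matches(query, vocabulary, n=5, cutoff=0.7)
    return "fuzzy", fuzzy
```

Option two would run all three tiers unconditionally and concatenate the results in tier order, dropping duplicates, so exact matches always rank above stemmed ones and stemmed ones above fuzzy ones.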
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported