stringmetric | String metrics and phonetic algorithms
kandi X-RAY | stringmetric Summary
stringmetric: string metrics and phonetic algorithms for Scala. The library supports approximate string matching, measuring string similarity and distance, indexing by word pronunciation, and sounds-like comparisons. In addition to the core library, each metric and algorithm has a command-line interface.
Community Discussions
Trending Discussions on stringmetric
QUESTION
I have the following problem: I want to identify strings in Java that have a similar meaning. I tried calculating similarities between strings with Stringmetrics. This works as expected, but I need something more convenient.
For example, when I have the following 2 strings (1 word each):
...
ANSWER
Answered 2017-Apr-26 at 14:14
Levenshtein distance (edit distance) works like the auto-correct on your phone. Taking your example, we have "apple" vs "appel". The words are fairly close to each other if you count the single-letter insertions, deletions, and replacements needed to turn one into the other. All we need to do here is swap "e" and "l" (strictly, replace "l" with "e" and "e" with "l", which is two replacements, so the distance is 2). Other words like "applr" or "appee" are closer to the original word "apple" because all you need to do is replace a single letter (distance 1).
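To make the counting above concrete, here is a minimal sketch of the classic dynamic-programming Levenshtein distance in plain Java. It is an illustration only: the class name `LevenshteinSketch` and its `distance` method are hypothetical and are not part of the stringmetric API.

```java
// Hypothetical helper for illustration; not part of stringmetric.
final class LevenshteinSketch {

    // Returns the minimum number of single-character insertions,
    // deletions, and replacements needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1]; // row for the prefix of a of length i-1
        int[] curr = new int[b.length() + 1]; // row for the prefix of a of length i

        // Turning the empty string into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a yields the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int replacement = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), replacement);
            }
            // Reuse the two arrays instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("apple", "appel")); // 2
        System.out.println(distance("apple", "applr")); // 1
        System.out.println(distance("apple", "appee")); // 1
    }
}
```

Note that this plain variant has no transposition operation, which is why swapping two adjacent letters costs 2 rather than 1.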
Cosine similarity is completely different: it counts the words in each string, builds a vector from those counts, and checks how similar the two vectors are. Here you have 2 completely different words, so the vectors share no terms and it returns 0.
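A minimal sketch of that word-count cosine similarity in plain Java follows. It assumes whitespace tokenization and lowercasing, which the answer does not specify; the class name `CosineSketch` and its methods are hypothetical and are not part of the stringmetric API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of stringmetric.
final class CosineSketch {

    // Counts how often each whitespace-separated, lowercased word occurs.
    // (Tokenization scheme is an assumption made for this sketch.)
    static Map<String, Integer> wordCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine of the angle between the two word-count vectors:
    // dot(a, b) / (|a| * |b|), or 0 if either text has no words.
    static double similarity(String a, String b) {
        Map<String, Integer> countsA = wordCounts(a);
        Map<String, Integer> countsB = wordCounts(b);

        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;

        for (Map.Entry<String, Integer> entry : countsA.entrySet()) {
            int valueA = entry.getValue();
            normA += valueA * valueA;
            Integer valueB = countsB.get(entry.getKey());
            if (valueB != null) {
                dot += valueA * valueB; // only shared words contribute
            }
        }
        for (int valueB : countsB.values()) {
            normB += valueB * valueB;
        }

        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        System.out.println(similarity("apple", "appel"));           // 0.0
        System.out.println(similarity("red apple", "green apple")); // 0.5
    }
}
```

Because each single-word string becomes a vector with one non-zero entry, any two different words score 0 no matter how many letters they share, which is the behaviour described above.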
What you want is a combination of those 2 techniques, plus a computer with knowledge of the language, plus a dictionary of synonyms that is consulted before and after applying those similarity algorithms. Imagine taking a sentence and replacing every single word with a synonym (who remembers Joey and the thesaurus?). The two sentences could look completely different. On top of that, every word can have multiple synonyms, and some of those synonyms only fit in a specific context. Your task is simply impossible as of now; maybe it will be possible in the future.
P.S. If your task were possible, I think translation software would be basically perfect, but I'm not really sure about that.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported