hmni | Fuzzy Name Matching with Machine Learning | Machine Learning library

by Christopher-Thornton | Language: Python | Version: 0.1.8 | License: MIT

kandi X-RAY | hmni Summary

hmni is a Python library typically used in Institutions, Learning, Education, Artificial Intelligence, Machine Learning, Deep Learning and PyTorch applications. It has no reported bugs or vulnerabilities, a build file is available, it has a permissive license, and it has low support. You can install it with 'pip install hmni' or download it from GitHub or PyPI.

Fuzzy name matching with machine learning. Perform common fuzzy name matching tasks including similarity scoring, record linkage, deduplication and normalization. HMNI is trained on an internationally-transliterated Latin firstname dataset, where precision is afforded priority. For an introduction to the methodology and research behind HMNI, please refer to my blog post.
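To make the tasks named above concrete, here is a minimal sketch of similarity scoring and deduplication. It is an illustration only: it does not use hmni or its trained model. The scorer is a stand-in built on the standard library's difflib, and the function names, the sample names and the 0.8 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from typing import List


def name_similarity(a: str, b: str) -> float:
    """Stand-in scorer (NOT hmni's model): character-overlap ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def dedupe(names: List[str], threshold: float = 0.8) -> List[str]:
    """Keep a name only if it scores below `threshold` against every name kept so far."""
    kept: List[str] = []
    for name in names:
        if not any(name_similarity(name, k) >= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up sample data for illustration.
names = ["Alan", "Allan", "Maria", "alan", "Mariah"]
unique_names = dedupe(names)
print(unique_names)
```

A learned matcher such as hmni would replace the stand-in scorer with a model-based score; the surrounding deduplication loop would look much the same.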

Support

hmni has a low active ecosystem.

It has 107 stars and 13 forks. There are 5 watchers for this library.

It had no major release in the last 12 months.

There are 2 open issues and 4 closed issues. On average, issues are closed in 15 days. There is 1 open pull request and there are 0 closed pull requests.

It has a neutral sentiment in the developer community.

The latest version of hmni is 0.1.8.

Quality

hmni has 0 bugs and 17 code smells.

Security

hmni has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

hmni code analysis shows 0 unresolved vulnerabilities.

There are 3 security hotspots that need review.

License

hmni is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

hmni releases are available to install and integrate.

Deployable package is available in PyPI.

Build file is available. You can build the component from source.

Installation instructions, examples and code snippets are available.

It has 809 lines of code, 45 functions and 9 files.

It has high code complexity. Code complexity directly impacts maintainability of the code.

Top functions reviewed by kandi - BETA

kandi has reviewed hmni and discovered the below as its top functions. This is intended to give you an instant insight into hmni implemented functionality, and help decide if they suit your requirements.

Compute similarity between two names
Compute the sum of features between two features
Return the seen set seen in the mapping
Fuzzify the features of a word
Transform variable names into x2 and x2 coordinates
Returns the positive class prediction of the model
Compute the probability for the given value
Runs the siamese_inf
Compute the similarity distribution for a given feature pair
Preprocess a name
Return the category associated with the given category
Builds the dataset
Trim the corpus
Generate word ids
Fits the corpus
Increment the frequency of a given category
Fit the model to the data
Freeze the model
Generate test data set

hmni Key Features

No Key Features are available at this moment for hmni.

hmni Examples and Code Snippets

No Code Snippets are available at this moment for hmni.

Community Discussions

Trending Discussions on hmni

How to separate tuple into independent pandas columns?

Creating a column of match probabilities from hmni package python

QUESTION

How to separate tuple into independent pandas columns?

Asked 2021-Oct-20 at 17:57

I am working with matching two separate dataframes on first name using HMNI's fuzzymerge.

On output each row returns a key like: (May, 0.9905315373004635)

I am trying to separate the Name and Score into their own columns. I tried the below code but don't quite get the right output - every row ends up with the same exact name/score in the new columns.

...

ANSWER

Answered 2021-Oct-20 at 16:54

First, when iterating over rows in pandas, it is better to use apply.

Source https://stackoverflow.com/questions/69650006
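As a hedged illustration of the approach suggested in the answer above, the sketch below splits a column of (name, score) tuples into two columns with apply. The column name "key", the new column names and every row except the (May, 0.9905315373004635) pair quoted in the question are assumptions; the asker's real dataframe is not shown in the source.

```python
import pandas as pd

# Hypothetical stand-in for the merged output: one (name, score) tuple per row.
# Only the first tuple comes from the question; the others are made up.
df = pd.DataFrame(
    {"key": [("May", 0.9905315373004635), ("Alan", 0.75), ("Maria", 0.5)]}
)

# apply runs the function once per cell, so each row gets its own name and score
# instead of every row receiving the same value.
df["Name"] = df["key"].apply(lambda pair: pair[0])
df["Score"] = df["key"].apply(lambda pair: pair[1])

# Alternative: expand all tuples at once into a two-column frame on the same index.
expanded = pd.DataFrame(df["key"].tolist(), index=df.index, columns=["Name", "Score"])

print(df[["Name", "Score"]])
```

Both forms produce one name and one score per row; the tolist variant avoids calling a Python function twice per row.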

QUESTION

Creating a column of match probabilities from hmni package python

Asked 2021-May-17 at 16:19

I have a dataframe that looks like this

...

ANSWER

Answered 2021-May-15 at 22:09

According to hmni's docs, similarity accepts two strs as its first and second arguments. You are trying to pass two pandas.Series, i.e., df['CEOThisYr'] and df['CEOLastYr']. You could try using pandas.DataFrame.apply to apply similarity to each row.

Source https://stackoverflow.com/questions/67551112
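The row-wise pattern described in the answer above can be sketched as follows. The column names CEOThisYr and CEOLastYr come from the answer; the sample rows, the match_prob column name and the similarity function are assumptions. In particular, similarity here is a placeholder built on difflib so the sketch runs without hmni installed; in real use it would be replaced by the library's own two-string similarity call.

```python
from difflib import SequenceMatcher

import pandas as pd


def similarity(a: str, b: str) -> float:
    """Placeholder two-string scorer (NOT hmni); returns a ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Made-up sample rows; the asker's actual dataframe is elided in the source.
df = pd.DataFrame(
    {
        "CEOThisYr": ["Alan Smith", "Maria Lopez"],
        "CEOLastYr": ["Alan Smith", "John Doe"],
    }
)

# axis=1 passes one row at a time, so the scorer receives two plain strings
# rather than two whole Series.
df["match_prob"] = df.apply(
    lambda row: similarity(row["CEOThisYr"], row["CEOLastYr"]), axis=1
)

print(df)
```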

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install hmni

Install from PyPI using pip: 'pip install hmni'.

Support

Pull requests are welcome. For developers who want to build a model for Latin or non-Latin writing systems (Chinese, Cyrillic, Arabic), Jupyter notebooks that build models with similar methods are shared in the dev folder.
