Text-Search-Engine | crappy search engine for text files

by logicx24 | Python | Version: Current | License: MIT

kandi X-RAY | Text-Search-Engine Summary

Text-Search-Engine is a Python library. It has no reported bugs or vulnerabilities, a permissive license, and low support. However, no build file is available for it. You can download it from GitHub.

A search engine for textfiles. Made on a plane flight.

Support

Text-Search-Engine has a low-activity ecosystem.

It has 89 star(s) with 49 fork(s). There are 11 watchers for this library.

It had no major release in the last 6 months.

There are 2 open issues and 0 have been closed. On average issues are closed in 754 days. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of Text-Search-Engine is current.

Quality

Text-Search-Engine has 0 bugs and 0 code smells.

Security

Text-Search-Engine has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

Text-Search-Engine code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

Text-Search-Engine is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

Text-Search-Engine releases are not available. You will need to build from source code and install.

Text-Search-Engine has no build file. You will need to create the build yourself to build the component from source.

Top functions reviewed by kandi - BETA

kandi has reviewed Text-Search-Engine and discovered the below as its top functions. This is intended to give you an instant insight into Text-Search-Engine implemented functionality, and help decide if they suit your requirements.

Perform one-word query
Rank the results of a query
Generate vectors for a list of documents
Return the frequency of the query
Performs one-word query
Compute term frequency
Computes the dot product of two documents
Generate a score for a given term
Returns a list of uniques
Compute the term vector for the query
Populate the term_frequency
The size of the collection
Returns the term frequency of a document
Returns the inverse function of the IDF function
Returns the indices of terms in the file
Returns a dictionary mapping term to word index
Construct a dictionary of indices for each term
Execute the query
Calculates the total index for each word
Free text query
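
The function names above suggest an inverted index with TF-IDF ranking (term frequency, IDF, dot product of vectors). The following is a minimal sketch of that general technique, not the library's actual code: the names `tokenize`, `build_index`, `idf`, and `rank`, and the sample documents, are all assumptions made for illustration, and the scores are plain dot products with no length normalization.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term count} (a simple inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def idf(index, term, n_docs):
    """Inverse document frequency; 0.0 for terms absent from the collection."""
    df = len(index.get(term, {}))
    return math.log(n_docs / df) if df else 0.0


def rank(index, docs, query):
    """Score documents by the dot product of TF-IDF query and document vectors."""
    n_docs = len(docs)
    scores = defaultdict(float)
    for term, q_count in Counter(tokenize(query)).items():
        weight = idf(index, term, n_docs)
        if weight == 0.0:
            continue  # term is unknown or occurs in every document
        for doc_id, d_count in index[term].items():
            scores[doc_id] += (q_count * weight) * (d_count * weight)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample collection, keyed by file name.
docs = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog",
    "c.txt": "the quick dog jumps over the lazy fox",
}
index = build_index(docs)
results = rank(index, docs, "quick brown fox")
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In this sketch, "a.txt" ranks first because it is the only document containing the rarer term "brown", while a query for "the" returns nothing because a term present in every document carries zero IDF weight.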

Text-Search-Engine Key Features

No Key Features are available at this moment for Text-Search-Engine.

Text-Search-Engine Examples and Code Snippets

No Code Snippets are available at this moment for Text-Search-Engine.

Community Discussions

Trending Discussions on Text-Search-Engine

Full-text search - should I pick dedicated search engine (SOLR, Elastic) or RDBMS one?

QUESTION

Asked 2021-Nov-10 at 10:21

I am working on my diploma exam on the topic of Full-Text Search in Apache SOLR. In the introduction, I should explain the purpose and advantages of Apache SOLR, i.e. why one would opt for a Full-Text Search engine like SOLR instead of MySQL, for instance. Using literature like "SOLR in action (2013)", one would say it's rather easy to determine when to use SOLR, ElasticSearch or something else instead of MySQL, for that era. There is also this great question from 2010 on SO: Comparison of full text search engine - Lucene, Sphinx, Postgresql, MySQL?. Alas, as great as it was around 2010, its answers now seem painfully obsolete, e.g. "MySQL MyISAM table type supports Full-Text Search, but InnoDB does not". Several years later, InnoDB also added Full-Text Search support. Now, there are some articles that manage to shed some light on this, like https://lucidworks.com/post/full-text-search-engines-vs-dbms/ which states that the advantages of Full-Text Search systems are

search speed, variety of indexing and querying options, ranking and relevancy capabilities...

Yet, there are a lot of other articles stating things like

MySQL Full-Text Search will now fit your needs in 80% of cases

etc., and it seems that over the past 10 years the Full-Text Search capabilities of MySQL, PostgreSQL, MongoDB and other databases have increased dramatically.

Yet the graph at https://db-engines.com/en/ranking_trend/system/Elasticsearch%3BMySQL%3BSolr shows that Full-Text Search engines are not losing popularity; their usage is growing, and even SOLR, which was steadily losing pace, now seems to be waking up.

So, there must be something to it? Is it that:

SOLR, Elastic, Sphinx... are still considerably faster than their relational counterparts?
there is a larger variety of options, like advanced, customizable tokenization and faceting? Maybe better language support?
relational databases can't handle search over a very large number of documents well enough?

etc.

In short, what would make you take Apache SOLR or Elastic nowadays, instead of MySQL or other relational database with their increased Full-Text search capabilities? Why are Apache SOLR and Elastic Search still that popular when using them requires another stack of resources and administration if you already have data in your relational or NoSQL database?

So the central question is: If I have a system that uses a MySQL database for data storage, and I need to add full text search capabilities for one or several fields, including fuzzy search (typos), synonyms, stemming, and custom handling of relevancy and ranking, is it generally better to use MySQL FTS (so no need for another stack of resources and administration), or is a dedicated full text search engine like Apache SOLR or Elastic search significantly better at this?

...

ANSWER

Answered 2021-Nov-09 at 19:30

Specialized indexing solutions like Apache Solr, ElasticSearch, Sphinx Search are usually faster than the built-in fulltext indexing of MySQL or GIST of PostgreSQL, etc. The specialized solutions often have more features like stemming, more sophisticated searching including faceting, and also storing extra data in a "document" associated with the indexed text.

On the other hand, using one of those complementary solutions means extra complexity to copy data into the indexing solution. How frequently do you need to update the index? Is it efficient to update the index incrementally, or do you basically need to clobber the index and create a fresh index from your whole dataset?
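
The difference between updating an index incrementally and rebuilding it from scratch can be sketched with a toy in-memory inverted index. This is only an illustration under stated assumptions: the `InvertedIndex` class and its `upsert`, `rebuild`, and `search` methods are hypothetical names, not the API of Solr, Elasticsearch, or any other product mentioned here.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Toy index: maps each term to the set of document ids containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> doc ids
        self.doc_terms = {}               # doc id -> terms currently indexed

    @staticmethod
    def _terms(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def upsert(self, doc_id, text):
        """Incremental update: touch only the postings of terms that changed."""
        new_terms = self._terms(text)
        old_terms = self.doc_terms.get(doc_id, set())
        for term in old_terms - new_terms:
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]
        for term in new_terms - old_terms:
            self.postings[term].add(doc_id)
        self.doc_terms[doc_id] = new_terms

    def rebuild(self, docs):
        """Full rebuild: discard everything and re-index the whole dataset."""
        self.postings.clear()
        self.doc_terms.clear()
        for doc_id, text in docs.items():
            self.upsert(doc_id, text)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


# Index two hypothetical documents, then change one of them in place.
incremental = InvertedIndex()
incremental.rebuild({1: "red apple", 2: "green apple"})
incremental.upsert(2, "green pear")

# Rebuilding from the updated dataset should yield the same postings.
rebuilt = InvertedIndex()
rebuilt.rebuild({1: "red apple", 2: "green pear"})

print(incremental.search("apple"), incremental.search("pear"))
```

The incremental path does work proportional to the changed document, while the rebuild path re-reads every document; both end in the same state, which is why the choice is mostly about update frequency and dataset size.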

Using the built-in indexing features of your RDBMS, by contrast, has the advantage that the index is probably kept in sync with the most recent data updates automatically. And the search capabilities may be good enough for your needs. Keeping the index maintenance simple and automated has a lot of positive value.

Besides, any of the solutions, even a sub-optimal one, is orders of magnitude better than the naïve approach many developers use: textcolumn LIKE '%keyword%'
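
To make the contrast with the naïve approach concrete, here is a small sketch using Python's built-in `sqlite3` module and an in-memory table. The table name `docs`, the sample rows, and the hand-rolled token index are assumptions for illustration; the token index stands in for a real full-text index and is not how MySQL or any search engine implements one.

```python
import re
import sqlite3
from collections import defaultdict

# Hypothetical sample rows: (id, body).
rows = [
    (1, "The cat sat on the mat"),
    (2, "Concatenate the two strings"),
    (3, "A dog chased a ball"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO docs VALUES (?, ?)", rows)

# Naive approach: a leading wildcard forces a scan of every row, and the
# pattern matches substrings, so "Concatenate" also counts as a hit for "cat".
like_hits = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM docs WHERE body LIKE ? ORDER BY id", ("%cat%",)
    )
]

# Token-based approach: build a term -> ids mapping once, then answer each
# query with a single dictionary lookup on whole words.
index = defaultdict(set)
for doc_id, body in rows:
    for token in re.findall(r"[a-z0-9]+", body.lower()):
        index[token].add(doc_id)
index_hits = sorted(index.get("cat", set()))

print("LIKE hits: ", like_hits)
print("index hits:", index_hits)
conn.close()
```

Besides the scan cost, the substring match returns row 2 for "cat", which is usually not what a keyword search should do; the tokenized lookup returns only row 1.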

what would make you take Apache SOLR or Elastic nowadays, instead of MySQL or other relational database with their increased Full-Text search capabilities?

Better performance, more sophisticated search support, and it helps to move those expensive search queries to a dedicated search engine, and lighten the load on your RDBMS.

Source https://stackoverflow.com/questions/69903348

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install Text-Search-Engine

You can download it from GitHub.
You can use Text-Search-Engine like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

Support

For any new features, suggestions, or bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
