acl2017_document_clustering | Determining Gains Acquired from Word Embedding

by bobye | Python | Version: Current | License: No License

kandi X-RAY | acl2017_document_clustering Summary

acl2017_document_clustering is a Python library. It has no bugs and no vulnerabilities, and it has low support. However, its build file is not available. You can download it from GitHub.

Code for "Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering" (ACL 2017).

Support

acl2017_document_clustering has a low active ecosystem.

It has 19 stars and 5 forks. There are 2 watchers for this library.

It had no major release in the last 6 months.

There is 1 open issue and 0 have been closed. On average issues are closed in 740 days. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of acl2017_document_clustering is current.

Quality

acl2017_document_clustering has 0 bugs and 0 code smells.

Security

acl2017_document_clustering has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

acl2017_document_clustering code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

acl2017_document_clustering does not have a standard license declared.

Check the repository for any license declaration and review the terms closely.

Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

acl2017_document_clustering releases are not available. You will need to build from source code and install.

acl2017_document_clustering has no build file. You will need to create the build yourself to build the component from source.

Installation instructions, examples and code snippets are available.

acl2017_document_clustering saves you 105 person hours of effort in developing the same functionality from scratch.

It has 267 lines of code, 12 functions and 3 files.

It has low code complexity. Code complexity directly impacts maintainability of the code.

acl2017_document_clustering Key Features

No Key Features are available at this moment for acl2017_document_clustering.

acl2017_document_clustering Examples and Code Snippets

No Code Snippets are available at this moment for acl2017_document_clustering.

Community Discussions

No Community Discussions are available at this moment for acl2017_document_clustering. Refer to the Stack Overflow page for discussions.

Vulnerabilities

No vulnerabilities reported

Install acl2017_document_clustering

1. Download sample datasets from the author's webpage.
2. Download pre-trained wordvecs (listed below), two of which are publicly downloadable.
3. Install Python (version 2.7) and its dependencies. The tested versions are listed below; you may need to adapt the code to newer versions.
4. After you configure the Python environment properly, start from a sample dataset, say story_cluster.txt, and a wordvec model, say glove_6B_300d.bin. Run the repository's conversion command to create d2s-formatted data from story_cluster.txt (edit the source to adapt it to other datasets). It creates two files: story_cluster.d2s and story_cluster.d2s.vocab0.
5. At this point, you need to request a patent-protected C/MPI software called d2_kmeans. The software takes these two files as input and writes the clustering labels to a file named story_cluster.d2s_[xxxxxx].label_o in the same directory.
6. Run the same command again to evaluate the result that was reported in the paper.

The MIT License (MIT). Copyright (c) 2017 Jianbo Ye.

Pre-trained wordvecs:
glove_6B_300d.bin
GoogleNews-vectors-negative300.bin
word2vec_400_10_10.bin

Tested dependency versions:
numpy (1.9.2)
scipy (1.9.2)
sklearn (0.16.1)
cvxopt (1.1.7)
gensim (0.12.1)
nltk (3.0.5)
mosek (optional, 7.x)
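
Because the code was tested against specific dependency versions, it can help to compare them with what is installed locally before running anything. The script below is a minimal sketch and is not part of the repository: the names TESTED_VERSIONS, installed_version and compare_versions are hypothetical, the version strings are copied from the list above, and it assumes each package exposes a `__version__` attribute on its top-level module. The optional mosek dependency is left out.

```python
from __future__ import print_function  # keeps print() working on Python 2.7

import importlib

# Versions copied from the tested list above (mosek is optional and omitted).
TESTED_VERSIONS = {
    "numpy": "1.9.2",
    "scipy": "1.9.2",
    "sklearn": "0.16.1",
    "cvxopt": "1.1.7",
    "gensim": "0.12.1",
    "nltk": "3.0.5",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any import failure is treated as "not usable in this environment".
        return None
    # Assumption: the package exposes __version__; fall back to "unknown".
    return getattr(module, "__version__", "unknown")


def compare_versions(tested, lookup=installed_version):
    """Map each module name to a (tested, installed, status) tuple."""
    report = {}
    for name, wanted in tested.items():
        found = lookup(name)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "match"
        else:
            status = "differs"
        report[name] = (wanted, found, status)
    return report


if __name__ == "__main__":
    for name, (wanted, found, status) in sorted(compare_versions(TESTED_VERSIONS).items()):
        print("%-8s tested=%s installed=%s [%s]" % (name, wanted, found, status))
```

A "differs" status does not mean the code will fail; it only flags where the code may need adapting to a newer version.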

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check and ask on the Stack Overflow community page.
