jcrfsuite | Java interface for CRFsuite | Natural Language Processing library

by vinhkhuc | Java | Version: 0.6.1 | License: Non-SPDX

kandi X-RAY | jcrfsuite Summary

jcrfsuite is a Java library typically used in Artificial Intelligence and Natural Language Processing applications. It has no reported bugs or vulnerabilities, a build file is available, and it has low support. However, jcrfsuite has a Non-SPDX license. You can download it from GitHub or Maven.

This is a Java interface for crfsuite, a fast implementation of Conditional Random Fields, built using SWIG and the class injection technique (the same technique used in snappy-java). Jcrfsuite provides an API for loading a trained model into memory and performing sequential tagging in memory. Model training is done via the command-line interface. The library is designed for building Java applications for fast sequential text tagging such as Part-Of-Speech (POS) tagging, phrase chunking, and Named-Entity Recognition (NER). Jcrfsuite can be dropped into any Java web application and run without problems under the JVM's class loader.

            Support

              jcrfsuite has a low active ecosystem.
              It has 43 star(s) with 29 fork(s). There are 5 watchers for this library.
              It had no major release in the last 12 months.
              There are 9 open issues and 5 have been closed. On average issues are closed in 7 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of jcrfsuite is 0.6.1

            Quality

              jcrfsuite has 0 bugs and 0 code smells.

            Security

              jcrfsuite has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              jcrfsuite code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              jcrfsuite has a Non-SPDX License.
              A Non-SPDX license may be an open-source license that is not SPDX-compliant, or a non-open-source license, so review it closely before use.

            Reuse

              jcrfsuite releases are available to install and integrate.
              Deployable package is available in Maven.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              jcrfsuite saves you 753 person hours of effort in developing the same functionality from scratch.
              It has 1735 lines of code, 227 functions and 24 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed jcrfsuite and identified the functions below as its top functions. The list is intended to give you quick insight into the functionality jcrfsuite implements and help you decide whether it suits your requirements.
            • Loads CrfSuiteNativeNative and creates a CrfSuiteNativeNative implementation
            • Computes the MD5 hash of the given input stream
            • Extract a native library file to the target folder
            • Injects CrfSuiteNativeLoader class loader
            • Finds the native library
            • Returns the version of the crfsuite
            • Returns the byte code from classpath
            • Loads native library
            • Checks if native library is already loaded
            • Main train method
            • Train CRF Suite with an item sequence
            • Load data in CRF Suite format
            • Prints the path of the library
            • Translates an operating name to a folder name
            • Tags a model
            • Tag an item sequence
            • Gets a list of possible labels for this tagger
            • Load system properties
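
Several of the functions listed above (hashing an input stream with MD5, translating an operating system name to a folder name) are typical steps when a Java library extracts a bundled native library. The following is a minimal sketch of those two steps using only the Java standard library. It is not jcrfsuite's actual code: the class name `NativeLoaderSketch`, the helpers `md5Hex` and `osFolder`, and the folder names they return are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical sketch; names and folder layout are assumptions, not jcrfsuite's API.
class NativeLoaderSketch {

    // Stream the input through an MD5 digest and return the hash as lowercase hex.
    static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        try (DigestInputStream dis = new DigestInputStream(in, digest)) {
            byte[] buffer = new byte[8192];
            while (dis.read(buffer) != -1) {
                // Reading drives the digest; nothing else to do here.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Map an os.name value to a resource folder name (the folder names are assumed).
    static String osFolder(String osName) {
        String name = osName.toLowerCase(Locale.ROOT);
        if (name.contains("windows")) return "Windows";
        if (name.contains("mac")) return "Mac";
        if (name.contains("linux")) return "Linux";
        // Fall back to the raw name with non-word characters stripped.
        return osName.replaceAll("\\W", "");
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(md5Hex(in));
        System.out.println(osFolder(System.getProperty("os.name")));
    }
}
```

A loader of this kind could compare the hash of the bundled library with the hash of a previously extracted copy and skip re-extraction when they match. Whether jcrfsuite does exactly this is not confirmed by the summary above.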

            jcrfsuite Key Features

            No Key Features are available at this moment for jcrfsuite.

            jcrfsuite Examples and Code Snippets

            How to use
            Java | Lines of Code: 12 | License: Non-SPDX (NOASSERTION)
            import com.github.jcrfsuite.CrfTrainer;
            ...
            // Train a model from the training file and write it to modelFile.
            String trainFile = "data/tweet-pos/train-oct27.txt";
            String modelFile = "twitter-pos.model";
            CrfTrainer.train(trainFile, modelFile);
            
            import com.github.jcrfsuite.CrfTagger;
            import com.github.jcrfsuite.util.  
            Maven dependency
            Java | Lines of Code: 5 | License: Non-SPDX (NOASSERTION)
            <dependency>
              <groupId>com.github.vinhkhuc</groupId>
              <artifactId>jcrfsuite</artifactId>
              <version>0.6.1</version>
            </dependency>

            Building
            Java | Lines of Code: 3 | License: Non-SPDX (NOASSERTION)
            git clone https://github.com/vinhkhuc/jcrfsuite
            cd jcrfsuite
            mvn clean package
              

            Community Discussions

            Trending Discussions on jcrfsuite

            QUESTION

            jcrfsuite training file format
            Asked 2018-Nov-07 at 00:49

            From what I understand from the POS tagging example in the jcrfsuite examples, the training file is tab-separated and the first token is the label. But I do not get the BigCluster| thing. Can somebody help me with how to specify tokens in the training file?

            Example below:

            O BigCluster|00 BigCluster|0000 BigCluster|000000 BigCluster|00000000 BigCluster|0000000000 BigCluster|000000000000 BigCluster|00000000000000 BigCluster|0000000000000000 NextBigCluster|0100 NextBigCluster|01000101 NextBigCluster|010001011111 POSTagDict|D POSTagDict|N POSTagDict|^ POSTagDict|$ POSTagDict|G NextPOSTag|V 1gramSuff|i 1gramPref|i prevword| prevcurr||i nextword|predict nextword|predict currnext|i|predict Word|I Lower|i Xxdshape|X charclass|1, first-shortcap prevnext||predict t=0

            Test file format:

            ! BigCluster|01 BigCluster|0110 BigCluster|011011 BigCluster|01101100 BigCluster|0110110011 BigCluster|011011001100 BigCluster|01101100110000 BigCluster|0110110011000000 NextBigCluster|1000 NextBigCluster|10001000 NextBigCluster|100010000000 POSTagDict|V NextPOSTag|, metaph_POSDict|N 1gramSuff|n 2gramSuff|nn 3gramSuff|mnn 4gramSuff|mmnn 5gramSuff|mmmnn 6gramSuff|ammmnn 7gramSuff|aammmnn 8gramSuff|aaammmnn 9gramSuff|daaammmnn 1gramPref|d 2gramPref|da 3gramPref|daa 4gramPref|daaa 5gramPref|daaam 6gramPref|daaamm 7gramPref|daaammm 8gramPref|daaammmn 9gramPref|daaammmnn prevword| prevcurr||daaammmnn nextword|. nextword|. currnext|daaammmnn|. Word|Daaammmnn Lower|daaammmnn Xxdshape|Xxxxxxxxx charclass|1,2,2,2,2,2,2,2,2, first-initcap prevnext||. t=0

            ...

            ANSWER

            Answered 2017-Jun-05 at 12:48

            What follows the label is a list of feature-name and feature-value pairs. It is a sparse representation rather than a tabular one.

            BigCluster is just one of the features and it's relevant to the specific example only. You should create your own features if you are training from scratch.
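
To make the format concrete, here is a minimal sketch in plain Java that splits one training line into its label and its sparse list of features, and builds a line from features of your own. It assumes only what the question and answer state: fields are tab-separated and the first field is the label. The class `CrfLineSketch` and its helpers `parseLine` and `formatLine` are hypothetical names and are not part of jcrfsuite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of the jcrfsuite API.
class CrfLineSketch {

    // One parsed item: its label plus the sparse list of feature strings.
    static final class Item {
        final String label;
        final List<String> attributes;

        Item(String label, List<String> attributes) {
            this.label = label;
            this.attributes = attributes;
        }
    }

    // Parse "label<TAB>feature<TAB>feature..." into an Item.
    static Item parseLine(String line) {
        String[] fields = line.split("\t");
        List<String> attributes = new ArrayList<>();
        for (int i = 1; i < fields.length; i++) {
            if (!fields[i].isEmpty()) {
                attributes.add(fields[i]);
            }
        }
        return new Item(fields[0], attributes);
    }

    // Build a training line from a label and your own feature strings.
    static String formatLine(String label, List<String> attributes) {
        return label + "\t" + String.join("\t", attributes);
    }

    public static void main(String[] args) {
        Item item = parseLine("O\tWord|I\tLower|i\tXxdshape|X");
        System.out.println(item.label + " -> " + item.attributes);
        System.out.println(formatLine("N", Arrays.asList("Word|cat", "Lower|cat")));
    }
}
```

Each feature string such as `Word|I` is kept whole here. The `name|value` shape is simply the naming convention of the Twitter POS example, so your own features can use any naming scheme, provided it is applied consistently to the training and test files.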

            Source https://stackoverflow.com/questions/44044721

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install jcrfsuite

            You can download it from GitHub or Maven.
            You can use jcrfsuite like any standard Java library. Include the jar files in your classpath. You can use any IDE to run and debug jcrfsuite as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org. For Gradle installation, refer to gradle.org.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for or ask them on the Stack Overflow community page.

            CLONE
          • HTTPS

            https://github.com/vinhkhuc/jcrfsuite.git

          • CLI

            gh repo clone vinhkhuc/jcrfsuite

          • sshUrl

            git@github.com:vinhkhuc/jcrfsuite.git

            Consider Popular Natural Language Processing Libraries

            transformers

            by huggingface

            funNLP

            by fighting41love

            bert

            by google-research

            jieba

            by fxsjy

            Python

            by geekcomputers

            Try Top Libraries by vinhkhuc

            MemN2N-babi-python

            by vinhkhuc | Python

            PyTorch-Mini-Tutorials

            by vinhkhuc | Python

            JFastText

            by vinhkhuc | Java

            VanillaML

            by vinhkhuc | Python

            lbfgs4j

            by vinhkhuc | Java