CoreNLP | Stanford CoreNLP : A Java suite of core NLP tools | Natural Language Processing library

by stanfordnlp | Java | Version: v4.4.0 | License: Non-SPDX


kandi X-RAY | CoreNLP Summary

CoreNLP is a Java library typically used in Institutions, Learning, Administration, Public Services, Artificial Intelligence, and Natural Language Processing applications. CoreNLP has no reported bugs or vulnerabilities, a build file is available, and it has medium support. However, CoreNLP has a Non-SPDX license. You can download it from GitHub or Maven.
Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, and Spanish.

Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications.

Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government. The tools variously use rule-based, probabilistic machine learning, and deep learning components. The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v3 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.
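As a concrete illustration of the "two lines of code" claim, here is a minimal sketch of running a pipeline through the standard CoreNLP Java API. The chosen annotators, the class name, and the example sentence are illustrative, not taken from this page:

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.CoreSentence;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import java.util.Properties;

public class PipelineExample {
    public static void main(String[] args) {
        // Choose which annotators to run; later annotators build on earlier ones.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");

        // The "two lines": build the pipeline, then annotate a document.
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        CoreDocument doc = new CoreDocument("Stanford University is located in California.");
        pipeline.annotate(doc);

        // Print word, part-of-speech tag, and named-entity tag for each token.
        for (CoreSentence sentence : doc.sentences()) {
            for (CoreLabel token : sentence.tokens()) {
                System.out.println(token.word() + "\t" + token.tag() + "\t" + token.ner());
            }
        }
    }
}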

Support

  • CoreNLP has a medium active ecosystem.
  • It has 8424 stars and 2644 forks. There are 501 watchers for this library.
  • There was 1 major release in the last 12 months.
  • There are 170 open issues and 833 closed issues. On average, issues are closed in 613 days. There are 2 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of CoreNLP is v4.4.0.

Quality

  • CoreNLP has no bugs reported.

Security

  • CoreNLP has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

  • CoreNLP has a Non-SPDX License.
  • A Non-SPDX license may be an open-source license that is simply not on the SPDX list, or a non-open-source license; review it closely before use.

Reuse

  • CoreNLP releases are available to install and integrate.
  • Deployable package is available in Maven.
  • Build file is available. You can build the component from source.
  • Installation instructions, examples and code snippets are available.
Top functions reviewed by kandi - BETA

kandi has reviewed CoreNLP and discovered the below as its top functions. This is intended to give you an instant insight into the functionality CoreNLP implements and to help you decide if it suits your requirements.

  • Get the next input.
  • Sets the option flag on the command line.
  • Extract the RVFDatum.
  • Makes a message pass through the message passing.
  • Helper function to check whether boundaries are inside bounds.
  • Performs inside the chart.
  • Convert the trees from the command line arguments.
  • Internal method used to print tree.
  • Gets the coreferent.
  • Initialize the environment.

CoreNLP Key Features

Stanford CoreNLP: A Java suite of core NLP tools.

Build Instructions

# Make sure you have git-lfs installed
# (https://git-lfs.github.com/)
git lfs install

git clone https://huggingface.co/stanfordnlp/corenlp-french

Stanford CoreNLP - Unknown variable WORKDAY

mkdir -p edu/stanford/nlp/models/sutime
cp sutime/english.sutime.txt edu/stanford/nlp/models/sutime
jar -uf stanford-corenlp-4.2.0-models.jar edu/stanford/nlp/models/sutime/english.sutime.txt
rm -rf edu

How can I iterate token attributes with coreference results in CoreNLP?

from collections import defaultdict
from stanza.server import CoreNLPClient

client = CoreNLPClient(
    annotators=['tokenize', 'ssplit', 'pos', 'lemma', 'ner', 'coref'],
    be_quiet=False)

text = "Barack Obama was born in Hawaii.  In 2008 he became the president."

doc = client.annotate(text)

animacy = defaultdict(dict)
for x in doc.corefChain:
    for y in x.mention:
        print(y.animacy)
        for i in range(y.beginIndex, y.endIndex):
            animacy[y.sentenceIndex][i] = True
            print(y.sentenceIndex, i)

for sent_idx, sent in enumerate(doc.sentence):
    print("[Sentence {}]".format(sent_idx + 1))
    for t_idx, token in enumerate(sent.token):
        animate = animacy[sent_idx].get(t_idx, False)
        print("{:12s}\t{:12s}\t{:6s}\t{:20s}\t{}".format(token.word, token.lemma, token.pos, token.ner, animate))
    print("")

Stanford CoreNLP: Java can't find or load main class / java.ClassNotFoundException

java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -file input.txt

Issue in creating Semgrex patterns with relation names containing ":" colon

{} <</obl:from/ {}

Access server running on docker container

docker inspect keyphrase-extraction | grep IPAddress

docker run -p9000:9000 -v /home/goorulabs/NarrativeArcse-extraction/sent2vec/torontobooks_unigrams.bin:/sent2vec/pretrained_model.bin -it keyphrase-extraction

telnet 172.17.0.1 9000

Pattern matching with tregex in Stanza's CoreNLP implementation doesn't seem to find the right subtrees

(ROOT
  (NUR
    (CS
      (S (PROPN Peter) (VERB kommt))
      (CCONJ und)
      (S (PROPN Paul) (VERB geht)))))

CMake and make looking for libjawt.so file in the wrong place

find_package(JNI)

cmake -LA

cmake -DJAVA_AWT_LIBRARY=/usr/lib/jvm/default/lib/libjawt.so ..

Spring Boot Multi Module Gradle Project classpath problem: Package Not Found, Symbol not Found

...

bootJar {
    enabled = false
}

jar {
    enabled = true
}

...

Add custom rules for parsing quarters to SUTime

import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import java.util.Properties;

Properties props = new Properties();
props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
props.setProperty("ner.docDate.usePresent", "true");
// this will shut off the statistical models if you only want to run SUTime
props.setProperty("ner.rulesOnly", "true");
// add your sutime properties as in your example
...
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

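As a possible follow-up to the snippet above, this fragment sketches how the temporal values resolved by SUTime can be read back from the token-level annotations. It assumes the pipeline variable built in that snippet, your own SUTime rules, and an illustrative input sentence:

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.CoreSentence;

// Assumes `pipeline` from the snippet above is in scope.
CoreDocument doc = new CoreDocument("Revenue grew in the third quarter of 2021.");
pipeline.annotate(doc);
for (CoreSentence sentence : doc.sentences()) {
    for (CoreLabel token : sentence.tokens()) {
        // When a temporal rule matches, SUTime stores its resolved value in the
        // normalized NER tag alongside the DATE/TIME entity label.
        String normalized = token.get(CoreAnnotations.NormalizedNamedEntityTagAnnotation.class);
        if (normalized != null) {
            System.out.println(token.word() + " -> " + token.ner() + " / " + normalized);
        }
    }
}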
Extract Noun Phrases with Stanza and CoreNLPClient

from stanza.server import CoreNLPClient

# get noun phrases with tregex
def noun_phrases(_client, _text, _annotators=None):
    pattern = 'NP'
    matches = _client.tregex(_text, pattern, annotators=_annotators)
    print("\n".join(["\t" + sentence[match_id]['spanString'] for sentence in matches['sentences'] for match_id in sentence]))

# English example
with CoreNLPClient(timeout=30000, memory='16G') as client:
    englishText = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
    print('---')
    print(englishText)
    noun_phrases(client, englishText, _annotators="tokenize,ssplit,pos,lemma,parse")

# French example
with CoreNLPClient(properties='french', timeout=30000, memory='16G') as client:
    frenchText = "Je suis John."
    print('---')
    print(frenchText)
    noun_phrases(client, frenchText, _annotators="tokenize,ssplit,mwt,pos,lemma,parse")

Community Discussions

Trending Discussions on CoreNLP
  • Stanford CoreNLP - Unknown variable WORKDAY
  • NLP Pipeline, DKPro, Ruta - Missing Descriptor Error
  • Meaning of output/training status of 256 in Stanford NLP NER?
  • Is there a way to use external libraries in IntelliJ without downloading their .jars?
  • Stanford Core NLP Tree Parser Sentence Limits wrong - suggestions?
  • How can I iterate token attributes with coreference results in CoreNLP?
  • Stanford CoreNLP: Java can't find or load main class / java.ClassNotFoundException
  • Issue in creating Semgrex patterns with relation names containing ":" colon
  • Access server running on docker container
  • Pattern matching with tregex in Stanza's CoreNLP implementation doesn't seem to find the right subtrees

QUESTION

Stanford CoreNLP - Unknown variable WORKDAY

Asked 2021-Nov-20 at 19:28

I am processing some documents and I am getting many WORKDAY messages, as seen below. There's a similar issue posted here for WEEKDAY. Does anyone know how to deal with this message? I am running CoreNLP in a Java server on Windows and accessing it using a Jupyter Notebook and Python code.

[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-1-thread-7] WARN CoreNLP - java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error making document

ANSWER

Answered 2021-Nov-20 at 19:28

This is an error in the current SUTime rules file (and it's actually been there for quite a few versions). If you want to fix it immediately, you can do the following; otherwise, we'll fix it in the next release. These are Unix commands, but the same thing will work elsewhere except for how you refer to and create folders.

Find this line in sutime/english.sutime.txt and delete it. Save the file.

{ (/workday|work day|business hours/) => WORKDAY }

Then move the file to the right location for replacing in the jar file, and then replace it in the jar file. In the root directory of the CoreNLP distribution do the following (assuming you don't already have an edu file/folder in that directory):

mkdir -p edu/stanford/nlp/models/sutime
cp sutime/english.sutime.txt edu/stanford/nlp/models/sutime
jar -uf stanford-corenlp-4.2.0-models.jar edu/stanford/nlp/models/sutime/english.sutime.txt
rm -rf edu

Source https://stackoverflow.com/questions/69955279

Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

Vulnerabilities

No vulnerabilities reported

Install CoreNLP

Several times a year we distribute a new version of the software, which corresponds to a stable commit. Between releases, you can always use the latest, under-development version of the code.

Building with Ant:
  • Make sure you have Ant installed, details here: http://ant.apache.org/
  • Compile the code with this command: cd CoreNLP ; ant
  • Then run this command to build a jar with the latest version of the code: cd CoreNLP/classes ; jar -cf ../stanford-corenlp.jar edu
  • This will create a new jar called stanford-corenlp.jar in the CoreNLP folder which contains the latest code.
  • The dependencies that work with the latest code are in CoreNLP/lib and CoreNLP/liblocal, so make sure to include those in your CLASSPATH.
  • When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-models, and english-models-kbp jars and include them in your CLASSPATH. If you are processing languages other than English, also download the latest version of the models jar for the language you are interested in.

Building with Maven:
  • Make sure you have Maven installed, details here: https://maven.apache.org/
  • If you run this command in the CoreNLP directory: mvn package , it should run the tests and build this jar file: CoreNLP/target/stanford-corenlp-4.4.0.jar
  • When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-extra-models, and english-kbp-models jars and include them in your CLASSPATH. If you are processing languages other than English, also download the latest version of the models jar for the language you are interested in (see the sketch after these instructions).
  • If you want to use Stanford CoreNLP as part of a Maven project, you need to install the models jars into your Maven repository. Below is a sample command for installing the Spanish models jar; for other languages just change the language name in the command. To install stanford-corenlp-models-current.jar you will need to set -Dclassifier=models. Here is the sample command for Spanish: mvn install:install-file -Dfile=/location/of/stanford-spanish-corenlp-models-current.jar -DgroupId=edu.stanford.nlp -DartifactId=stanford-corenlp -Dversion=4.4.0 -Dclassifier=models-spanish -Dpackaging=jar
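
Once a non-English models jar is on the classpath (for example, the Spanish jar installed with the command above), a pipeline can be configured from the per-language properties file bundled in that jar (here StanfordCoreNLP-spanish.properties). The following is a minimal sketch; the class name and example sentence are illustrative choices, not part of this page:

import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.CoreSentence;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import java.io.IOException;
import java.util.Properties;

public class SpanishPipelineExample {
    public static void main(String[] args) throws IOException {
        // Load the default Spanish configuration shipped inside the Spanish models jar.
        Properties props = new Properties();
        props.load(IOUtils.readerFromString("StanfordCoreNLP-spanish.properties"));

        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        CoreDocument doc = new CoreDocument("El Museo del Prado está en Madrid.");
        pipeline.annotate(doc);

        // Print the named-entity tags assigned by the Spanish models.
        for (CoreSentence sentence : doc.sentences()) {
            System.out.println(sentence.nerTags());
        }
    }
}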

Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
