
CoreNLP | Stanford CoreNLP: A Java suite of core NLP tools. | Natural Language Processing library

by stanfordnlp | Java | Version: v4.4.0 | License: Non-SPDX

kandi X-RAY | CoreNLP Summary

CoreNLP is a Java library typically used in Institutions, Learning, Administration, Public Services, Artificial Intelligence, and Natural Language Processing applications. CoreNLP has no reported bugs or vulnerabilities, provides a build file, and has medium support. However, CoreNLP has a Non-SPDX license. You can download it from GitHub or Maven.
Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, and Spanish.

Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government. The tools variously use rule-based, probabilistic machine learning, and deep learning components.

The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v3 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.
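
As a rough illustration of running the tools from plain text, here is a minimal Java sketch of building a pipeline and annotating a document; the annotator list and example sentence are choices of this sketch, not fixed by the library:

import java.util.Properties;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class PipelineQuickStart {
    public static void main(String[] args) {
        // Choose the annotators to run; later annotators depend on earlier ones.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        // The core steps: build the pipeline, then annotate a document.
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        CoreDocument doc = new CoreDocument("Stanford University is located in California.");
        pipeline.annotate(doc);
        // Print each token with its part of speech and named-entity tag.
        doc.tokens().forEach(t ->
            System.out.println(t.word() + "\t" + t.tag() + "\t" + t.ner()));
    }
}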

Support

  • CoreNLP has a medium active ecosystem.
  • It has 8424 star(s) with 2644 fork(s). There are 501 watchers for this library.
  • There were 4 major release(s) in the last 12 months.
  • There are 170 open issues and 833 have been closed. On average, issues are closed in 613 days. There are 2 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of CoreNLP is v4.4.0.

Quality

  • CoreNLP has no bugs reported.

Security

  • CoreNLP has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

  • CoreNLP has a Non-SPDX License.
  • A Non-SPDX license may be an open source license that is simply not SPDX-compliant, or it may not be an open source license at all, so review it closely before use.

Reuse

  • CoreNLP releases are available to install and integrate.
  • Deployable package is available in Maven.
  • Build file is available. You can build the component from source.
  • Installation instructions, examples and code snippets are available.
Top functions reviewed by kandi - BETA

kandi has reviewed CoreNLP and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality CoreNLP implements and to help you decide whether it suits your requirements.

  • Get the next input.
  • Sets the option flag on the command line.
  • Makes a message pass through the message passing.
  • Extract the RVFDatum.
  • Helper function to check whether boundaries are inside bounds.
  • Performs inside the chart.
  • Convert the trees from the command line arguments.
  • Internal method used to print tree.
  • Gets the coreferent.
  • Initialize the environment.

CoreNLP Key Features

Stanford CoreNLP: A Java suite of core NLP tools.

Build Instructions

# Make sure you have git-lfs installed
# (https://git-lfs.github.com/)
git lfs install

git clone https://huggingface.co/stanfordnlp/corenlp-french

Stanford CoreNLP - Unknown variable WORKDAY

mkdir -p edu/stanford/nlp/models/sutime
cp sutime/english.sutime.txt edu/stanford/nlp/models/sutime
jar -uf stanford-corenlp-4.2.0-models.jar edu/stanford/nlp/models/sutime/english.sutime.txt
rm -rf edu

How can I iterate token attributes with coreference results in CoreNLP?

from collections import defaultdict
from stanza.server import CoreNLPClient

# Start a CoreNLP server in the background with the annotators needed for coreference.
client = CoreNLPClient(
    annotators=['tokenize','ssplit', 'pos', 'lemma', 'ner', 'coref'],
    be_quiet=False)

text = "Barack Obama was born in Hawaii.  In 2008 he became the president."

# annotate() returns a protobuf Document with sentences, tokens, and coref chains.
doc = client.annotate(text)

# Record which (sentence, token) positions are covered by a coreference mention,
# printing each mention's animacy along the way.
animacy = defaultdict(dict)
for x in doc.corefChain:
    for y in x.mention:
        print(y.animacy)
        for i in range(y.beginIndex, y.endIndex):
            animacy[y.sentenceIndex][i] = True
            print(y.sentenceIndex, i)

# Print each token's attributes together with whether it falls inside a coref mention.
for sent_idx, sent in enumerate(doc.sentence):
    print("[Sentence {}]".format(sent_idx+1))
    for t_idx, token in enumerate(sent.token):
        animate = animacy[sent_idx].get(t_idx, False)
        print("{:12s}\t{:12s}\t{:6s}\t{:20s}\t{}".format(token.word, token.lemma, token.pos, token.ner, animate))
    print("")

Stanford CoreNLP: Java can't find or load main class / java.ClassNotFoundException

java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -file input.txt

Issue in creating Semgrex patterns with relation names containing a colon (":")

{} <</obl:from/ {}
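
A small Java sketch of compiling and matching such a pattern with SemgrexPattern is shown below; the example sentence and annotators are assumptions, and whether a match is found depends on which dependency graph (basic or enhanced) the obl:from relation actually appears in. The key point is that wrapping the colon-containing relation name in slashes makes it a regex, which the Semgrex parser accepts:

import java.util.Properties;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher;
import edu.stanford.nlp.semgraph.semgrex.SemgrexPattern;

public class SemgrexColonExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        CoreDocument doc = new CoreDocument("She travelled from Boston.");
        pipeline.annotate(doc);
        SemanticGraph graph = doc.sentences().get(0).dependencyParse();
        // Slashes turn the relation name into a regex, which sidesteps the colon issue.
        SemgrexPattern pattern = SemgrexPattern.compile("{} <</obl:from/ {}");
        SemgrexMatcher matcher = pattern.matcher(graph);
        while (matcher.find()) {
            System.out.println(matcher.getMatch());
        }
    }
}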

Access server running on docker container

docker inspect keyphrase-extraction | grep IPAddress
docker run -p9000:9000 -v /home/goorulabs/NarrativeArcse-extraction/sent2vec/torontobooks_unigrams.bin:/sent2vec/pretrained_model.bin -it keyphrase-extraction
telnet 172.17.0.1 9000

Pattern matching with tregex in Stanza's CoreNLP implementation doesn't seem to find the right subtrees

copy iconCopydownload iconDownload
(ROOT
  (NUR
    (CS
      (S (PROPN Peter) (VERB kommt))
      (CCONJ und)
      (S (PROPN Paul) (VERB geht)))))
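
For comparison, the same kind of pattern can be checked directly against this tree with the Java Tregex classes; a small sketch (the pattern S < PROPN is just an illustrative choice):

import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;

public class TregexExample {
    public static void main(String[] args) {
        // Read the bracketed tree shown above.
        Tree tree = Tree.valueOf(
            "(ROOT (NUR (CS (S (PROPN Peter) (VERB kommt)) (CCONJ und) (S (PROPN Paul) (VERB geht)))))");
        // Match every S node that has a PROPN child.
        TregexPattern pattern = TregexPattern.compile("S < PROPN");
        TregexMatcher matcher = pattern.matcher(tree);
        while (matcher.findNextMatchingNode()) {
            System.out.println(matcher.getMatch());
        }
    }
}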

CMake and make looking for libjawt.so file in the wrong place

find_package(JNI)
cmake -LA
cmake -DJAVA_AWT_LIBRARY=/usr/lib/jvm/default/lib/libjawt.so ..

Spring Boot Multi Module Gradle Project classpath problem: Package Not Found, Symbol not Found

...

bootJar {
    enabled = false
}

jar {
    enabled = true
}

...

Add custom rules for parsing quarters to SUTime

props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
props.setProperty("ner.docDate.usePresent", "true");
// this will shut off the statistical models if you only want to run SUTime only
props.setProperty("ner.rulesOnly", "true");
// add your sutime properties as in your example
...
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
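
Continuing from the pipeline above, a rough sketch of reading back the values SUTime resolves, via the token-level normalized NER tags; the example sentence is an arbitrary choice of this sketch:

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.CoreDocument;

// Annotate a document with the pipeline built above and read back the
// normalized values that SUTime attaches to temporal tokens.
CoreDocument doc = new CoreDocument("The report is due next Friday.");
pipeline.annotate(doc);
for (CoreLabel tok : doc.tokens()) {
    System.out.println(tok.word() + "\t" + tok.ner() + "\t"
        + tok.get(CoreAnnotations.NormalizedNamedEntityTagAnnotation.class));
}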

Extract Noun Phrases with Stanza and CoreNLPClient

from stanza.server import CoreNLPClient

# get noun phrases with tregex
def noun_phrases(_client, _text, _annotators=None):
    pattern = 'NP'
    matches = _client.tregex(_text,pattern,annotators=_annotators)
    print("\n".join(["\t"+sentence[match_id]['spanString'] for sentence in matches['sentences'] for match_id in sentence]))

# English example
with CoreNLPClient(timeout=30000, memory='16G') as client:
    englishText = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
    print('---')
    print(englishText)
    noun_phrases(client,englishText,_annotators="tokenize,ssplit,pos,lemma,parse")

# French example
with CoreNLPClient(properties='french', timeout=30000, memory='16G') as client:
    frenchText = "Je suis John."
    print('---')
    print(frenchText)
    noun_phrases(client,frenchText,_annotators="tokenize,ssplit,mwt,pos,lemma,parse")

Community Discussions

Trending Discussions on CoreNLP
  • Stanford CoreNLP - Unknown variable WORKDAY
  • NLP Pipeline, DKPro, Ruta - Missing Descriptor Error
  • Meaning of output/training status of 256 in Stanford NLP NER?
  • Is there a way to use external libraries in IntelliJ without downloading their .jars?
  • Stanford Core NLP Tree Parser Sentence Limits wrong - suggestions?
  • How can I iterate token attributes with coreference results in CoreNLP?
  • Stanford CoreNLP: Java can't find or load main class / java.ClassNotFoundException
  • Issue in creating Semgrex patterns with relation names containing a colon (":")
  • Access server running on docker container
  • Pattern matching with tregex in Stanza's CoreNLP implementation doesn't seem to find the right subtrees

QUESTION

Stanford CoreNLP - Unknown variable WORKDAY

Asked 2021-Nov-20 at 19:28

I am processing some documents and I am getting many WORKDAY messages, as seen below. There's a similar issue posted here for WEEKDAY. Does anyone know how to deal with this message? I am running CoreNLP in a Java server on Windows and accessing it using a Jupyter Notebook and Python code.

[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-1-thread-7] WARN CoreNLP - java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error making document

ANSWER

Answered 2021-Nov-20 at 19:28

This is an error in the current SUTime rules file (and it's actually been there for quite a few versions). If you want to fix it immediately, you can do the following. Or we'll fix it in the next release. These are Unix commands, but the same thing will work elsewhere except for how you refer to and create folders.

Find this line in sutime/english.sutime.txt and delete it. Save the file.

{ (/workday|work day|business hours/) => WORKDAY }

Then move the file to the right location for replacing in the jar file, and then replace it in the jar file. In the root directory of the CoreNLP distribution do the following (assuming you don't already have an edu file/folder in that directory):

mkdir -p edu/stanford/nlp/models/sutime
cp sutime/english.sutime.txt edu/stanford/nlp/models/sutime
jar -uf stanford-corenlp-4.2.0-models.jar edu/stanford/nlp/models/sutime/english.sutime.txt
rm -rf edu

Source https://stackoverflow.com/questions/69955279

Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

Vulnerabilities

No vulnerabilities reported

Install CoreNLP

Several times a year we distribute a new version of the software, which corresponds to a stable commit. Between releases, you can always use the latest, under-development version of our code.

To build with Ant:

  • Make sure you have Ant installed; details here: http://ant.apache.org/
  • Compile the code with this command: cd CoreNLP ; ant
  • Then run this command to build a jar with the latest version of the code: cd CoreNLP/classes ; jar -cf ../stanford-corenlp.jar edu
  • This will create a new jar called stanford-corenlp.jar in the CoreNLP folder, which contains the latest code.
  • The dependencies that work with the latest code are in CoreNLP/lib and CoreNLP/liblocal, so make sure to include those in your CLASSPATH.
  • When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-models, and english-models-kbp jars and include them in your CLASSPATH. If you are processing languages other than English, also download the latest version of the models jar for the language you are interested in.

To build with Maven:

  • Make sure you have Maven installed; details here: https://maven.apache.org/
  • If you run this command in the CoreNLP directory: mvn package , it should run the tests and build this jar file: CoreNLP/target/stanford-corenlp-4.4.0.jar
  • When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-extra-models, and english-kbp-models jars and include them in your CLASSPATH. If you are processing languages other than English, also download the latest version of the models jar for the language you are interested in.
  • If you want to use Stanford CoreNLP as part of a Maven project, you need to install the models jars into your Maven repository. Below is a sample command for installing the Spanish models jar; for other languages, just change the language name in the command. To install stanford-corenlp-models-current.jar you will need to set -Dclassifier=models. Here is the sample command for Spanish (see the pipeline sketch after this list): mvn install:install-file -Dfile=/location/of/stanford-spanish-corenlp-models-current.jar -DgroupId=edu.stanford.nlp -DartifactId=stanford-corenlp -Dversion=4.4.0 -Dclassifier=models-spanish -Dpackaging=jar
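
Once a language-specific models jar is on the classpath, a pipeline for that language can be created by pointing StanfordCoreNLP at the properties file bundled in the jar. A minimal sketch for Spanish, assuming the Spanish models jar is on the classpath and provides StanfordCoreNLP-spanish.properties (the example sentence is arbitrary):

import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class SpanishPipelineExample {
    public static void main(String[] args) {
        // Loads the defaults bundled in the Spanish models jar (assumed to be on the classpath).
        StanfordCoreNLP pipeline = new StanfordCoreNLP("StanfordCoreNLP-spanish.properties");
        CoreDocument doc = new CoreDocument("Me llamo Juan y vivo en Madrid.");
        pipeline.annotate(doc);
        // Print the part-of-speech tags produced by the Spanish models.
        System.out.println(doc.sentences().get(0).posTags());
    }
}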

Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the community page at Stack Overflow.
