
CoreNLP | Stanford CoreNLP: A Java suite of core NLP tools. | Natural Language Processing library

by stanfordnlp | Java | Version: v4.4.0 | License: Non-SPDX

kandi X-RAY | CoreNLP Summary

CoreNLP is a Java library typically used in Institutions, Learning, Administration, Public Services, Artificial Intelligence, and Natural Language Processing applications. CoreNLP has no reported bugs or vulnerabilities, provides a build file, and has medium support. However, CoreNLP has a Non-SPDX license. You can download it from GitHub or Maven.
Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, and Spanish.

Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government. The tools variously use rule-based, probabilistic machine learning, and deep learning components.

The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v3 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.
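
As a rough illustration of running the tools from plain text, here is a minimal Java sketch of building a pipeline and annotating a document; the annotator list and example sentence are choices of this sketch, not fixed by the library:

import java.util.Properties;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class PipelineQuickStart {
    public static void main(String[] args) {
        // Choose the annotators to run; later annotators depend on earlier ones.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        // The core steps: build the pipeline, then annotate a document.
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        CoreDocument doc = new CoreDocument("Stanford University is located in California.");
        pipeline.annotate(doc);
        // Print each token with its part of speech and named-entity tag.
        doc.tokens().forEach(t ->
            System.out.println(t.word() + "\t" + t.tag() + "\t" + t.ner()));
    }
}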

Support

  • CoreNLP has a medium active ecosystem.
  • It has 8424 star(s) with 2644 fork(s). There are 501 watchers for this library.
  • There were 4 major release(s) in the last 12 months.
  • There are 170 open issues and 833 have been closed. On average, issues are closed in 613 days. There are 2 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of CoreNLP is v4.4.0.

Quality

  • CoreNLP has no bugs reported.

Security

  • CoreNLP has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

  • CoreNLP has a Non-SPDX License.
  • A Non-SPDX license may be an open source license that is simply not SPDX-compliant, or it may not be an open source license at all, so review it closely before use.

Reuse

  • CoreNLP releases are available to install and integrate.
  • Deployable package is available in Maven.
  • Build file is available. You can build the component from source.
  • Installation instructions, examples and code snippets are available.
Top functions reviewed by kandi - BETA

kandi has reviewed CoreNLP and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality CoreNLP implements and to help you decide whether it suits your requirements.

  • Get the next input.
  • Sets the option flag on the command line.
  • Makes a message pass through the message passing.
  • Extract the RVFDatum.
  • Helper function to check whether boundaries are inside bounds.
  • Performs inside the chart.
  • Convert the trees from the command line arguments.
  • Internal method used to print tree.
  • Gets the coreferent.
  • Initialize the environment.

CoreNLP Key Features

Stanford CoreNLP: A Java suite of core NLP tools.

Build Instructions

# Make sure you have git-lfs installed
# (https://git-lfs.github.com/)
git lfs install

git clone https://huggingface.co/stanfordnlp/corenlp-french

Stanford CoreNLP - Unknown variable WORKDAY

mkdir -p edu/stanford/nlp/models/sutime
cp sutime/english.sutime.txt edu/stanford/nlp/models/sutime
jar -uf stanford-corenlp-4.2.0-models.jar edu/stanford/nlp/models/sutime/english.sutime.txt
rm -rf edu

How can I iterate token attributes with coreference results in CoreNLP?

from collections import defaultdict
from stanza.server import CoreNLPClient

# Start a CoreNLP server in the background with the annotators needed for coreference.
client = CoreNLPClient(
    annotators=['tokenize','ssplit', 'pos', 'lemma', 'ner', 'coref'],
    be_quiet=False)

text = "Barack Obama was born in Hawaii.  In 2008 he became the president."

# annotate() returns a protobuf Document with sentences, tokens, and coref chains.
doc = client.annotate(text)

# Record which (sentence, token) positions are covered by a coreference mention,
# printing each mention's animacy along the way.
animacy = defaultdict(dict)
for x in doc.corefChain:
    for y in x.mention:
        print(y.animacy)
        for i in range(y.beginIndex, y.endIndex):
            animacy[y.sentenceIndex][i] = True
            print(y.sentenceIndex, i)

# Print each token's attributes together with whether it falls inside a coref mention.
for sent_idx, sent in enumerate(doc.sentence):
    print("[Sentence {}]".format(sent_idx+1))
    for t_idx, token in enumerate(sent.token):
        animate = animacy[sent_idx].get(t_idx, False)
        print("{:12s}\t{:12s}\t{:6s}\t{:20s}\t{}".format(token.word, token.lemma, token.pos, token.ner, animate))
    print("")

Stanford CoreNLP: Java can't find or load main class / java.ClassNotFoundException

java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -file input.txt

Issue in creating Semgrex patterns with relation names containing a colon (":")

{} <</obl:from/ {}
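
A small Java sketch of compiling and matching such a pattern with SemgrexPattern is shown below; the example sentence and annotators are assumptions, and whether a match is found depends on which dependency graph (basic or enhanced) the obl:from relation actually appears in. The key point is that wrapping the colon-containing relation name in slashes makes it a regex, which the Semgrex parser accepts:

import java.util.Properties;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher;
import edu.stanford.nlp.semgraph.semgrex.SemgrexPattern;

public class SemgrexColonExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        CoreDocument doc = new CoreDocument("She travelled from Boston.");
        pipeline.annotate(doc);
        SemanticGraph graph = doc.sentences().get(0).dependencyParse();
        // Slashes turn the relation name into a regex, which sidesteps the colon issue.
        SemgrexPattern pattern = SemgrexPattern.compile("{} <</obl:from/ {}");
        SemgrexMatcher matcher = pattern.matcher(graph);
        while (matcher.find()) {
            System.out.println(matcher.getMatch());
        }
    }
}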

Access server running on docker container

docker inspect keyphrase-extraction | grep IPAddress
docker run -p9000:9000 -v /home/goorulabs/NarrativeArcse-extraction/sent2vec/torontobooks_unigrams.bin:/sent2vec/pretrained_model.bin -it keyphrase-extraction
telnet 172.17.0.1 9000

Pattern matching with tregex in Stanza's CoreNLP implementation doesn't seem to find the right subtrees

copy iconCopydownload iconDownload
(ROOT
  (NUR
    (CS
      (S (PROPN Peter) (VERB kommt))
      (CCONJ und)
      (S (PROPN Paul) (VERB geht)))))
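
For comparison, the same kind of pattern can be checked directly against this tree with the Java Tregex classes; a small sketch (the pattern S < PROPN is just an illustrative choice):

import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;

public class TregexExample {
    public static void main(String[] args) {
        // Read the bracketed tree shown above.
        Tree tree = Tree.valueOf(
            "(ROOT (NUR (CS (S (PROPN Peter) (VERB kommt)) (CCONJ und) (S (PROPN Paul) (VERB geht)))))");
        // Match every S node that has a PROPN child.
        TregexPattern pattern = TregexPattern.compile("S < PROPN");
        TregexMatcher matcher = pattern.matcher(tree);
        while (matcher.findNextMatchingNode()) {
            System.out.println(matcher.getMatch());
        }
    }
}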

CMake and make looking for libjawt.so file in the wrong place

find_package(JNI)
cmake -LA
cmake -DJAVA_AWT_LIBRARY=/usr/lib/jvm/default/lib/libjawt.so ..

Spring Boot Multi Module Gradle Project classpath problem: Package Not Found, Symbol not Found

...

bootJar {
    enabled = false
}

jar {
    enabled = true
}

...

Add custom rules for parsing quarters to SUTime

props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
props.setProperty("ner.docDate.usePresent", "true");
// this will shut off the statistical models if you only want to run SUTime only
props.setProperty("ner.rulesOnly", "true");
// add your sutime properties as in your example
...
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
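
Continuing from the pipeline above, a rough sketch of reading back the values SUTime resolves, via the token-level normalized NER tags; the example sentence is an arbitrary choice of this sketch:

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.CoreDocument;

// Annotate a document with the pipeline built above and read back the
// normalized values that SUTime attaches to temporal tokens.
CoreDocument doc = new CoreDocument("The report is due next Friday.");
pipeline.annotate(doc);
for (CoreLabel tok : doc.tokens()) {
    System.out.println(tok.word() + "\t" + tok.ner() + "\t"
        + tok.get(CoreAnnotations.NormalizedNamedEntityTagAnnotation.class));
}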

Extract Noun Phrases with Stanza and CoreNLPClient

from stanza.server import CoreNLPClient

# get noun phrases with tregex
def noun_phrases(_client, _text, _annotators=None):
    pattern = 'NP'
    matches = _client.tregex(_text,pattern,annotators=_annotators)
    print("\n".join(["\t"+sentence[match_id]['spanString'] for sentence in matches['sentences'] for match_id in sentence]))

# English example
with CoreNLPClient(timeout=30000, memory='16G') as client:
    englishText = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
    print('---')
    print(englishText)
    noun_phrases(client,englishText,_annotators="tokenize,ssplit,pos,lemma,parse")

# French example
with CoreNLPClient(properties='french', timeout=30000, memory='16G') as client:
    frenchText = "Je suis John."
    print('---')
    print(frenchText)
    noun_phrases(client,frenchText,_annotators="tokenize,ssplit,mwt,pos,lemma,parse")

Community Discussions

Trending Discussions on CoreNLP
  • Stanford CoreNLP - Unknown variable WORKDAY
  • NLP Pipeline, DKPro, Ruta - Missing Descriptor Error
  • Meaning of output/training status of 256 in Stanford NLP NER?
  • Is there a way to use external libraries in IntelliJ without downloading their .jars?
  • Stanford Core NLP Tree Parser Sentence Limits wrong - suggestions?
  • How can I iterate token attributes with coreference results in CoreNLP?
  • Stanford CoreNLP: Java can't find or load main class / java.ClassNotFoundException
  • Issue in creating Semgrex patterns with relation names containing a colon (":")
  • Access server running on docker container
  • Pattern matching with tregex in Stanza's CoreNLP implementation doesn't seem to find the right subtrees

QUESTION

Stanford CoreNLP - Unknown variable WORKDAY

Asked 2021-Nov-20 at 19:28

I am processing some documents and I am getting many WORKDAY messages, as seen below. There's a similar issue posted here for WEEKDAY. Does anyone know how to deal with this message? I am running CoreNLP in a Java server on Windows and accessing it using a Jupyter Notebook and Python code.

[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-2-thread-2] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: WORKDAY
[pool-1-thread-7] WARN CoreNLP - java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error making document

ANSWER

Answered 2021-Nov-20 at 19:28

This is an error in the current SUTime rules file (and it's actually been there for quite a few versions). If you want to fix it immediately, you can do the following. Or we'll fix it in the next release. These are Unix commands, but the same thing will work elsewhere except for how you refer to and create folders.

Find this line in sutime/english.sutime.txt and delete it. Save the file.

{ (/workday|work day|business hours/) => WORKDAY }

Then move the file to the right location for replacing in the jar file, and then replace it in the jar file. In the root directory of the CoreNLP distribution do the following (assuming you don't already have an edu file/folder in that directory):

mkdir -p edu/stanford/nlp/models/sutime
cp sutime/english.sutime.txt edu/stanford/nlp/models/sutime
jar -uf stanford-corenlp-4.2.0-models.jar edu/stanford/nlp/models/sutime/english.sutime.txt
rm -rf edu

Source https://stackoverflow.com/questions/69955279

Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

Vulnerabilities

No vulnerabilities reported

Install CoreNLP

Several times a year we distribute a new version of the software, which corresponds to a stable commit. Between releases, you can always use the latest, under-development version of our code.

To build with Ant:

  • Make sure you have Ant installed; details here: http://ant.apache.org/
  • Compile the code with this command: cd CoreNLP ; ant
  • Then run this command to build a jar with the latest version of the code: cd CoreNLP/classes ; jar -cf ../stanford-corenlp.jar edu
  • This will create a new jar called stanford-corenlp.jar in the CoreNLP folder, which contains the latest code.
  • The dependencies that work with the latest code are in CoreNLP/lib and CoreNLP/liblocal, so make sure to include those in your CLASSPATH.
  • When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-models, and english-models-kbp jars and include them in your CLASSPATH. If you are processing languages other than English, also download the latest version of the models jar for the language you are interested in.

To build with Maven:

  • Make sure you have Maven installed; details here: https://maven.apache.org/
  • If you run this command in the CoreNLP directory: mvn package , it should run the tests and build this jar file: CoreNLP/target/stanford-corenlp-4.4.0.jar
  • When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-extra-models, and english-kbp-models jars and include them in your CLASSPATH. If you are processing languages other than English, also download the latest version of the models jar for the language you are interested in.
  • If you want to use Stanford CoreNLP as part of a Maven project, you need to install the models jars into your Maven repository. Below is a sample command for installing the Spanish models jar; for other languages, just change the language name in the command. To install stanford-corenlp-models-current.jar you will need to set -Dclassifier=models. Here is the sample command for Spanish (see the pipeline sketch after this list): mvn install:install-file -Dfile=/location/of/stanford-spanish-corenlp-models-current.jar -DgroupId=edu.stanford.nlp -DartifactId=stanford-corenlp -Dversion=4.4.0 -Dclassifier=models-spanish -Dpackaging=jar
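
Once a language-specific models jar is on the classpath, a pipeline for that language can be created by pointing StanfordCoreNLP at the properties file bundled in the jar. A minimal sketch for Spanish, assuming the Spanish models jar is on the classpath and provides StanfordCoreNLP-spanish.properties (the example sentence is arbitrary):

import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class SpanishPipelineExample {
    public static void main(String[] args) {
        // Loads the defaults bundled in the Spanish models jar (assumed to be on the classpath).
        StanfordCoreNLP pipeline = new StanfordCoreNLP("StanfordCoreNLP-spanish.properties");
        CoreDocument doc = new CoreDocument("Me llamo Juan y vivo en Madrid.");
        pipeline.annotate(doc);
        // Print the part-of-speech tags produced by the Spanish models.
        System.out.println(doc.sentences().get(0).posTags());
    }
}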

Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the community page at Stack Overflow.
