terp by snover (Java, version v2, Non-SPDX license)

terp is a Java library typically used in utilities, translation, and deep learning applications. It has no reported bugs or vulnerabilities, and it has low support. However, no build file is available for terp and it has a Non-SPDX license. You can download it from GitHub.

TERp is an automatic evaluation metric for Machine Translation. It takes as input a set of reference translations and a set of machine translation outputs for the same data. It aligns the MT output to the reference translations and measures the number of 'edits' needed to transform the MT output into the reference translation. TERp is an extension of TER (Translation Edit Rate) that utilizes phrasal substitutions (using automatically generated paraphrases), stemming, synonyms, relaxed shifting constraints and other improvements. TERp is named after the University of Maryland mascot, the Terrapin, so it's pronounced "terp". For a technical description of TERp, please refer to doc/terp_description.pdf.
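To make the idea of an edit rate concrete, here is a minimal sketch that counts word-level insertions, deletions, and substitutions and divides by the number of reference words. It is a deliberate simplification and not TERp's implementation: it has no shifts, no stemming, no synonym or paraphrase matching, and every edit costs 1. The class and method names (`SimpleEditRate`, `wordEdits`, `editRate`) are hypothetical and chosen only for this illustration.

```java
// Simplified illustration only: word-level edit distance divided by reference length.
// Real TER also allows block shifts, and TERp adds stem, synonym, and paraphrase matches.
final class SimpleEditRate {

    // Minimum number of word insertions, deletions, and substitutions needed
    // to turn the hypothesis into the reference (classic Levenshtein table).
    static int wordEdits(String[] hyp, String[] ref) {
        int[][] d = new int[hyp.length + 1][ref.length + 1];
        for (int i = 0; i <= hyp.length; i++) d[i][0] = i; // delete every hypothesis word
        for (int j = 0; j <= ref.length; j++) d[0][j] = j; // insert every reference word
        for (int i = 1; i <= hyp.length; i++) {
            for (int j = 1; j <= ref.length; j++) {
                int sub = d[i - 1][j - 1] + (hyp[i - 1].equals(ref[j - 1]) ? 0 : 1);
                int del = d[i - 1][j] + 1;
                int ins = d[i][j - 1] + 1;
                d[i][j] = Math.min(sub, Math.min(del, ins));
            }
        }
        return d[hyp.length][ref.length];
    }

    // Splits on whitespace; an empty or blank string yields no tokens.
    static String[] tokenize(String s) {
        String t = s.trim();
        return t.length() == 0 ? new String[0] : t.split("\\s+");
    }

    // Edit rate = number of edits / number of reference words.
    static double editRate(String hypothesis, String reference) {
        String[] hyp = tokenize(hypothesis);
        String[] ref = tokenize(reference);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) wordEdits(hyp, ref) / ref.length;
    }

    public static void main(String[] args) {
        // One missing word ("the") against a six-word reference: 1 edit / 6 words.
        System.out.println(editRate("the cat sat on mat", "the cat sat on the mat"));
    }
}
```

A lower value means the hypothesis is closer to the reference; identical strings score 0.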

            Support

              terp has a low active ecosystem.
              It has 27 stars and 11 forks. There are 4 watchers for this library.
              It had no major release in the last 12 months.
              There are 0 open issues and 3 have been closed. On average, issues are closed in 186 days. There is 1 open pull request and there are 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of terp is v2.

            Quality

              terp has 0 bugs and 0 code smells.

            Security

              terp has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              terp code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              terp has a Non-SPDX License.
              A Non-SPDX license may be an open-source license that is not SPDX-compliant, or it may not be an open-source license at all, so review it closely before use.

            Reuse

              terp releases are available to install and integrate.
              terp has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions, examples and code snippets are available.
              terp saves you 6434 person hours of effort in developing the same functionality from scratch.
              It has 13380 lines of code, 780 functions and 48 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Install terp

            These instructions are for use on a UNIX-like operating system.
            TERp requires Java version 1.5.0 or higher.
            Build TERp by running ant clean; ant in the root of the repository.
            Download and install WordNet version 3.0. (Note: if you are on OS X, and are using macports, you can simply do sudo port install wordnet.)
            Download the compressed paraphrase table (unfiltered_phrasetable.txt.gz) from the GitHub releases page to the data directory and uncompress it.
            Several shell scripts are provided to simplify the process of running TERp. To set up these scripts, run:

                bin/setup_bin.sh <PATH_TO_TERP> <PATH_TO_JAVA> <PATH_TO_WORDNET>

            where:

                <PATH_TO_TERP> points to the directory where you checked out this repository, such that <PATH_TO_TERP>/bin/setup_bin.sh exists.
                <PATH_TO_JAVA> points to the root of the Java 1.5.0+ directory, such that <PATH_TO_JAVA>/bin/java exists.
                <PATH_TO_WORDNET> points to the root of the WordNet 3 installation, such that <PATH_TO_WORDNET>/dict exists. (Note: if you are on OS X and you installed WordNet using macports with default options, you can set this to /opt/local/share/WordNet-3.0.)

            Running this script will create the following additional wrapper scripts:

                bin/terp
                bin/terpa
                bin/terp_ter
                bin/tercom
                bin/create_phrasedb
                bin/optimize_db

            and create the parameter file:

                data/data_loc.param
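Before running the setup script, it can help to confirm that the three directories satisfy the conditions stated above. The following is a minimal sketch of such a check, not part of TERp itself; the class name `SetupArgsCheck` and its method are hypothetical, and only the three path conditions come from the instructions.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check for the three arguments of bin/setup_bin.sh.
final class SetupArgsCheck {

    // Returns one message per unmet condition; an empty list means all three hold.
    static List<String> problems(String terpDir, String javaDir, String wordnetDir) {
        List<String> out = new ArrayList<String>();
        if (!new File(terpDir, "bin/setup_bin.sh").isFile()) {
            out.add("PATH_TO_TERP: missing " + terpDir + "/bin/setup_bin.sh");
        }
        if (!new File(javaDir, "bin/java").isFile()) {
            out.add("PATH_TO_JAVA: missing " + javaDir + "/bin/java");
        }
        if (!new File(wordnetDir, "dict").isDirectory()) {
            out.add("PATH_TO_WORDNET: missing directory " + wordnetDir + "/dict");
        }
        return out;
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: SetupArgsCheck <PATH_TO_TERP> <PATH_TO_JAVA> <PATH_TO_WORDNET>");
            return;
        }
        List<String> found = problems(args[0], args[1], args[2]);
        if (found.isEmpty()) {
            System.out.println("All three paths look usable.");
        } else {
            for (String p : found) {
                System.out.println(p);
            }
        }
    }
}
```

The check only looks for the files and directory named in the instructions; it does not verify the Java or WordNet versions.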
            Generate a TERp-compatible paraphrase table from the text-based paraphrase file you downloaded earlier (unfiltered_phrasetable.txt) by running:

                bin/create_phrasedb data/unfiltered_phrasetable.txt data/phrases.db

            IMPORTANT: This step could take a while and will require several gigabytes of disk space, as the text version of the phrase table is converted to a Berkeley-style database. The conversion tool also expects to have 1-3 GB of memory available. This requirement can be reduced if necessary in the bin/create_phrasedb script. This step will generate a phrase table database in data/phrases.db and will only need to be run once. Running this step again will add to the existing database, not overwrite it.

            The paraphrases used in this database were extracted using the pivot-based method (Bannard and Callison-Burch, 2005) with several additional filtering mechanisms to increase precision. The corpus used for extraction was an Arabic-English newswire bitext containing approximately 1 million sentences.
            You can run some validation experiments to test the installation. From the root of the repository, run:

                mkdir -p test/output
                ./bin/create_phrasedb test/sample.pt.txt test/sample.pt.db
                ./bin/terpa test/sample.terp.param

            This will create a small phrase database from the file test/sample.pt.txt and store that database as test/sample.pt.db. We will use this sample database for our test since using the full database will be slower. TERpA will then be run on the hypothesis and reference files in test/ with the output placed in test/output/ as specified in the test/sample.terp.param parameter file. The correct version of these output files is provided in test/correct_output/.

            Running the three commands above should yield the following output (with appropriate substitutions for local file paths):

                $> mkdir -p test/output
                $> ./bin/create_phrasedb test/sample.pt.txt test/sample.pt.db
                Converting Phrase Table from test/sample.pt.txt
                Storing Database in test/sample.pt.db
                Done adding phrases to test/sample.pt.db
                $> ./bin/terpa test/sample.terp.param
                Loading parameters from /Users/nmadnani/work/terp/data/terpa.param
                Loading parameters from /Users/nmadnani/work/terp/data/data_loc.param
                Loading test/sample.terp.param as parameter file
                "test/sample.hyp.sgm" was successfully parsed as XML
                "test/sample.ref.sgm" was successfully parsed as XML
                Creating Segment Phrase Tables From DB
                Processing [ihned.cz/2008/09/29/36559][0001]
                Processing [ihned.cz/2008/09/29/36559][0002]
                Processing [ihned.cz/2008/09/29/36559][0003]
                Processing [ihned.cz/2008/09/30/36776][0001]
                Processing [ihned.cz/2008/09/30/36776][0002]
                Processing [ihned.cz/2008/09/30/36776][0003]
                Processing [ihned.cz/2008/09/30/36776][0004]
                Processing [ihned.cz/2008/09/30/36776][0005]
                Processing [ihned.cz/2008/09/30/36776][0006]
                Finished Calculating TERp
                Total TER: 0.48 (91.13 / 188.00)
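When comparing your own run against the expected output, the final "Total TER" line is the quickest thing to check. The sketch below parses that line and confirms that the two numbers in parentheses divide to the reported score. Reading them as total edit cost over total reference words is an assumption based on the edit-rate definition, not something the output states, and the class name `TotalTerLine` is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for checking a line such as "Total TER: 0.48 (91.13 / 188.00)".
final class TotalTerLine {

    private static final String NUM = "(\\d+(?:\\.\\d+)?)";
    private static final Pattern LINE = Pattern.compile(
            "Total TER:\\s*" + NUM + "\\s*\\(\\s*" + NUM + "\\s*/\\s*" + NUM + "\\s*\\)");

    // Returns {reported score, numerator, denominator}, or null if the text does not match.
    static double[] parse(String text) {
        Matcher m = LINE.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2)),
            Double.parseDouble(m.group(3))
        };
    }

    // True when numerator / denominator matches the reported score after rounding to two decimals.
    static boolean consistent(double[] parts) {
        return Math.abs(parts[1] / parts[2] - parts[0]) < 0.005 + 1e-9;
    }

    public static void main(String[] args) {
        double[] parts = parse("Total TER: 0.48 (91.13 / 188.00)");
        System.out.println("ratio = " + (parts[1] / parts[2]));
        System.out.println("consistent = " + consistent(parts));
    }
}
```

For the sample run, 91.13 / 188.00 is roughly 0.4847, which rounds to the reported 0.48.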


