Disinformation: Starter Kit - Terms of Service Labeling and Readability

by kandikits Updated: Jan 24, 2022

Time matters to everyone in this era of globalization. Sifting through large numbers of documents is difficult and time-consuming. Without an abstract or summary, it can take minutes just to figure out what a paper or document covers. An extractive summarizer is an algorithm that scores the sentences in a text document, determines which are most important, and returns them in a readable, structured form.

In this challenge, we invite you to build a summarization solution for documents such as terms and conditions that preserves all the essential points of the document. You can parse the sections of the document by their headings, create a summary for each section, and finally show a merged summary that conveys the essence of the whole document.

A sample solution kit is provided below to jumpstart your own simple summarizer application. To use this kit to build your own solution, scroll down to the sections Deployment Information and Instruction to Run.

Complexity: Medium

This sample solution kit performs extractive summarization on the given document.
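The section-by-section approach described above can be sketched in a few lines of plain Python. This is only an illustration, not the kit's method: it swaps in a simple word-frequency score instead of BERT, and the function names (split_sections, summarize, summarize_document) and the heading rule (a short line with no closing period) are assumptions made for the example.

```python
import re
from collections import Counter


def split_sections(document):
    """Split a document into (heading, body) pairs.

    Assumption: a heading is a short line (at most 6 words) that does not
    end with a period. Real documents may need a different rule.
    """
    sections = []
    heading, body = "Preamble", []
    for line in document.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        is_heading = len(stripped.split()) <= 6 and not stripped.endswith(".")
        if is_heading:
            if body:
                sections.append((heading, " ".join(body)))
            heading, body = stripped, []
        else:
            body.append(stripped)
    if body:
        sections.append((heading, " ".join(body)))
    return sections


def summarize(text, num_sentences=1):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)  # crude stop-word filter

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)


def summarize_document(document, num_sentences=1):
    """Summarize each section, then merge the section summaries."""
    parts = []
    for heading, body in split_sections(document):
        parts.append(f"{heading}: {summarize(body, num_sentences)}")
    return "\n".join(parts)
```

Calling summarize_document on the text of a terms-of-service page returns one line per detected section, each holding the sentences picked for that section. Replacing summarize with a BERT-based summarizer would keep the same overall structure.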

Development Environment

VSCode and Jupyter Notebook are used for development and debugging. Jupyter Notebook is a web-based interactive environment often used for experiments, whereas VSCode gives developers a typical IDE experience.

notebook by jupyter

Jupyter Notebook, 9702 stars, Version: v7.0.0a11

License: Others (Non-SPDX)

Jupyter Interactive Notebook


vscode by microsoft

TypeScript, 141760 stars, Version: 1.74.3

License: Permissive (MIT)

Visual Studio Code


Exploratory Data Analysis

These libraries are used for extensive analysis and exploration of data and for working with arrays. They are also used for scientific computation and data manipulation.

numpy by numpy

Python, 22522 stars, Version: 1.24.1

License: Permissive (BSD-3-Clause)

The fundamental package for scientific computing with Python.


pandas by pandas-dev

Python, 36647 stars, Version: 1.5.2

License: Permissive (BSD-3-Clause)

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more


Text Mining

Libraries in this group are used for the analysis and processing of unstructured natural language text.

nltk by nltk

Python, 11409 stars, Version: 3.8.1

License: Permissive (Apache-2.0)

NLTK Source


spaCy by explosion

Python, 25086 stars, Version: 3.4.4

License: Permissive (MIT)

💫 Industrial-strength Natural Language Processing (NLP) in Python


sentencepiece by google

C++, 6471 stars, Version: 0.1.97

License: Permissive (Apache-2.0)

Unsupervised text tokenizer for Neural Network-based text generation.


Machine Learning & Natural Language Processing

The libraries in this group provide machine learning tools and state-of-the-art pre-trained models for Natural Language Processing (NLP).

scikit-learn by scikit-learn

Python, 52681 stars, Version: 1.2.0

License: Permissive (BSD-3-Clause)

scikit-learn: machine learning in Python


pytorch by pytorch

C++, 62094 stars, Version: v1.13.1

License: Others (Non-SPDX)

Tensors and Dynamic neural networks in Python with strong GPU acceleration


transformers by huggingface

Python, 78856 stars, Version: 4.25.1

License: Permissive (Apache-2.0)

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.


sentence-transformers by UKPLab

Python, 9198 stars, Version: 2.2.2

License: Permissive (Apache-2.0)

Multilingual Sentence & Image Embeddings with BERT


Utilities

The tqdm library can be used to show a progress bar for any long-running step in the code.

tqdm by tqdm

Python, 23836 stars, Version: v4.64.1

License: Others (Non-SPDX)

A Fast, Extensible Progress Bar for Python and CLI
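
As a minimal sketch of how tqdm might wrap a long-running loop, the example below iterates over a list of sections and shows a progress bar while it works. The process_sections function and its uppercase step are hypothetical stand-ins for a real summarization step, and the ImportError fallback is only there so the sketch still runs when tqdm is not installed.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm: no progress bar is shown.
    def tqdm(iterable, **kwargs):
        return iterable


def process_sections(sections):
    """Apply a placeholder step to each section while showing progress."""
    results = []
    for section in tqdm(sections, desc="Summarizing"):
        results.append(section.strip().upper())  # stand-in for real work
    return results
```

Wrapping the iterable is the only change needed; the loop body stays the same.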


Testing

The libraries listed here can be used for unit testing as well as integration testing.

pytest by pytest-dev

Python, 9720 stars, Version: 7.2.0

License: Permissive (MIT)

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
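
A unit test for a summarizer helper could look like the sketch below. The split_sentences helper is hypothetical and is not part of the kit; it is included only so the tests have something to exercise. pytest collects any function whose name starts with test_ and reports a failure when an assert does not hold.

```python
import re


def split_sentences(text):
    """Hypothetical helper: naive split on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def test_split_sentences_counts_sentences():
    text = "You accept these terms. We may change them! Do you agree?"
    assert len(split_sentences(text)) == 3


def test_split_sentences_ignores_empty_input():
    assert split_sentences("   ") == []
```

Saving this as a file whose name starts with test_ and running pytest in the same directory would execute both tests.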


Kit Solution Source

bert-extractive-summarizer by dmmiller612

Python, 1115 stars, Version: 0.10.1

License: Permissive (MIT)

Easy to use extractive text summarization with BERT


Deployment Information

The entire solution is available as a package to download from the source code repository. Please add your kit solution or prototype source repository in this section.

For Windows OS:

Download, extract, and double-click the kit_installer file to install the kit. Note: Be sure to extract the zip file before running the installer. The installation may take 2 to 10 minutes, depending on bandwidth.

1. When you're prompted during the installation of the kit, press Y to launch the app automatically, then execute the cells in the notebook by selecting Cell --> Run All from the menu bar.
2. To run the app manually, press N when you're prompted and locate the zip file Text_Summarizer.zip.
3. Extract the zip file and navigate to the directory bert-extractive-summarizer-master.
4. Open a command prompt in the extracted directory bert-extractive-summarizer-master and run the command: jupyter notebook

For other operating systems:

1. Install Python.
2. Download the repository.
3. Extract the zip file and navigate to the directory bert-extractive-summarizer-master.
4. Open a terminal in the extracted directory bert-extractive-summarizer-master.
5. Install dependencies by executing the command: pip install -r requirements.txt
6. Run the command: jupyter notebook

Instruction to Run

Follow the instructions below to run the solution.

1. Locate and open the Terms-of-service-summarizer.ipynb notebook from the Jupyter Notebook browser window.
2. Execute the cells in the notebook by selecting Cell --> Run All from the menu bar.
3. The output file summarized_text.txt will be saved in the bert-extractive-summarizer-master directory under the kit_installer.bat location.

To summarize your own text:

1. Open the input file sample.txt in the bert-extractive-summarizer-master directory under the kit_installer.bat location.
2. Replace its contents with the text that you want to summarize.
3. Execute the cells in the notebook by selecting Cell --> Run All from the menu bar.
4. The output file summarized_text.txt will be saved in the bert-extractive-summarizer-master directory under the kit_installer.bat location.

Input file: sample.txt contains the content to be summarized.
Output file: summarized_text.txt contains the summarized content.
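
The read-summarize-write flow between sample.txt and summarized_text.txt can be sketched as follows. This is not the notebook's code: summarize_file and first_sentence are hypothetical names, and first_sentence is a placeholder that stands in for the BERT-based summarizer so the example stays self-contained.

```python
from pathlib import Path


def summarize_file(input_path, output_path, summarize):
    """Read text from input_path, summarize it, and write the result."""
    text = Path(input_path).read_text(encoding="utf-8")
    summary = summarize(text)
    Path(output_path).write_text(summary, encoding="utf-8")
    return summary


def first_sentence(text):
    """Placeholder summarizer: keeps only the first sentence."""
    head, sep, _ = text.strip().partition(". ")
    return head + "." if sep else head
```

With the kit's file names, a call would look like summarize_file("sample.txt", "summarized_text.txt", first_sentence), run from the bert-extractive-summarizer-master directory.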

Input Parameters

1. The text variable specifies the input text that you want to summarize.
2. minimum_length is the minimum length a sentence must have to be considered for the summary.
3. maximum_length is the maximum length a sentence may have to be considered for the summary.
4. sentences specifies the number of sentences in the summarized text.

You can additionally build interfaces and other enhancements for additional score.
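
To make the effect of these parameters concrete, here is a small sketch in plain Python. It rests on two assumptions that are not confirmed by the kit: lengths are measured in characters, and the final pick is simply the first eligible sentences. The kit relies on bert-extractive-summarizer to choose among the candidates; the select_sentences function and its default values are illustrative only.

```python
import re


def select_sentences(text, minimum_length=40, maximum_length=600, sentences=3):
    """Filter candidate sentences by length, then keep the first few.

    Assumptions: lengths are character counts, and the first `sentences`
    eligible candidates are returned (no BERT-based ranking here).
    """
    candidates = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    eligible = [s for s in candidates if minimum_length < len(s) < maximum_length]
    return eligible[:sentences]
```

Raising minimum_length drops short fragments such as headings or list stubs, while lowering maximum_length excludes run-on clauses, which are common in terms-of-service text.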

Troubleshooting

1. While running the batch file, if you encounter a Windows protection alert, select More info --> Run anyway.
2. During kit installation, if you encounter a Windows security alert, click Allow.

Support

For any support, you can direct message us at #help-with-kandi-kits