
Disinformation: Starter Kit - Terms of Service Labeling and Readability

by kandikits

Time is precious for everyone in this era of globalization. Sifting through a large number of documents is difficult and time consuming, and without an abstract or summary it can take minutes just to work out what a paper or document covers. A summarizer is an algorithm that extracts sentences from a text document, determines which are most important, and returns them in a readable, structured form.

In this challenge, we invite you to build a summarization solution for documents such as terms and conditions that preserves all the essential points of the document. You can parse the document into sections based on their headings, summarize each section individually, and then show a merged summary that captures the essence of the whole document.

Below is a sample solution kit to jumpstart your work on a simple summarizer application. To use this kit to build your own solution, scroll down to the sections Kit Deployment Instructions and Instruction to Run.

Complexity: Medium

This sample solution kit performs extractive summarization on the given document.
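The section-wise approach described above can be sketched in plain Python. This is only an illustration under stated assumptions: it scores sentences by word frequency instead of using the kit's BERT-based model, it treats any short line without closing punctuation as a heading, and the names split_sections, summarize, and summarize_document are hypothetical, not part of the kit.

```python
import re
from collections import Counter

# Tiny stop-word list for illustration only (assumption, not from the kit).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "for", "on", "that", "this", "it", "we", "you", "your", "our",
              "with", "as", "by", "be", "may", "any", "at"}


def tokenize(text):
    """Lowercase words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)


def split_sections(document):
    """Split a document into (heading, body) pairs.

    Assumed heading rule: a non-empty line of at most 8 words that does
    not end with sentence punctuation.
    """
    sections = []
    heading, body = "Preamble", []
    for line in document.splitlines():
        line = line.strip()
        if not line:
            continue
        is_heading = len(line.split()) <= 8 and line[-1] not in ".!?:;,"
        if is_heading:
            if body:
                sections.append((heading, " ".join(body)))
            heading, body = line, []
        else:
            body.append(line)
    if body:
        sections.append((heading, " ".join(body)))
    return sections


def summarize_document(document, num_sentences=1):
    """Summarize each section separately, then merge the results."""
    merged = [f"{heading}: {summarize(body, num_sentences)}"
              for heading, body in split_sections(document)]
    return "\n".join(merged)
```

Calling summarize_document(text, 1) returns one line per section, each made of the section heading followed by its top-ranked sentence. Swapping summarize for a model-based summarizer leaves the section splitting and merging unchanged.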

Development Environment

VSCode and Jupyter Notebook are used for development and debugging. Jupyter Notebook is a web-based interactive environment often used for experiments, whereas VSCode gives developers a typical IDE experience.

Exploratory Data Analysis

The libraries in this group are used for extensive analysis and exploration of data and for working with arrays. They are also used for scientific computation and data manipulation.

Text Mining

Libraries in this group are used for the analysis and processing of unstructured natural language.

Machine Learning & Natural Language Processing

The library offers state-of-the-art pre-trained models for Natural Language Processing (NLP).

Kit Solution Source

The entire solution is available as a package to download from the source code repository. Please add your kit solution or prototype source repository in this section.

Kit Deployment Instructions

For Windows OS, download, extract, and double-click the kit_installer file to install the kit.

Note: Be sure to extract the zip file before running it. The installation may take from 2 to 10 minutes, depending on bandwidth.

1. When you're prompted during the installation of the kit, press Y to launch the app automatically, then execute the cells in the notebook by selecting Cell --> Run All from the menu bar.
2. To run the app manually, press N when you're prompted and locate the zip file Text_Summarizer.zip.
3. Extract the zip file and navigate to the directory bert-extractive-summarizer-master.
4. Open a command prompt in the extracted directory bert-extractive-summarizer-master and run the command jupyter notebook.

For other operating systems:

1. Install Python.
2. Download the repository.
3. Extract the zip file and navigate to the directory bert-extractive-summarizer-master.
4. Open a terminal in the extracted directory bert-extractive-summarizer-master.
5. Install the dependencies by executing the command pip install -r requirements.txt.
6. Run the command jupyter notebook.

Instruction to Run

Follow the instructions below to run the solution.

1. Locate and open the Terms-of-service-summarizer.ipynb notebook from the Jupyter Notebook browser window.
2. Execute the cells in the notebook by selecting Cell --> Run All from the menu bar.
3. The output file summarized_text.txt is saved in the bert-extractive-summarizer-master directory, relative to the kit_installer.bat location.

To summarize your own text:

1. Open the input file sample.txt in the bert-extractive-summarizer-master directory, relative to the kit_installer.bat location.
2. Replace its contents with the text that you want to summarize.
3. Execute the cells in the notebook by selecting Cell --> Run All from the menu bar.
4. The output file summarized_text.txt is saved in the bert-extractive-summarizer-master directory, relative to the kit_installer.bat location.

Input file: sample.txt contains the content to be summarized.
Output file: summarized_text.txt contains the summarized content.
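The file flow in the steps above (read sample.txt, summarize, write summarized_text.txt) can also be sketched as a small script. This is a minimal sketch, not the notebook's code: run_summary and the summarize argument are hypothetical names, and only the two file names come from the instructions.

```python
from pathlib import Path


def run_summary(kit_dir, summarize):
    """Read sample.txt from kit_dir, summarize it, and write summarized_text.txt.

    `summarize` is any callable that takes the input text and returns the
    summary as a string (assumption: the notebook's model is not used here).
    """
    kit_dir = Path(kit_dir)
    text = (kit_dir / "sample.txt").read_text(encoding="utf-8")
    summary = summarize(text)
    output_path = kit_dir / "summarized_text.txt"
    output_path.write_text(summary, encoding="utf-8")
    return output_path
```

For example, run_summary("bert-extractive-summarizer-master", my_summarizer) would write the summary next to the input file, where my_summarizer is whatever summarization function you supply.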

Input Parameters

1. The text variable specifies the input text that you want to summarize.
2. minimum_length is the minimum length a sentence must have to be considered for the summary.
3. maximum_length is the maximum length a sentence may have to be considered for the summary.
4. sentences specifies the number of sentences in the summarized text.

You can additionally build interfaces and other enhancements for additional score.
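To make the effect of these parameters concrete, here is a minimal sketch of how they could interact. It assumes that length is measured in characters and that candidates are simply taken in document order. The kit's model ranks sentences differently, and select_sentences and its default values are hypothetical, chosen only for illustration.

```python
import re


def select_sentences(text, minimum_length=40, maximum_length=600, sentences=3):
    """Filter sentences by length, then keep at most `sentences` of them.

    Assumptions: length is counted in characters, and the first qualifying
    sentences stand in for the model's ranking.
    """
    parts = [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]
    # Drop sentences that are too short or too long to be candidates.
    candidates = [p for p in parts if minimum_length <= len(p) <= maximum_length]
    # Cap the summary at the requested number of sentences.
    return candidates[:sentences]
```

Raising minimum_length filters out fragments such as list markers or one-word lines, lowering maximum_length excludes very long clauses, and sentences caps the size of the final summary.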

Troubleshooting

1. While running the batch file, if you encounter a Windows protection alert, select More info --> Run anyway.
2. During kit installation, if you encounter a Windows security alert, click Allow.

Support

For any support, you can direct message us at #help-with-kandi-kits