RP-DBSCAN | recent trends in big data processing

by kaist-dmlab | Java | Version: Current | License: Apache-2.0

kandi X-RAY | RP-DBSCAN Summary

RP-DBSCAN is a Java library. It has no reported vulnerabilities, a permissive license, and low support. However, RP-DBSCAN has 49 bugs and its build file is not available. You can download it from GitHub.

Following the recent trends in big data processing, several parallel DBSCAN algorithms have been reported in the literature. In most such algorithms, neighboring points are assigned to the same data partition for parallel processing, which makes it easier to calculate the density of the neighbors. This data partitioning scheme causes a few critical problems, including load imbalance between data partitions, especially in a skewed data set. To remedy these problems, we propose a cell-based data partitioning scheme, pseudo random partitioning, that randomly distributes small cells rather than the points themselves. It achieves high load balance regardless of data skewness while retaining the data contiguity required for DBSCAN. In addition, we build and broadcast a highly compact summary of the entire data set, which we call a two-level cell dictionary, to supplement random partitions. Then, we develop a novel parallel DBSCAN algorithm, Random Partitioning-DBSCAN (RP-DBSCAN for short), that uses pseudo random partitioning together with a two-level cell dictionary. The algorithm simultaneously finds the local clusters in each data partition and then merges these local clusters to obtain the global clustering. To validate the merit of our approach, we implement RP-DBSCAN on Spark and conduct extensive experiments using various real-world data sets on 12 Microsoft Azure machines (48 cores). In RP-DBSCAN, data partitioning and cluster merging are very light, and clustering on each split is not dragged out by a specific worker. Therefore, the performance results show that RP-DBSCAN significantly outperforms the state-of-the-art algorithms by up to 180 times.
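The cell-based partitioning idea described above can be illustrated with a minimal, single-machine sketch. This is not the RP-DBSCAN code from the repository: the class name `PseudoRandomPartitioningSketch`, the method names, and the choice of a cell whose diagonal equals `eps` are all assumptions made for illustration, and the two-level cell dictionary, the Spark integration, and the cluster merging step are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration only; names and cell sizing are assumptions,
// not taken from the RP-DBSCAN source code.
class PseudoRandomPartitioningSketch {

    /**
     * Maps a point to the integer grid coordinates of its cell.
     * Assumption: cells are hypercubes whose diagonal equals eps,
     * so the side length is eps / sqrt(d).
     */
    static List<Integer> cellOf(double[] point, double eps) {
        double side = eps / Math.sqrt(point.length);
        List<Integer> cell = new ArrayList<>(point.length);
        for (double v : point) {
            cell.add((int) Math.floor(v / side));
        }
        return cell;
    }

    /**
     * Groups points by cell, then assigns each whole cell (not each point)
     * to a randomly chosen partition. Points that share a cell therefore
     * always land in the same partition.
     */
    static Map<Integer, List<double[]>> partition(
            List<double[]> points, double eps, int numPartitions, long seed) {
        Map<List<Integer>, List<double[]>> cells = new HashMap<>();
        for (double[] p : points) {
            cells.computeIfAbsent(cellOf(p, eps), k -> new ArrayList<>()).add(p);
        }
        Random rng = new Random(seed);
        Map<Integer, List<double[]>> partitions = new HashMap<>();
        for (List<double[]> cellPoints : cells.values()) {
            int target = rng.nextInt(numPartitions);
            partitions.computeIfAbsent(target, k -> new ArrayList<>()).addAll(cellPoints);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        points.add(new double[] {0.1, 0.1});
        points.add(new double[] {0.2, 0.3});
        points.add(new double[] {5.0, 5.0});
        Map<Integer, List<double[]>> parts = partition(points, 1.0, 4, 42L);
        for (Map.Entry<Integer, List<double[]>> e : parts.entrySet()) {
            System.out.println("partition " + e.getKey() + ": " + e.getValue().size() + " point(s)");
        }
    }
}
```

Because the unit of distribution is a cell rather than a contiguous region, a dense area of a skewed data set is spread over many partitions, which is the load-balancing effect the description attributes to pseudo random partitioning.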

            kandi-support Support

              RP-DBSCAN has a low active ecosystem.
              It has 46 star(s) with 8 fork(s). There are 6 watchers for this library.
              It had no major release in the last 6 months.
              RP-DBSCAN has no issues reported. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of RP-DBSCAN is current.

            kandi-Quality Quality

              RP-DBSCAN has 49 bugs (33 blocker, 1 critical, 7 major, 8 minor) and 624 code smells.

            kandi-Security Security

              RP-DBSCAN has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              RP-DBSCAN code analysis shows 0 unresolved vulnerabilities.
              There are 15 security hotspots that need review.

            kandi-License License

              RP-DBSCAN is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              RP-DBSCAN releases are not available. You will need to build from source code and install.
              RP-DBSCAN has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              RP-DBSCAN saves you 1549 person hours of effort in developing the same functionality from scratch.
              It has 3449 lines of code, 208 functions and 28 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed RP-DBSCAN and discovered the below as its top functions. This is intended to give you an instant insight into RP-DBSCAN implemented functionality, and help decide if they suit your requirements.
            • Given a set of edge splits returns a list of edges
            • Adds a new binary node
            • Build minimum spanning forest
            • L2 norm
            • Equivalent to L2 norm
            • Generate the metadata with approximate approximations
            • This method is used to get the codepoint coordinates
            • Calculate the state of the polynomial with a sphere
            • Calculates the distance between a point and a sphere
            • Finds the nearest node in the polynomial
            • Calculates the nearest neighbor to the given coordinates
            • Returns the set of coordinates for a lv1 id
            • Build the neighbor search tree without the LVP
            • Get the grid coordinates for the level 1
            • Gets the index of the SVG coordinates for a given ID
            • Check if the map contains a cell
            • Creates a unique hash code for this vector
            • Gets neighbor node
            • Sets min and max values for the given partition
            • Gets the counts of two partition at the same partition
            • Sets the min and max coordinates
            • Returns a string representation of this node
            • Gets neighbor id
            • Finds the nearest neighbor of this node
            • Build the neighbor search tree
            • Main entry point

            RP-DBSCAN Key Features

            No Key Features are available at this moment for RP-DBSCAN.

            RP-DBSCAN Examples and Code Snippets

            No Code Snippets are available at this moment for RP-DBSCAN.

            Community Discussions

            No Community Discussions are available at this moment for RP-DBSCAN. Refer to the Stack Overflow page for discussions.

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install RP-DBSCAN

            You can download it from GitHub.
            You can use RP-DBSCAN like any standard Java library. Include the jar files in your classpath. You can also use any IDE to run and debug the RP-DBSCAN component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org. For Gradle installation, refer to gradle.org.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check existing answers or ask on the Stack Overflow community page.

            CLONE
          • HTTPS

            https://github.com/kaist-dmlab/RP-DBSCAN.git

          • CLI

            gh repo clone kaist-dmlab/RP-DBSCAN

          • sshUrl

            git@github.com:kaist-dmlab/RP-DBSCAN.git

            Try Top Libraries by kaist-dmlab

            SELFIE

            by kaist-dmlab (Python)

            STARE

            by kaist-dmlab (Java)

            NETS

            by kaist-dmlab (Java)

            k-Medoid

            by kaist-dmlab (Java)

            RecencyBias

            by kaist-dmlab (Python)