vcdiff | Heavily optimized .NET Core vcdiff library

by SnowflakePowered | C# | Version: Current | License: Apache-2.0

kandi X-RAY | vcdiff Summary

vcdiff is a C# library typically used in Big Data applications. kandi's analysis reports no bugs and no vulnerabilities; the library has a permissive license and low support. You can download it from GitHub.

This is a hard fork of VCDiff, originally written by Metric, made primarily for use in Snowflake. Large chunks have been rewritten and heavily optimized to be extremely fast, using vector intrinsics, the Memory and Span APIs, and a sprinkling of unsafe pointer access to eke out every bit of performance possible. Non-scientific preliminary testing shows a speedup of up to 30x to 50x compared to the original library when diffing a 2MB file.

Support for xdelta3 checksums has also been included. Testing was done with xdelta 3.1; support for xdelta 3.0 patch files has not been tested. Only patch files without external compression (-S none) are supported.

Wherever possible, SSE3 or AVX2 extensions are used on supported systems. Speeds are comparable to, albeit slightly slower than, native xdelta3, depending on the chosen block size. A lot of work has gone into optimizing out the overhead of garbage collection and memory access through Memory, as well as parallelizing computational work with SIMD extensions.
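
For readers new to delta encoding, the following is a minimal conceptual sketch of what a VCDIFF-style delta expresses: a target is described as a sequence of COPY instructions (reuse bytes from the source) and ADD instructions (supply new bytes). It is written in Python for brevity and is not this library's API, its matching algorithm, or the VCDIFF binary format. The function names `make_delta` and `apply_delta` are hypothetical, and Python's `difflib` is swapped in as a stand-in for a real match finder.

```python
from difflib import SequenceMatcher


def make_delta(source: bytes, target: bytes):
    """Describe `target` as COPY/ADD instructions relative to `source`.

    Hypothetical helper for illustration only: real encoders use
    hash-based block matching, not difflib.
    """
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes already present in the source: store only offset and length.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Bytes that exist only in the target: store them literally.
            ops.append(("add", target[j1:j2]))
        # "delete" needs no instruction: those source bytes are simply skipped.
    return ops


def apply_delta(source: bytes, ops) -> bytes:
    """Rebuild the target from the source and the instruction list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


if __name__ == "__main__":
    old = b"the quick brown fox jumps over the lazy dog"
    new = b"the quick red fox jumps over the lazy cat"
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
```

When source and target are very similar, most of the delta consists of short COPY instructions, which is why a patch can be far smaller than the target it reconstructs.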
            Support

              vcdiff has a low active ecosystem.
              It has 7 stars and 8 forks. There are 2 watchers for this library.
              It had no major release in the last 6 months.
              There is 1 open issue and 0 have been closed. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of vcdiff is current.

            Quality

              vcdiff has 0 bugs and 0 code smells.

            Security

              vcdiff has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              vcdiff code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              vcdiff is licensed under the Apache-2.0 License. This license is permissive.
              Permissive licenses have the fewest restrictions, and you can use them in most projects.

            Reuse

              vcdiff releases are not available. You will need to build from source code and install.
              Installation instructions are not available. Examples and code snippets are available.
              It has 409 lines of code, 0 functions and 36 files.
              It has low code complexity. Code complexity directly impacts maintainability of the code.

            vcdiff Key Features

            No Key Features are available at this moment for vcdiff.

            vcdiff Examples and Code Snippets

            No Code Snippets are available at this moment for vcdiff.

            Community Discussions

            Trending Discussions on vcdiff

            QUESTION

            Delta encoders: Using Java library in Scala
            Asked 2020-Oct-24 at 10:14

            Using Spark-based big data analysis, I have to compare data sets (text files) that are very similar (>98%) but very large. After doing some research, I found that the most efficient way could be to use delta encoders. With these I can keep a reference text and store the others as delta increments. However, I use Scala, which does not have support for delta encoders, and I am not at all conversant with Java. But as Scala is interoperable with Java, I know that it is possible to get a Java library to work in Scala.

            I found the promising implementations to be xdelta, vcdiff-java and bsdiff. With a bit more searching, I found the most interesting library, dez. The link also gives benchmarks in which it seems to perform very well, and code is free to use and looks lightweight.

            At this point, I am stuck on using this library in Scala (via sbt). I would appreciate any suggestions or references to get past this barrier, whether specific to this issue (delta encoders), to the library, or to working with Java APIs in Scala in general. Specifically, my questions are:

            1. Is there a Scala library for delta encoders that I can directly use? (If not)

            2. Is it possible that I place the class files/notzed.dez.jar in the project and let sbt provide the APIs in the Scala code?

            I am kind of stuck in this quagmire and any way out would be greatly appreciated.

            ...

            ANSWER

            Answered 2020-Oct-24 at 10:14

            There are several details to take into account. There is no problem using Java libraries directly in Scala, either as managed dependencies in sbt or as unmanaged dependencies (https://www.scala-sbt.org/1.x/docs/Library-Dependencies.html): "Dependencies in lib go on all the classpaths (for compile, test, run, and console)". You can create a fat jar with your code and dependencies using https://github.com/sbt/sbt-native-packager and distribute it with Spark Submit.

            The point here is to use these frameworks in Spark. To take advantage of Spark, you would need to split your files into blocks so that the algorithm can be distributed across the cluster for a single file. Or, if your files are compressed and each of them sits in one HDFS partition, you would need to adjust the size of the HDFS blocks, etc.

            You can also include the C modules in your project and call them via JNI, much as deep learning frameworks call native linear algebra functions. So, in essence, there is a lot to discuss about how to implement these delta algorithms in Spark.
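
            The block-splitting idea in the answer can be pictured with a small sketch. It is an illustration under stated assumptions, not the answerer's method and not Spark code: it is written in Python, the names `split_blocks`, `encode` and `decode` are hypothetical, a thread pool stands in for the cluster, and a plain "store the block only if it differs from the reference block" rule stands in for a real delta encoder. Because each block pair is processed independently, the per-block work is the part that could be distributed.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut `data` into consecutive fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def encode_block(pair):
    """Return None when the block matches the reference, else the new bytes."""
    ref_block, new_block = pair
    return None if ref_block == new_block else new_block


def encode(reference: bytes, target: bytes):
    ref_blocks = split_blocks(reference)
    new_blocks = split_blocks(target)
    # Pad the reference so every target block has a partner to compare with.
    ref_blocks += [b""] * (len(new_blocks) - len(ref_blocks))
    # Each pair is independent; the pool is a stand-in for cluster workers.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(encode_block, zip(ref_blocks, new_blocks)))


def decode(reference: bytes, encoded) -> bytes:
    ref_blocks = split_blocks(reference)
    ref_blocks += [b""] * (len(encoded) - len(ref_blocks))
    return b"".join(ref if enc is None else enc
                    for ref, enc in zip(ref_blocks, encoded))


if __name__ == "__main__":
    reference = b"aaaabbbbccccdddd"
    target = b"aaaaXbbbccccddddee"
    encoded = encode(reference, target)
    assert decode(reference, encoded) == target
```

            Fixed block boundaries are the main limitation of this sketch: a single inserted byte shifts every later block, so all of them are stored in full. Real delta encoders search for matches at arbitrary offsets, which is part of why distributing them per block needs the care the answer describes.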

            Source https://stackoverflow.com/questions/64505586

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install vcdiff

            You can download it from GitHub.

            Support

            For any new features, suggestions and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/SnowflakePowered/vcdiff.git

          • CLI

            gh repo clone SnowflakePowered/vcdiff

          • SSH

            git@github.com:SnowflakePowered/vcdiff.git
