vulkan-kompute | General purpose GPU compute framework for cross vendor | Machine Learning library

by EthicalML | C++ | Version: v0.7.0 | License: Non-SPDX

kandi X-RAY | vulkan-kompute Summary

vulkan-kompute is a C++ library typically used in artificial intelligence, machine learning, and deep learning applications. It has no reported bugs or vulnerabilities, and it has low support. However, vulkan-kompute has a Non-SPDX license. You can download it from GitHub.


            Support

              vulkan-kompute has a low active ecosystem.
              It has 402 star(s) with 35 fork(s). There are 14 watchers for this library.
              It had no major release in the last 12 months.
              There are 51 open issues and 95 have been closed. On average issues are closed in 36 days. There are 2 open pull requests and 0 closed requests.
              It has a neutral sentiment in the developer community.
              The latest version of vulkan-kompute is v0.7.0

            Quality

              vulkan-kompute has no bugs reported.

            Security

              vulkan-kompute has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            License

              vulkan-kompute has a Non-SPDX License.
              A Non-SPDX license may be an open source license that is not SPDX-compliant, or it may not be an open source license at all, so you need to review it closely before use.

            Reuse

              vulkan-kompute releases are available to install and integrate.
              Installation instructions, examples and code snippets are available.

            vulkan-kompute Key Features

            No Key Features are available at this moment for vulkan-kompute.

            vulkan-kompute Examples and Code Snippets

            No Code Snippets are available at this moment for vulkan-kompute.

            Community Discussions

            QUESTION

            How to execute parallel compute shaders across multiple compute queues in Vulkan?
            Asked 2020-Oct-17 at 17:47

            Update: This has been solved, you can find further details here: https://stackoverflow.com/a/64405505/1889253

            A similar question was asked previously, but that question was initially focused around using multiple command buffers, and triggering the submit across different threads to achieve parallel execution of shaders. Most of the answers suggest that the solution is to use multiple queues instead. The use of multiple queues also seems to be the consensus across various blog posts and Khronos forum answers. I have attempted those suggestions running shader executions across multiple queues but without being able to see parallel execution, so I wanted to ask what I may be doing wrong. As suggested, this question includes the runnable code of multiple compute shaders being submitted to multiple queues, which hopefully can be useful for other people looking to do the same (once this is resolved).

            The current implementation is in this pull request / branch, however I will cover the main Vulkan specific points, to ensure only Vulkan knowledge is required to answer this question. It's also worth mentioning that the current use-case is specifically for compute queues and compute shaders, not graphics or transfer queues (although insights/experience achieving parallelism across those would still be very useful, and would most probably also lead to the answer).

            More specifically, I have the following:

            A couple of points that are not visible in the examples above but are important:

            • All evalAsync run on the same application, instance and device
            • Each evalAsync executes with its own separate commandBuffer and buffers, and in a separate queue
            • If you are wondering whether memory barriers could have something to do with it, we have tried removing all memoryBarriers completely (this one, for example, which runs before shader execution), but this has not made any difference to performance

            The test that is used in the benchmark can be found here, however the only key things to understand are:

            • This is the shader that we use for testing, as you can see, we just add a bunch of atomicAdd steps to increase the amount of processing time
            • Currently the test has a small buffer size and a high number of shader loop iterations, but we also tested with a large buffer size (i.e. 100,000 instead of 10) and fewer iterations (1,000 instead of 100,000,000).

            When running the test, we first run a set of "synchronous" shader executions on the same queue (the number is variable, but we've tested with 6-16, the latter being the maximum number of queues). Then we run these in an asynchronous manner, where we launch all of them and then evalAwait until they are finished. When comparing the resulting times from both approaches, they take the same amount of time even though they run across different compute queues.

            My questions are:

            • Am I currently missing something when fetching the queues?
            • Are there further parameters in the vulkan setup that need to be configured to ensure asynchronous execution?
            • Are there any restrictions I may not be aware of, such as operating system processes only being able to submit workloads to the GPU synchronously?
            • Would multithreading be required in order for parallel execution to work properly when dealing with multiple queue submissions?

            Furthermore, I have found several useful resources online across various reddit posts and Khronos Group forums that provide very in-depth conceptual and theoretical overviews of the topic, but I haven't come across end-to-end code examples that show parallel execution of shaders. If there are any practical examples out there that you can share, which have functioning parallel execution of shaders, that would be very helpful.

            If there are further details or questions that can help provide further context please let me know, happy to answer them and/or provide more detail.

            For completeness, my tests were using:

            • Vulkan SDK 1.2
            • Windows 10
            • NVIDIA 1650

            Other relevant links that have been shared in similar posts:

            ...

            ANSWER

            Answered 2020-Oct-16 at 22:18

            You are getting "asynchronous execution". You just don't expect it to behave the way it behaves.

            On a CPU, if you have one thread active, then you're using one CPU core (or hyper-thread). All of that core's execution and computation capabilities are given to your thread alone (ignoring pre-emption). But at the same time, if there are other cores, your one thread cannot use any of the computational resources of those cores. Not unless you create another thread.

            GPUs don't work that way. A queue is not like a CPU thread. It does not specifically relate to a particular quantity of computational resources. A queue is merely the interface through which commands get executed; the underlying hardware decides how to farm out commands to the various compute resources provided by the GPU as a whole.

            What generally happens when you execute a command is that the hardware attempts to fully saturate the available shader execution units using your command. If there are more shader units available than the number of invocations your operation requires, then some resources are available immediately for the next command. But if not, then the entire GPU's compute resources will be dedicated to executing the first operation; the second one must wait for resources to become available before it can start.

            It doesn't matter how many compute queues you shove work into; they're all going to try to use as many compute resources as possible. So they will largely execute in some particular order.

            Queue priority systems exist, but these mainly help determine the order of execution for commands. That is, if a high-priority queue has some commands that need to be executed, then they will take priority the next time compute resources become available for a new command.

            So submitting 3 dispatch batches on 3 separate queues is not going to complete faster than submitting 1 batch on one queue containing 3 dispatch operations.
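            The reasoning above can be sketched with a toy model. This is only an illustration under stated assumptions, not how any real driver or GPU schedules work: the `Dispatch`, `Queue` and `simulate_ticks` names are hypothetical, every work item is assumed to occupy one execution unit for one tick, and the "hardware" is assumed to greedily fill free units from whatever work is pending, regardless of which queue it came from.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy types, not part of Vulkan or Kompute.
// Each work item is assumed to occupy one execution unit for one tick.
struct Dispatch {
    std::uint64_t work_items;
};

using Queue = std::vector<Dispatch>;

// Counts the ticks needed to drain every queue on a "GPU" with `units`
// execution units. On each tick the free units are handed to pending
// dispatches in submission order, queue after queue, until none are left.
// The queues are taken by value so the caller's data is not modified.
std::uint64_t simulate_ticks(std::vector<Queue> queues, std::uint64_t units) {
    assert(units > 0);
    std::uint64_t ticks = 0;
    for (;;) {
        std::uint64_t free_units = units;
        bool any_pending = false;
        for (Queue& queue : queues) {
            for (Dispatch& dispatch : queue) {
                if (dispatch.work_items == 0) {
                    continue;  // this dispatch has already finished
                }
                any_pending = true;
                const std::uint64_t taken =
                    std::min(free_units, dispatch.work_items);
                dispatch.work_items -= taken;
                free_units -= taken;
                if (free_units == 0) {
                    break;  // the GPU is saturated for this tick
                }
            }
            if (free_units == 0) {
                break;
            }
        }
        if (!any_pending) {
            return ticks;  // nothing was left to run on this tick
        }
        ++ticks;
    }
}
```

            In this model, with 4 units, three dispatches of 8 work items each take 6 ticks whether they sit in one queue or are spread over three queues, because each dispatch saturates the units by itself. Three dispatches of 2 work items on 8 units finish in a single tick in both layouts, which corresponds to the case where spare units are available immediately for the next command.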

            The main reason multiple queues (of the same family) exist is to be able to submit work from multiple threads without having them do inter-thread synchronization (and to provide some possible prioritization of submissions).
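            As a rough sketch of that last point: in Vulkan, access to a queue during submission has to be externally synchronized, so threads sharing one queue need a lock, while threads that each own a queue do not. The example below uses a hypothetical `MockQueue` as a stand-in for a real `VkQueue` (it only records batch ids and is not thread-safe), and `submit_from_threads` is likewise an illustrative name, not a Vulkan or Kompute function.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a VkQueue. Like a real queue, it is not safe
// to submit to it from two threads at once without external locking.
struct MockQueue {
    std::vector<int> submitted;
    void submit(int batch_id) { submitted.push_back(batch_id); }
};

// Gives every thread its own queue, so no mutex is needed: no two threads
// ever touch the same MockQueue. The vector of queues is sized up front
// and never resized while the threads are running.
std::vector<MockQueue> submit_from_threads(std::size_t thread_count,
                                           int batches_per_thread) {
    assert(thread_count > 0);
    std::vector<MockQueue> queues(thread_count);
    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (std::size_t t = 0; t < thread_count; ++t) {
        workers.emplace_back([&queues, t, batches_per_thread] {
            for (int batch = 0; batch < batches_per_thread; ++batch) {
                queues[t].submit(batch);  // only thread t uses queues[t]
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return queues;
}
```

            Calling `submit_from_threads(3, 100)` leaves each of the three mock queues holding its own 100 batch ids in submission order. With a single shared queue, the same loop would need a mutex around `submit`, which is the inter-thread synchronization that separate queues avoid.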

            Source https://stackoverflow.com/questions/64384786

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install vulkan-kompute

            Below you can find a GPU multiplication example using the C++ and Python Kompute interfaces. You can join the Discord for questions/discussion, open a GitHub issue, or read the documentation.
            The build system provided uses CMake, which allows for cross-platform builds.

            Support

            If you want to run with debug layers, you can add them with the KOMPUTE_ENV_DEBUG_LAYERS parameter as:
            Find more information at:

            CLONE
          • HTTPS

            https://github.com/EthicalML/vulkan-kompute.git

          • CLI

            gh repo clone EthicalML/vulkan-kompute

          • sshUrl

            git@github.com:EthicalML/vulkan-kompute.git
