HIPIFY | HIPIFY: Convert CUDA to Portable C++ Code | GPU library

 by ROCm-Developer-Tools | C++ | Version: rocm-5.5.1 | License: MIT

kandi X-RAY | HIPIFY Summary

HIPIFY is a C++ library typically used in hardware and GPU applications. It has no reported bugs or vulnerabilities, a permissive license, and low support. You can download it from GitHub.

hipify-clang is a clang-based tool for translating CUDA sources into HIP sources. It translates CUDA source into an abstract syntax tree, which is traversed by transformation matchers. After applying all the matchers, the output HIP source is produced.
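To make the CUDA-to-HIP mapping concrete, here is a minimal sketch of the kind of renaming the translation produces. It is a toy text-level substitution, not hipify-clang's method (which works on a clang abstract syntax tree); the helper name `toy_hipify` and the small table are hypothetical and cover only a few well-known runtime names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy illustration only: hipify-clang matches on a clang AST, not on raw text.
// The table lists a handful of CUDA runtime names and their HIP counterparts.
static const std::unordered_map<std::string, std::string> kCudaToHip = {
    {"cudaMalloc", "hipMalloc"},
    {"cudaFree", "hipFree"},
    {"cudaMemcpy", "hipMemcpy"},
    {"cudaMemcpyHostToDevice", "hipMemcpyHostToDevice"},
    {"cudaDeviceSynchronize", "hipDeviceSynchronize"},
};

// Hypothetical helper: replaces whole identifiers that appear in the table
// and copies everything else through unchanged.
std::string toy_hipify(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isalpha(c) || c == '_') {
            // Scan to the end of the identifier.
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_')) {
                ++j;
            }
            const std::string ident = src.substr(i, j - i);
            const auto it = kCudaToHip.find(ident);
            out += (it != kCudaToHip.end()) ? it->second : ident;
            i = j;
        } else {
            out += src[i];
            ++i;
        }
    }
    return out;
}
```

Calling `toy_hipify` on a line such as `cudaFree(p);` returns the same line with the HIP name substituted. Because only whole identifiers are matched, a name like `mycudaMalloc` is left alone; an AST-based tool can go further and distinguish, for example, a user function that merely shares a name with a CUDA API.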

            kandi-support Support

              HIPIFY has a low active ecosystem.
              It has 251 star(s) with 48 fork(s). There are 20 watchers for this library.
              There were 2 major release(s) in the last 12 months.
              There are 22 open issues and 174 have been closed. On average issues are closed in 42 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of HIPIFY is rocm-5.5.1

            kandi-Quality Quality

              HIPIFY has 0 bugs and 0 code smells.

            kandi-Security Security

              HIPIFY has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              HIPIFY code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              HIPIFY is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              HIPIFY releases are available to install and integrate.
              Installation instructions are not available. Examples and code snippets are available.

            Community Discussions

            QUESTION

            What are the requirements for using `shfl` operations on AMD GPU using HIP C++?
            Asked 2017-Jul-17 at 05:03

            AMD HIP C++ is very similar to CUDA C++. AMD also created Hipify to convert CUDA C++ to HIP C++ (portable C++ code), which can be executed on both NVIDIA and AMD GPUs: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP

            Requirement for NVIDIA

            Please make sure you have a device with compute capability 3.0 or higher in order to use warp shfl operations, and add the -gencode arch=compute_30,code=sm_30 nvcc flag in the Makefile while using this application.

            In addition, HIP defines portable mechanisms to query architectural features, and supports a larger 64-bit wavesize which expands the return type for cross-lane functions like ballot and shuffle from 32-bit ints to 64-bit ints.
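The point about return types can be illustrated with a CPU-side model of a ballot. This is a sketch, not the HIP API: `simulate_ballot` is a hypothetical helper that sets bit i of the result when lane i's predicate is true, which shows why 64 lanes need a 64-bit result.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side model of a ballot across the lanes of one wavefront.
// Bit i of the result is set when lane i's predicate is true.
// Hypothetical helper for illustration; lanes beyond 64 are ignored.
std::uint64_t simulate_ballot(const std::vector<bool>& predicate) {
    std::uint64_t mask = 0;
    for (std::size_t lane = 0; lane < predicate.size() && lane < 64; ++lane) {
        if (predicate[lane]) {
            mask |= (std::uint64_t{1} << lane);
        }
    }
    return mask;
}
```

With 64 lanes, a vote from lane 63 lands in bit 63, which a 32-bit integer cannot represent. Code ported from a 32-lane target therefore should not store ballot results in a 32-bit variable.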

            But which AMD GPUs support the shfl functions? Or does every AMD GPU support shfl, because on AMD GPUs it is implemented through local memory rather than a hardware register-to-register instruction?

            NVIDIA GPUs require compute capability (CUDA CC) 3.0 or higher, but what are the requirements for using shfl operations on AMD GPUs using HIP C++?

            ...

            ANSWER

            Answered 2017-Mar-06 at 14:35
            1. Yes, GCN3 GPUs have new instructions such as ds_bpermute and ds_permute, which provide the functionality of __shfl() and more.

            2. The ds_bpermute and ds_permute instructions use only the routing hardware of local memory (LDS, 8.6 TB/s) and do not actually use local memory itself. This accelerates data exchange between threads: 8.6 TB/s < speed < 51.6 TB/s: http://gpuopen.com/amd-gcn-assembly-cross-lane-operations/

            They use LDS hardware to route data between the 64 lanes of a wavefront, but they don’t actually write to an LDS location.

            3. There are also Data-Parallel Primitives (DPP), which are especially powerful when you can use them, since an operation can read the registers of neighboring workitems directly. That is, DPP can access a neighboring thread (workitem) at full speed, about 51.6 TB/s.

            http://gpuopen.com/amd-gcn-assembly-cross-lane-operations/

            now, most of the vector instructions can do cross-lane reading at full throughput.

            For example, the wave_shr instruction (wavefront shift right) can be used for a scan algorithm.
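A shift-right across lanes is the basic step of a Hillis-Steele style scan. The CPU-side sketch below models it under stated assumptions: `shift_right` and `inclusive_scan_by_shifts` are hypothetical helpers, the shift distance doubles each step, and nothing here reproduces the exact semantics of the wave_shr instruction itself.

```cpp
#include <cstddef>
#include <vector>

// Each lane receives the value held by the lane `distance` positions to its
// left. Lanes with no source lane receive `fill`.
std::vector<int> shift_right(const std::vector<int>& lanes,
                             std::size_t distance, int fill) {
    std::vector<int> out(lanes.size(), fill);
    for (std::size_t i = distance; i < lanes.size(); ++i) {
        out[i] = lanes[i - distance];
    }
    return out;
}

// Inclusive prefix sum built from repeated shift-and-add steps with
// distances 1, 2, 4, ... (Hillis-Steele scan).
std::vector<int> inclusive_scan_by_shifts(std::vector<int> lanes) {
    for (std::size_t d = 1; d < lanes.size(); d *= 2) {
        const std::vector<int> shifted = shift_right(lanes, d, 0);
        for (std::size_t i = 0; i < lanes.size(); ++i) {
            lanes[i] += shifted[i];
        }
    }
    return lanes;
}
```

For the input {1, 2, 3, 4}, the first step (distance 1) gives {1, 3, 5, 7} and the second step (distance 2) gives the inclusive prefix sums {1, 3, 6, 10}. On a GPU each step would be a single cross-lane operation over the whole wavefront rather than a loop.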

            More about GCN3: https://github.com/olvaffe/gpu-docs/raw/master/amd-open-gpu-docs/AMD_GCN3_Instruction_Set_Architecture.pdf

            New Instructions

            • “SDWA” – Sub Dword Addressing allows access to bytes and words of VGPRs in VALU instructions.
            • “DPP” – Data Parallel Processing allows VALU instructions to access data from neighboring lanes.
            • DS_PERMUTE_RTN_B32, DS_BPERMUTE_RTN_B32.

            ...

            DS_PERMUTE_B32 Forward permute. Does not write any LDS memory.
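The difference between a forward permute (each lane pushes its value to a destination lane) and a backward permute (each lane pulls from a source lane) can be sketched on the CPU. The helpers below are hypothetical and index lanes directly; the operand encoding of the real instructions and their behavior when several lanes target the same destination are not modeled.

```cpp
#include <cstddef>
#include <vector>

// Backward permute ("pull"): lane i reads the value held by lane src_lane[i].
// Out-of-range indices leave the lane at 0 in this sketch.
std::vector<int> backward_permute(const std::vector<int>& data,
                                  const std::vector<std::size_t>& src_lane) {
    std::vector<int> out(data.size(), 0);
    for (std::size_t i = 0; i < data.size() && i < src_lane.size(); ++i) {
        if (src_lane[i] < data.size()) {
            out[i] = data[src_lane[i]];
        }
    }
    return out;
}

// Forward permute ("push"): lane i sends its value to lane dst_lane[i].
// Lanes that nobody writes to stay at 0 in this sketch; if several lanes
// target the same destination, the loop order decides, which is an artifact
// of the simulation and not a statement about the hardware.
std::vector<int> forward_permute(const std::vector<int>& data,
                                 const std::vector<std::size_t>& dst_lane) {
    std::vector<int> out(data.size(), 0);
    for (std::size_t i = 0; i < data.size() && i < dst_lane.size(); ++i) {
        if (dst_lane[i] < data.size()) {
            out[dst_lane[i]] = data[i];
        }
    }
    return out;
}
```

With data {10, 20, 30, 40} and the index vector {1, 2, 3, 0}, the backward permute yields {20, 30, 40, 10} and the forward permute yields {40, 10, 20, 30}. A `__shfl()`-style read from an arbitrary lane corresponds to the pull form.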

            Source https://stackoverflow.com/questions/42468984

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install HIPIFY

            You can download it from GitHub.

            Support

            To generate documentation covering all supported CUDA APIs in Markdown format, run hipify-clang --md, optionally specifying an output directory with -o.

            CLONE
          • HTTPS

            https://github.com/ROCm-Developer-Tools/HIPIFY.git

          • CLI

            gh repo clone ROCm-Developer-Tools/HIPIFY

          • SSH

            git@github.com:ROCm-Developer-Tools/HIPIFY.git

            Explore Related Topics

            Consider Popular GPU Libraries

            taichi

            by taichi-dev

            gpu.js

            by gpujs

            hashcat

            by hashcat

            cupy

            by cupy

            EASTL

            by electronicarts

            Try Top Libraries by ROCm-Developer-Tools

            HIP

            by ROCm-Developer-Tools (C++)

            HIP-Examples

            by ROCm-Developer-Tools (C++)

            aomp

            by ROCm-Developer-Tools (C)

            rocprofiler

            by ROCm-Developer-Tools (C++)

            HIP-CPU

            by ROCm-Developer-Tools (C++)