gdrcopy | fast GPU memory copy library based on NVIDIA GPUDirect RDMA | GPU library
kandi X-RAY | gdrcopy Summary
While GPUDirect RDMA is meant for direct access to GPU memory from third-party devices, the same APIs can be used to create perfectly valid CPU mappings of GPU memory. The advantage of a CPU-driven copy is its very small overhead, which can be useful when low latencies are required.
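To make the CPU-mapping idea concrete, here is a minimal sketch of how the user-space API declared in gdrapi.h is typically used for a low-latency, CPU-driven write into GPU memory. It is a sketch only: error checking and page-alignment handling are omitted, and the calls (gdr_open, gdr_pin_buffer, gdr_map, gdr_copy_to_mapping) should be checked against the installed header.

// Minimal sketch: CPU-driven copy into GPU memory through a gdrcopy mapping.
// Assumes the gdrdrv kernel module is loaded; error handling omitted.
#include <cuda_runtime.h>
#include <gdrapi.h>

int main(void)
{
    const size_t size = GPU_PAGE_SIZE;          // keep the buffer one GPU page for simplicity
    void *d_buf;
    cudaMalloc(&d_buf, size);                   // GPU buffer to be mapped into the CPU address space

    gdr_t g = gdr_open();                       // open the gdrdrv device
    gdr_mh_t mh;
    gdr_pin_buffer(g, (unsigned long)d_buf, size, 0, 0, &mh);   // pin the GPU pages

    void *map_ptr;
    gdr_map(g, mh, &map_ptr, size);             // create the CPU mapping of the GPU buffer

    char host_data[64] = "hello from the CPU";
    gdr_copy_to_mapping(mh, map_ptr, host_data, sizeof(host_data));  // small, low-latency CPU-driven write

    gdr_unmap(g, mh, map_ptr, size);
    gdr_unpin_buffer(g, mh);
    gdr_close(g);
    cudaFree(d_buf);
    return 0;
}

A real application would round the pinned address down to GPU_PAGE_SIZE alignment and adjust offsets accordingly, and would link against libgdrapi and the CUDA runtime.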
Community Discussions
Trending Discussions on gdrcopy
QUESTION
I'm using OpenMPI and I need to enable CUDA-aware MPI. Together with MPI I'm using OpenACC with the hpc_sdk software.
Following https://www.open-mpi.org/faq/?category=buildcuda, I downloaded and installed UCX (not gdrcopy, which I haven't managed to install) with
./contrib/configure-release --with-cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/20.7/cuda/11.0 CC=pgcc CXX=pgc++ --disable-fortran
and it prints:
...

ANSWER
Answered 2020-Oct-09 at 20:15
This was an issue in the 20.7 release when adding UCX support. You can lower the optimization level to -O1 to work around the problem, or update your NV HPC compiler version to 20.9, where we've resolved the issue.
https://developer.nvidia.com/nvidia-hpc-sdk-version-209-downloads
QUESTION
I'm trying to get an MPI-CUDA program working with MVAPICH and CUDA 8. I previously ran the program successfully with OpenMPI, but I want to test whether I get better performance with MVAPICH. Unfortunately, with MVAPICH the program gets stuck in MPI_Isend if a CUDA kernel is running at the same time.
I downloaded MVAPICH2-2.2 and built it from source with the configuration flags
--enable-cuda --disable-mcast
to enable MPI calls on CUDA memory. mcast was disabled because I could not compile MVAPICH2 without that flag.
I used the following flags before running the application:
...

ANSWER
Answered 2017-Mar-20 at 16:11
I got back to this problem and used gdb to debug the code.
Apparently, the problem is the eager protocol of MVAPICH2, implemented in src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c. The eager protocol uses a synchronous cudaMemcpy, which blocks until the kernel execution finishes.
The program posted in the question runs fine when MV2_IBA_EAGER_THRESHOLD 1 is passed to mpirun. This prevents MPI from using the eager protocol and makes it fall back to the rendezvous protocol instead.
Patching the MVAPICH2 source code solves the problem as well. I changed the synchronous cudaMemcpy calls to cudaMemcpyAsync in the files
- src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c
- src/mpid/ch3/channels/mrail/src/gen2/ibv_recv.c
- src/mpid/ch3/src/ch3u_request.c
The change in the third file is only needed for MPI_Isend/MPI_Irecv. Other MPI functions might need some additional code changes.
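The distinction between the blocking and non-blocking copy paths can be illustrated with a small standalone CUDA program. This is not the MVAPICH2 code, only a sketch of the behavior described above: a synchronous cudaMemcpy issued while a kernel is running on the default stream makes the host wait for the kernel, while a cudaMemcpyAsync on a separate non-blocking stream does not.

// Illustration (not MVAPICH2 source): synchronous vs. asynchronous host-to-device copies
// while a kernel is busy on the default stream.
#include <cuda_runtime.h>

__global__ void busy_kernel(float *d, int n)
{
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        d[i] += 1.0f;
}

int main(void)
{
    const int n = 1 << 20;
    float *d_work, *d_dst, *h_src;
    cudaMalloc(&d_work, n * sizeof(float));
    cudaMalloc(&d_dst, n * sizeof(float));
    cudaMallocHost(&h_src, n * sizeof(float));        // pinned host memory, required for true async copies

    cudaStream_t copy_stream;
    cudaStreamCreateWithFlags(&copy_stream, cudaStreamNonBlocking);

    busy_kernel<<<1, 256>>>(d_work, n);               // kernel running on the default stream

    // Blocking path: serializes with the default stream, so the host waits for the kernel, then the copy.
    // cudaMemcpy(d_dst, h_src, n * sizeof(float), cudaMemcpyHostToDevice);

    // Non-blocking path (the general idea behind the cudaMemcpyAsync patch described above):
    cudaMemcpyAsync(d_dst, h_src, n * sizeof(float), cudaMemcpyHostToDevice, copy_stream);
    cudaStreamSynchronize(copy_stream);               // wait only for the copy, not for the default stream

    cudaStreamDestroy(copy_stream);
    cudaFreeHost(h_src);
    cudaFree(d_dst);
    cudaFree(d_work);
    return 0;
}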
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported