vexcl | VexCL is a C++ vector expression template library | GPU library
kandi X-RAY | vexcl Summary
VexCL is a vector expression template library for OpenCL/CUDA. It was created to ease GPGPU development with C++. VexCL strives to reduce the amount of boilerplate code needed to develop GPGPU applications. The library provides convenient and intuitive notation for vector arithmetic, reductions, sparse matrix-vector products, etc. Multi-device and even multi-platform computations are supported. The source code of the library is distributed under the very permissive MIT license. See VexCL documentation at
Community Discussions
Trending Discussions on vexcl
QUESTION
Generating Gaussian random numbers using numpy turns out to be the bottleneck in my Monte Carlo simulation, where I make heavy use of PyOpenCL.
...ANSWER
Answered 2020-Mar-06 at 17:26
It seems that pyopencl already includes a random number generator:
https://github.com/inducer/pyopencl/blob/master/pyopencl/clrandom.py
Running a simple test shows that the mean and standard deviation are comparable to those of numpy's implementation. The histogram also matches the normal distribution, with negligible mean squared error.
Does anyone know further tests to check the quality of the random number generator?
Edit: According to https://documen.tician.de/pyopencl/array.html
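The simple test described in this answer (comparing the sample mean and standard deviation with the expected values) can be extended with higher moments. The following is a minimal sketch, not code from the thread: `Moments`, `sample_moments` and `draw_normal` are hypothetical names, and `std::normal_distribution` merely stands in for samples that would be copied back from the GPU generator. For a standard normal distribution the skewness and excess kurtosis should both be close to zero.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper type: the first four standardized sample moments.
struct Moments {
    double mean;
    double stddev;
    double skewness;
    double excess_kurtosis;
};

// Compute the moments of a non-empty sample in two passes.
Moments sample_moments(const std::vector<double>& v) {
    const double n = static_cast<double>(v.size());

    double mean = 0.0;
    for (double s : v) mean += s;
    mean /= n;

    // Central moments of order 2, 3 and 4 (population form, divided by n).
    double m2 = 0.0, m3 = 0.0, m4 = 0.0;
    for (double s : v) {
        const double d = s - mean;
        m2 += d * d;
        m3 += d * d * d;
        m4 += d * d * d * d;
    }
    m2 /= n;
    m3 /= n;
    m4 /= n;

    return {mean, std::sqrt(m2), m3 / std::pow(m2, 1.5), m4 / (m2 * m2) - 3.0};
}

// Assumption: a host-side generator used as a stand-in for the samples
// that the generator under test would produce.
std::vector<double> draw_normal(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    std::vector<double> v(n);
    for (auto& s : v) s = dist(gen);
    return v;
}
```

Calling `sample_moments(draw_normal(200000, 42))` and checking that all four values are near 0, 1, 0 and 0 respectively gives a slightly stronger check than mean and standard deviation alone. It is still far weaker than a dedicated statistical test suite.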
QUESTION
My question is related to the tutorial which explains how to implement boost::odeint with VexCL in order to achieve concurrency (the complete code can be found here).
The following figure shows how I think of the iterations of ODEINT:
Now I wonder which part of it exactly is parallelised in VexCL.
My impression is that the ODE part is one single task, as all equations of the ODE are within one block in the given example. Maybe the integration part runs in three parallel tasks. This results in four tasks, where (I think) the ODE task is a bottleneck (because the equations can become very large).
If this is right, I would like to know how to improve this concurrency. I think it makes sense to combine ODE and INT horizontally. This results in three tasks, each of which cannot be further reduced at this level.
...ANSWER
Answered 2020-Feb-12 at 06:19
The example you linked to does a parameter study of the Lorenz system. That is, it solves a large number of identical equations with different parameters. The state type is vex::multivector, which packs together the states (3D coordinates) of many Lorenz systems. This is an embarrassingly parallel problem, and one can apply the odeint algorithm to the state types in lock-step. That is, operations like x += tau * dt, where x and dt are large vectors, are performed on a GPU.
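The lock-step idea can be sketched in plain C++ without VexCL. This is an illustration under stated assumptions, not the tutorial's code: `State`, `lorenz_rhs` and `euler_step` are hypothetical names, the explicit Euler update replaces the odeint steppers, and the inner loops over `i` play the role of the elementwise vector operations that would run as GPU kernels. Each system has its own parameter `R[i]`, while `sigma` and `b` are shared (the values 10 and 8/3 are the usual textbook choices, assumed here).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a packed multi-component vector:
// component c of system i is stored at state[c][i].
using State = std::array<std::vector<double>, 3>;

// Right-hand side of the Lorenz system, evaluated for all systems at once.
// Iteration i touches only the data of system i, so the loop is
// embarrassingly parallel.
void lorenz_rhs(const State& x, State& dxdt, const std::vector<double>& R,
                double sigma = 10.0, double b = 8.0 / 3.0) {
    const std::size_t n = R.size();
    assert(x[0].size() == n && dxdt[0].size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        dxdt[0][i] = sigma * (x[1][i] - x[0][i]);
        dxdt[1][i] = R[i] * x[0][i] - x[1][i] - x[0][i] * x[2][i];
        dxdt[2][i] = x[0][i] * x[1][i] - b * x[2][i];
    }
}

// One explicit Euler step for every system: x += dt * dxdt, elementwise.
void euler_step(State& x, const std::vector<double>& R, double dt) {
    const std::size_t n = R.size();
    State dxdt;
    for (auto& component : dxdt) component.resize(n);

    lorenz_rhs(x, dxdt, R);

    for (std::size_t c = 0; c < 3; ++c)
        for (std::size_t i = 0; i < n; ++i)
            x[c][i] += dt * dxdt[c][i];
}
```

The point of the layout is that the time-stepping algorithm is written once, while the parallelism comes from the length of the vectors, that is, from the number of independent systems integrated together.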
More details about the odeint/VexCL implementation may be found in [1]. [2] is an interesting paper about how to extract parallelism in the case of coupled systems.
[1] Ahnert, Karsten, Denis Demidov, and Mario Mulansky. "Solving ordinary differential equations on GPUs." Numerical Computations with GPUs. Springer, Cham, 2014. 125-157. https://doi.org/10.1007/978-3-319-06548-9_7
[2] Mulansky, Mario. "Optimizing Large-Scale ODE Simulations." arXiv preprint arXiv:1412.0544 (2014).
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported