xtensor-blas | BLAS extension to xtensor
kandi X-RAY | xtensor-blas Summary
xtensor-blas is an extension to the xtensor library, offering bindings to BLAS and LAPACK through cxxblas and cxxlapack from the FLENS project. xtensor-blas currently provides non-broadcasting dot, norm (1- and 2-norms for vectors), inverse, solve, eig, cross, det, slogdet, matrix_rank, inv, cholesky, qr and svd in the xt::linalg namespace (check the corresponding xlinalg.hpp header for the function signatures). The functions and signatures aim to be 1-to-1 equivalents of their NumPy counterparts. Low-level functions to interface with BLAS or LAPACK from xtensor containers are also offered in the blas and lapack namespaces.
Community Discussions
QUESTION
I'm trying to use xtensor-blas for the first time. I've had lots of difficulty linking to it, but I finally managed and tried to run the sample programs. However, as output I get 0 for the first and 0, -inf for the second.

My environment: Windows 10 x64, CLion 2021.1; installed via anaconda: cmake 3.19.7, xtensor 0.23.4, xtensor-blas 0.19.0, openblas 0.3.13, lapack 3.6.1. Compiled using Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\bin\HostX64\x64\cl.exe.
ANSWER
Answered 2021-Apr-05 at 17:26. The problem is with the example, which uses a singular matrix. Its determinant should be 0, exactly as your code outputs. This case was discussed in a pull request for xtensor-blas, and the solution was to change the example.
QUESTION
Disclaimer: I'm a noob at building/make/packages/cmake.
My goal: use the xtensor-blas library in C++.
My env: Win10 x64, CLion 2021.
My problem: I can't get even the simplest examples to compile. Something about project dependencies.
I tried:
1) downloading and compiling OpenBLAS manually, following every tutorial I could google. I always got stuck at a different problem: either I don't have "nmake", or the build failed for some reason, or I got "undefined reference" errors, etc. I've been overwhelmed for a couple of days; a step-by-step walkthrough would be appreciated.
2) the closest I got was installing via anaconda (conda install -c conda-forge openblas), then copy-pasting the "include" directories from xtl, xtensor and xtensor-blas into my project. My CMakeLists.txt:
ANSWER
Answered 2021-Mar-31 at 08:28Disclaimer: I'm far from a Windows expert (I just use it in Continuous Integration for testing).
You should be able to use the CMake target provided by xtensor-blas, so the following should be possible (on any platform):
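Along those lines, a minimal CMakeLists.txt sketch, assuming the packages were installed into a conda environment and that CMAKE_PREFIX_PATH points at that environment so find_package can locate their config files (package and target names as exported by the libraries' CMake configs; adjust to your setup):

```cmake
cmake_minimum_required(VERSION 3.15)
project(xtb_example CXX)

set(CMAKE_CXX_STANDARD 17)

# Configure with e.g.
#   cmake -DCMAKE_PREFIX_PATH=C:/Users/you/miniconda3/envs/xtb ..
# so the conda-installed config files are found.
find_package(xtensor REQUIRED)
find_package(xtensor-blas REQUIRED)
find_package(BLAS REQUIRED)
find_package(LAPACK REQUIRED)

add_executable(example main.cpp)
target_link_libraries(example PRIVATE xtensor xtensor-blas
                      ${BLAS_LIBRARIES} ${LAPACK_LIBRARIES})
```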
QUESTION
I am a newbie in C++ and heard that libraries like Eigen, Blaze, Fastor and xtensor, with lazy evaluation and SIMD, are fast at vectorized operations. I measured the time elapsed doing some basic numeric operations with the following function:
(Fastor)
ANSWER
Answered 2020-Oct-11 at 10:40. The reason the NumPy implementation is much faster is that it does not compute the same thing as the other two. Indeed, the Python version never reads z in the expression np.sin(x) * np.cos(x). As a result, the Numba JIT is clever enough to execute the loop body only once, which explains the factor of 100 between Fastor and Numba. You can check this by replacing range(100) with range(10000000000) and observing the same timings.
Finally, xtensor is faster than Fastor in this benchmark because it seems to use its own fast SIMD implementation of exp/sin/cos, while Fastor seems to use a scalar implementation from libm, which explains the factor of 2 between xtensor and Fastor.
Answer to the update:
"Fastor/Xtensor performs really badly in exp, sin and cos, which was surprising."
No. We cannot conclude that from this benchmark. What you are comparing is the ability of compilers to optimize your code. In this case, Numba does better than plain C++ compilers because it deals with high-level SIMD-aware code, while the C++ compilers have to deal with the huge amount of low-level template-based code coming from the Fastor/xtensor libraries. Theoretically, I think it should be possible for a C++ compiler to apply the same kind of high-level optimization as Numba, but it is just harder. Moreover, note that NumPy tends to create/allocate temporary arrays, while Fastor/xtensor should not.
In practice, Numba is faster because u is a constant, and so are exp(u), sin(u) and cos(u). Thus, Numba precomputes those values (each is computed only once) and still performs the sum in the loop. The following code gives the same timing:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Install xtensor-blas (dependencies: openblas, lapack)