matmul | Benchmarking matrix multiplication implementations | Math library

by attractivechaos | C | Version: Current | License: No License

kandi X-RAY | matmul Summary

matmul is a C library typically used in utility and math applications. It has no reported bugs or vulnerabilities, and it has low support. You can download it from GitHub.

This repo evaluates different matrix multiplication implementations on two large square matrices (2000-by-2000 in the repo's own example).
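The repo's own implementations and timings are not reproduced on this page. As a rough illustration of what comparing implementations can look like, here is a minimal pure-Python sketch. The function names, the test-matrix generator and the matrix size are assumptions made for this example and do not come from the repo, which is written in C.

```python
import time


def matmul_naive(a, b):
    """Textbook i-j-k triple loop; the inner loop walks down a column of b."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_transposed(a, b):
    """Transpose b first so the inner loop walks two rows instead."""
    bt = [list(col) for col in zip(*b)]
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def make_matrix(n, seed):
    """Deterministic n-by-n test matrix (hypothetical generator)."""
    return [[((i * n + j) * seed % 7) / 7.0 for j in range(n)] for i in range(n)]


def benchmark(fn, a, b):
    """Return (elapsed seconds, result) for one call of fn(a, b)."""
    start = time.perf_counter()
    c = fn(a, b)
    return time.perf_counter() - start, c


if __name__ == "__main__":
    n = 100  # far smaller than 2000 so the pure-Python loops finish quickly
    a, b = make_matrix(n, 3), make_matrix(n, 5)
    for fn in (matmul_naive, matmul_transposed):
        elapsed, _ = benchmark(fn, a, b)
        print(f"{fn.__name__}: {elapsed:.3f} s")
```

Both functions compute the same product; only the memory access pattern of the inner loop differs, which is the kind of variation a benchmark like this is meant to expose.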

Support

matmul has a low active ecosystem.

It has 83 stars and 27 forks. There are 9 watchers for this library.

It had no major release in the last 6 months.

There are 2 open issues and 0 have been closed. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of matmul is current.

Quality

matmul has no bugs reported.

Security

matmul has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

matmul does not have a standard license declared.

Check the repository for any license declaration and review the terms closely.

Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

matmul releases are not available. You will need to build from source code and install.

Installation instructions are not available. Examples and code snippets are available.

matmul Key Features

No Key Features are available at this moment for matmul.

matmul Examples and Code Snippets

Perform matmul.

python

Lines of Code : 62

License : Non-SPDX (Apache License 2.0)

def _matmul_3d_with_map_fn(a, b, **kwargs):
  """Multiplies batches of 2D matrices using map_fn.

  `output[n, i, k]` = sum_j (a[n, i, j] * b[n, j, k])` (for all `n`, `i`, `k`).

  Requires that `a[n, i].nrows()` == `b[n].nrows()` (for all `n` and `i

python

Lines of Code : 23

License : Non-SPDX (Apache License 2.0)

def __call__(self, matmul_fn):
    """Perform the Matmul registration.

    Args:
      matmul_fn: The function to use for the Matmul.

    Returns:
      matmul_fn

    Raises:
      TypeError: if matmul_fn is not a callable.
      ValueError: if a

Creates a scaled matmul.

python

Lines of Code : 15

License : Non-SPDX (Apache License 2.0)

def create_large_matmul_savedmodel(out_dir):
  """Create a SavedModel that performs a large matmul."""
  root = autotrackable.AutoTrackable()
  root.f = def_function.function(
      lambda x, y: math_ops.matmul(x, y),  # pylint: disable=unnecessary-l

Community Discussions

Trending Discussions on matmul

Multiply a 3d tensor with a 2d matrix using torch.matmul

what if the size of training set is not the integer multiple of batch size

Why is the GNU scientific library matrix multiplication slower than numpy.matmul?

Change in Keras.applications source code results in error in missing variable from localhost

Pytorch TypeError: forward() takes 2 positional arguments but 4 were given

ValueError: Dimensions must be equal, but are 512 and 1024

Calculate `dot` product of image and vector

PyTorch: Error>> expected scalar type float but found double

Problem with this "minimalistic" python packaging that has an import in source code

Performance issue with Scipy's solve_bvp and coupled differential equations

QUESTION

Multiply a 3d tensor with a 2d matrix using torch.matmul

Asked 2021-Jun-09 at 15:48

I have two tensors in PyTorch, z is a 3d tensor of shape (n_samples, n_features, n_views) in which n_samples is the number of samples in the dataset, n_features is the number of features for each sample, and n_views is the number of different views that describe the same (n_samples, n_features) feature matrix, but with other values.

I have another 2d tensor b, of shape (n_samples, n_views), whose purpose is to rescale all the features of the samples across the different views. In other words, it encapsulates the importance of the features of each view for the same sample. For example:

...

ANSWER

Answered 2021-Jun-09 at 15:48

Yes, that's possible. If you have multiple batch dimensions in both operands, you can use broadcasting. In this case the last two dimensions of each operand are interpreted as a matrix size. (I recommend looking it up in the documentation.)

So you need an additional dimension for your vectors b, to make each of them an n x 1 "matrix" (column vector):
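The answer's own code is elided on this page. A minimal sketch of the idea follows, using NumPy as a stand-in because np.matmul follows the same batched-matrix broadcasting convention; the sizes and variable names are assumptions. In PyTorch the analogous calls would be b.unsqueeze(-1) and torch.matmul(z, ...).

```python
import numpy as np

n_samples, n_features, n_views = 4, 5, 3  # illustrative sizes (assumed)
rng = np.random.default_rng(0)
z = rng.standard_normal((n_samples, n_features, n_views))
b = rng.standard_normal((n_samples, n_views))

# Add a trailing axis so each b[n] becomes an (n_views, 1) column vector.
b_col = b[:, :, np.newaxis]  # shape (n_samples, n_views, 1)

# matmul treats the last two axes as matrices and batches over the first axis.
out = np.matmul(z, b_col)  # shape (n_samples, n_features, 1)
out = out.squeeze(-1)  # shape (n_samples, n_features)

# Reference result from an explicit per-sample matrix-vector product.
expected = np.stack([z[n] @ b[n] for n in range(n_samples)])
print(out.shape, np.allclose(out, expected))
```

The trailing singleton axis is what makes each row of b a matrix operand; without it, matmul would try to broadcast b as a single 2d matrix against every sample.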

Source https://stackoverflow.com/questions/67872716

QUESTION

what if the size of training set is not the integer multiple of batch size

Asked 2021-Jun-09 at 05:18

I am running the following code against the PV_Elec_Gas3.csv dataset. The network architecture is designed as follows:

...

ANSWER

Answered 2021-Jun-09 at 05:18

NO!!!!

In your forward method you call x.view(-1) before passing x to an nn.Linear layer. This "flattens" not only the spatial dimensions of x, but also the batch dimension! You basically mix together all samples in the batch, making your model dependent on the batch size and, in general, making the predictions depend on the batch as a whole rather than on the individual data points.

Instead, you should:
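The answer's own code is elided on this page. A minimal sketch of the difference follows, using NumPy reshape as a stand-in for view; the shapes and the weight matrix are assumptions. In PyTorch, x.view(x.size(0), -1) or torch.flatten(x, start_dim=1) would keep the batch axis in the same way.

```python
import numpy as np

batch_size, channels, length = 8, 2, 6  # illustrative shapes (assumed)
x = np.arange(batch_size * channels * length, dtype=np.float32)
x = x.reshape(batch_size, channels, length)

# Flattening everything merges the batch axis into the feature axis.
flat_all = x.reshape(-1)  # shape (96,)

# Flattening per sample keeps one row per data point.
flat_per_sample = x.reshape(x.shape[0], -1)  # shape (8, 12)

# A hypothetical linear layer with 12 inputs and 4 outputs.
weights = np.ones((channels * length, 4), dtype=np.float32)

# A smaller final batch (3 samples here) still works with the same weights,
# because the feature dimension no longer depends on the batch size.
last_batch = x[:3]
out_last = last_batch.reshape(last_batch.shape[0], -1) @ weights
print(flat_all.shape, flat_per_sample.shape, out_last.shape)
```

With per-sample flattening, a training set whose size is not a multiple of the batch size only produces a shorter final batch, not a shape mismatch.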

Source https://stackoverflow.com/questions/67896539

QUESTION

Why is the GNU scientific library matrix multiplication slower than numpy.matmul?

Asked 2021-Jun-06 at 19:52

Why is it that the matrix multiplication with Numpy is much faster than gsl_blas_sgemm from GSL, for instance:

...

ANSWER

Answered 2021-Jun-06 at 19:52

TL;DR: the C++ code and Numpy do not use the same matrix-multiplication library.

The matrix multiplication of the GSL library is not optimized. On my machine, it runs sequentially, does not use SIMD instructions (SSE/AVX), and does not efficiently unroll the loops to perform register tiling. I also suspect it does not use the CPU cache efficiently due to the lack of tiling. These optimizations are critical to achieve high performance and are widely used in fast linear algebra libraries.

Numpy uses a BLAS library installed on your machine. On many Linux platforms, it uses OpenBLAS or the Intel MKL. Both are very fast (they use all the methods described above) and should run in parallel.

You can find which implementation of BLAS is used by Numpy here. On my Linux machine, Numpy uses CBLAS by default, which internally uses OpenBLAS (OpenBLAS is strangely not directly detected by Numpy).

There are many fast parallel BLAS implementations (GotoBLAS, ATLAS, BLIS, etc.). The open-source BLIS library is great because its matrix multiplication is very fast on many different architectures.

As a result, the simplest way to improve your C++ code is to use the cblas_sgemm CBLAS function and link a fast BLAS library like OpenBLAS or BLIS for example.
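To get a feel for the gap between an unoptimized loop and a BLAS-backed call, here is a small Python sketch, not the C++ code from the question. The helper matmul_naive and the 60-by-60 size are assumptions for illustration; np.show_config() prints the build configuration, which typically includes the BLAS and LAPACK libraries Numpy was built against.

```python
import time

import numpy as np


def matmul_naive(a, b):
    """Unoptimized triple loop: sequential, no SIMD, no tiling."""
    n, m = a.shape
    p = b.shape[1]
    c = np.zeros((n, p), dtype=a.dtype)
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i, k] * b[k, j]
            c[i, j] = s
    return c


rng = np.random.default_rng(0)
a = rng.random((60, 60), dtype=np.float32)  # single precision, as with sgemm
b = rng.random((60, 60), dtype=np.float32)

start = time.perf_counter()
c_naive = matmul_naive(a, b)
t_naive = time.perf_counter() - start

start = time.perf_counter()
c_blas = np.matmul(a, b)  # dispatched to whichever BLAS Numpy is linked with
t_blas = time.perf_counter() - start

if __name__ == "__main__":
    print(f"naive loop: {t_naive:.4f} s, np.matmul: {t_blas:.6f} s")
    np.show_config()
```

Most of the naive loop's time here is Python interpreter overhead rather than missing SIMD or tiling, so the ratio overstates what a C comparison would show; the sketch only demonstrates that both paths compute the same product.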

For more information:

One simple way to see how badly the GSL performs is to use a profiler (like perf on Linux or VTune on Windows). In your case, Linux perf reports that >99% of the time is spent in libgslcblas.so (i.e. the GSL library). More specifically, most of the execution time is spent in the following assembly loop:

Source https://stackoverflow.com/questions/67549023

QUESTION

Change in Keras.applications source code results in error in missing variable from localhost

Asked 2021-Jun-02 at 08:49

For image clustering I was using a piece of code which worked perfectly.

...

ANSWER

Answered 2021-Jun-02 at 08:49

I switched to TF2 instead of disabling v2 behavior, and that resolved the problem.

Source https://stackoverflow.com/questions/67789714

QUESTION

Pytorch TypeError: forward() takes 2 positional arguments but 4 were given

Asked 2021-Jun-02 at 07:01

from torch.nn.parameter import Parameter
from torch.nn.modules.module import Module
class Graphconvlayer(nn.Module):
  def __init__(self,adj,input_feature_neurons,output_neurons):
    super(Graphconvlayer, self).__init__()
    self.adj=adj
    self.input_feature_neurons=input_feature_neurons
    self.output_neurons=output_neurons
    self.weights=Parameter(torch.normal(mean=0.0,std=torch.ones(input_feature_neurons,output_neurons)))
    self.bias=Parameter(torch.normal(mean=0.0,std=torch.ones(input_feature_neurons)))
  
  def forward(self,inputfeaturedata):
    output1= torch.mm(self.adj,inputfeaturedata)
    print(output1.shape)
    print(self.weights.shape)
    print(self.bias.shape)
    output2= torch.matmul(output1,self.weights.t())+ self.bias
    return output2 

class GCN(nn.Module):
   def __init__(self,lr,dropoutvalue,adjmatrix,inputneurons,hidden,outputneurons):
     super(GCN, self).__init__()
     self.lr=lr
     self.dropoutvalue=dropoutvalue
     self.adjmatrix=adjmatrix
     self.inputneurons=inputneurons
     self.hidden=hidden
     self.outputneurons=outputneurons
     self.gcn1 = Graphconvlayer(adjmatrix,inputneurons,hidden)
     self.gcn2 = Graphconvlayer(adjmatrix,hidden,outputneurons)
  
   def forward(self,x,adj):
     x= F.relu(self.gcn1(adj,x,64))
     x= F.dropout(x,self.dropoutvalue)
     x= self.gcn2(adj,x,7)
     return F.log_softmax(x,dim=1)

a=GCN(lr=0.001,dropoutvalue=0.5,adjmatrix=adj,inputneurons=features.shape[1],hidden=64,outputneurons=7)
a.forward(adj,features)

...

ANSWER

Answered 2021-Jun-02 at 07:01

Your GCN is composed of two Graphconvlayer instances.
As defined in the code you posted, Graphconvlayer's forward method expects only one input argument: inputfeaturedata. However, when GCN calls self.gcn1 or self.gcn2 (in its forward method) it passes 3 arguments: self.gcn1(adj,x,64) and self.gcn2(adj,x,7).
Hence, instead of a single input argument, self.gcn1 and self.gcn2 receive 3, which is the error you are getting. (Python also counts the implicit self, which is why the message reports 2 expected and 4 given.)
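The mismatch can be reproduced without PyTorch. In the sketch below, Layer is a hypothetical stand-in for nn.Module that only forwards calls to forward(), and GraphConvSketch is a simplified placeholder for the question's layer; neither is PyTorch code.

```python
class Layer:
    """Hypothetical stand-in for nn.Module: calling the object runs forward()."""

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class GraphConvSketch(Layer):
    def __init__(self, adj):
        # The adjacency matrix is stored at construction, as in the question.
        self.adj = adj

    def forward(self, inputfeaturedata):
        # Placeholder computation: multiply adj by a feature vector.
        return [sum(a * x for a, x in zip(row, inputfeaturedata))
                for row in self.adj]


adj = [[1, 0], [1, 1]]
layer = GraphConvSketch(adj)

error_message = None
try:
    layer(adj, [2, 3], 64)  # three arguments, as in self.gcn1(adj, x, 64)
except TypeError as exc:
    error_message = str(exc)

# Passing only the feature data matches forward's signature.
fixed = layer([2, 3])
print(error_message)
print(fixed)
```

Since the layer already holds adj from its constructor, the call site only needs to pass the features, which in the question's code would mean calling self.gcn1(x) and self.gcn2(x).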

Source https://stackoverflow.com/questions/67800090

QUESTION

ValueError: Dimensions must be equal, but are 512 and 1024

Asked 2021-May-31 at 08:31

I'm trying to build a simple autoencoder model (the input comes from CIFAR-10).

...

ANSWER

Answered 2021-May-31 at 08:31

I think in the second-to-last line, instead of

Source https://stackoverflow.com/questions/67770157

QUESTION

Calculate `dot` product of image and vector

Asked 2021-May-25 at 10:48

I would like to calculate something like dot product of vector and image with shapes:

(3)
(3,1080,1080)

and the output should be (1,1080,1080)

...

ANSWER

Answered 2021-May-24 at 21:47

To modify as little of your sample as possible:
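The answer's own code is elided on this page. One way to express the requested contraction is sketched below with NumPy, which is an assumption since the excerpt does not say which library the question uses. The image is shrunk to 4 by 6 to keep the example small.

```python
import numpy as np

height, width = 4, 6  # small stand-ins for 1080 x 1080
rng = np.random.default_rng(0)
vec = rng.random(3)  # shape (3,)
img = rng.random((3, height, width))  # shape (3, height, width)

# Contract over the shared channel axis, then restore a leading axis of size 1.
out = np.einsum("c,chw->hw", vec, img)[np.newaxis]  # shape (1, height, width)

# The same contraction written with tensordot.
out_tensordot = np.tensordot(vec, img, axes=1)[np.newaxis]

# Reference: weighted sum of the three channels written out by hand.
expected = (vec[0] * img[0] + vec[1] * img[1] + vec[2] * img[2])[np.newaxis]
print(out.shape, np.allclose(out, expected))
```

Each output pixel is the dot product of vec with the three channel values at that pixel, which is what the (1, 1080, 1080) target shape implies.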

Source https://stackoverflow.com/questions/67678904

QUESTION

PyTorch: Error>> expected scalar type float but found double

Asked 2021-May-22 at 23:36

I've just started using pytorch and I am trying a simple multi-layer perceptron. My ReLU Activation Function is the following:

...

ANSWER

Answered 2021-May-22 at 04:29

The issue is not on result, it's either on X, W_ih, or torch.where(outputs > 0, outputs, 0.).

If you don't pass a dtype argument to torch.rand(), it assigns the dtype based on PyTorch's global default.

That global default can be changed using torch.set_default_tensor_type().

Or go the easy route:

Source https://stackoverflow.com/questions/67645837

QUESTION

Problem with this "minimalistic" python packaging that has an import in source code

Asked 2021-May-13 at 14:50

I'm not a programmer, and my audience/users are not programmers either. So I'm trying to have the most minimalistic setup for my python package. I liked this structure below, which is endorsed in this video:

...

ANSWER

Answered 2021-May-13 at 14:50

I'm under the impression that this was an installation error of some sort. When I created a new environment and reinstalled everything, I was able to call myclass without error using from mypackage import myclass.

Source https://stackoverflow.com/questions/67339687

QUESTION

Performance issue with Scipy's solve_bvp and coupled differential equations

Asked 2021-May-08 at 10:01

I'm facing a problem while trying to implement the coupled differential equation below (also known as single-mode coupling equation) in Python 3.8.3. As for the solver, I am using Scipy's function scipy.integrate.solve_bvp, whose documentation can be read here. I want to solve the equations in the complex domain, for different values of the propagation axis (z) and different values of beta (beta_analysis).

The problem is that it is extremely slow (not manageable) compared with an equivalent implementation in Matlab using the functions bvp4c, bvpinit and bvpset. Evaluating the first few iterations of both executions, they return the same result, except for the resulting mesh which is a lot greater in the case of Scipy. The mesh sometimes even saturates to the maximum value.

The equation to be solved is shown here below, along with the boundary conditions function.

...

ANSWER

Answered 2021-May-08 at 10:01

Based on semi-random inputs, we can see that max_mesh is sometimes reached. This means that coupling_equation can be called with quite large z_mesh and a arrays. The problem is that coupling_equation contains a slow pure-Python loop iterating over each column of the arrays. You can speed the computation up a lot using Numpy vectorization. Here is an implementation:
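The answer's implementation is elided on this page, and the question's actual equations are not shown either. The sketch below therefore uses a made-up two-mode system purely to illustrate the loop-to-vectorized rewrite; rhs_loop, rhs_vectorized, beta and the right-hand side itself are all hypothetical. What it relies on is that scipy.integrate.solve_bvp calls the right-hand-side function with a mesh of shape (m,) and a state array of shape (n, m), and expects an (n, m) array back.

```python
import numpy as np


def rhs_loop(z_mesh, a, beta):
    """Slow version: one Python iteration per mesh point (column of a)."""
    out = np.empty_like(a)
    for i in range(a.shape[1]):
        phase = np.exp(1j * beta * z_mesh[i])
        out[0, i] = 1j * phase * a[1, i]
        out[1, i] = 1j * np.conj(phase) * a[0, i]
    return out


def rhs_vectorized(z_mesh, a, beta):
    """Same hypothetical equations, evaluated for all mesh points at once."""
    phase = np.exp(1j * beta * z_mesh)  # shape (m,)
    return np.stack([1j * phase * a[1], 1j * np.conj(phase) * a[0]])


rng = np.random.default_rng(0)
z_mesh = np.linspace(0.0, 1.0, 500)
a = rng.standard_normal((2, 500)) + 1j * rng.standard_normal((2, 500))

slow = rhs_loop(z_mesh, a, beta=2.5)
fast = rhs_vectorized(z_mesh, a, beta=2.5)
print(np.allclose(slow, fast))
```

The vectorized form matters most when the solver refines the mesh toward its maximum size, since the per-column Python overhead then grows with every refinement.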

Source https://stackoverflow.com/questions/67400917

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install matmul

You can download it from GitHub.

Support

For any new features, suggestions and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.