Science Is Just Matrix Multiplication

Really, so do it wisely!

Recently, I was a guest at a seminar where a Ph.D. student gave a presentation about a topic in physics. The student proudly claimed that he had managed to speed up his code by a factor of 1000! For months, he had struggled with excessive calculation times for his simulation program, and now those troubles had suddenly disappeared thanks to his latest discovery. I was quite amused, because his solution was more than obvious to older and more experienced people like me, and he could simply have told us about the problem and asked for help. At least, that’s what I thought, but it turned out that even for the senior physicists in the room, the solution was new. I’m not saying that I’m smarter than they are (the opposite is probably true), but obviously, what I learned on my own path differs from what other people learned on theirs. So I think it makes sense to share that “discovery” with you, since I don’t want you to struggle with the same issue when there is an easy, ready-to-go solution. The “discovery” the student was talking about: sparse matrices.

Sparse Matrices

There is a saying in science: in the end, everything boils down to matrix multiplication. And that’s probably true. Whether you are solving partial differential equations in physics or engineering, or doing machine learning with classical models or deep neural networks, numerically it all comes down to multiplying matrices and vectors, repeatedly, in some order. These matrices can often be very big, say 1,000,000 x 1,000,000. If your matrix has no particular structure, you really have to store every value in memory, and the arithmetic is very time-consuming. Such a matrix is called a dense matrix. However, in very many scientific applications, the matrices in question do have structure. Very often, most of the values in the matrix are zero; such a matrix is called a sparse matrix. As you can imagine, the memory requirement then decreases enormously. For instance, if your 1,000,000 x 1,000,000 matrix is diagonal, you only have to store 10⁶ values instead of 10¹². And if you multiply it with a 10⁶-dimensional vector, you only have to do 10⁶ multiplications instead of 10¹² multiplications and additions. You see, the difference is huge! Even if your matrix is not diagonal, as long as most of its values are zero, the memory and speed advantage is enormous, because you only have to store the relatively few non-zero values, and multiplication with vectors only has to take those non-zero values into account.
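To make the memory argument concrete, here is a minimal sketch comparing the storage of a diagonal matrix in dense and sparse form. The variable names and the small size m are just illustrative choices of mine, and the exact byte counts depend on your NumPy/SciPy versions:

import numpy as np
import scipy.sparse

m = 1000
d = np.random.random(m)

D_dense = np.diag(d)              # full m x m array, mostly zeros
D_sparse = scipy.sparse.diags(d)  # stores only the diagonal band

print(D_dense.nbytes)        # m * m * 8 bytes = 8,000,000 bytes
print(D_sparse.data.nbytes)  # m * 8 bytes = 8,000 bytes (DIAgonal format keeps only the band)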

How to Use Sparse Matrices

Python’s SciPy package has an excellent sub-package dedicated to sparse matrix implementations. Depending on what structure your sparse matrix has and depending on what you want to do with the sparse matrix, there are different representations in the package that are optimized for the given task. But don’t worry about making the right choice! As long as you use any of the sparse matrix classes, you usually get 99% of the speed and memory benefit.

First, consider a big dense matrix 𝐴. We fill it with random values, all between -0.5 and 0.5:

import numpy as np

n = 10000
A = np.random.random((n, n)) - 0.5

I’m only using 10000 rows and columns since I don’t want to wait too long. ;-) Then let’s multiply 𝐴⋅𝐴 using numpy’s matmul function.

np.matmul(A, A)

This takes 7.07 seconds on my computer. You could also use numpy’s dot function; for two-dimensional arrays it calls the same optimized BLAS routines as matmul, so the timing is essentially the same.
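If you want to reproduce the timings yourself, here is a minimal sketch using time.perf_counter; the absolute numbers will of course depend on your machine and BLAS setup:

import time

start = time.perf_counter()
np.matmul(A, A)
print(f"dense matmul: {time.perf_counter() - start:.2f} s")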

Now let’s use a diagonal matrix in the same way and store it as a normal matrix:

diag_values = np.random.random(n)
A_dense = np.diag(diag_values)
AA_dense = np.matmul(A_dense, A_dense)

This took 7.00 seconds on my computer, practically the same time, although we know that almost all of the entries, and therefore almost all of the products in the multiplication, are zero. Now let’s do the same with sparse matrices. Since the matrix here is diagonal, we can use scipy.sparse.diags:

import scipy.sparse
A_sparse = scipy.sparse.diags(diag_values)

If you print A_sparse, it says:

<10000x10000 sparse matrix of type '<class 'numpy.float64'>'
	with 10000 stored elements (1 diagonals) in DIAgonal format>

so it really is a 10000x10000 matrix, but sparse and stored in a special, diagonal format. Let’s do the same multiplication again:

AA_sparse = A_sparse.dot(A_sparse)

That took 0.003 seconds on my computer, more than 2000 times faster than with the dense matrix. Not bad, right? To be sure, is the result the same?

np.max(np.abs(AA_dense - AA_sparse))
Out: 0.0

So yes, it is.

As I said, there are different kinds of sparse matrix representations in scipy.sparse, and each has its own strengths and weaknesses. For instance, with the lil_matrix type, slicing the matrix is fast, and changing the structure, like adding elements, is also fast, but arithmetic operations like addition and multiplication are slow. The csr_matrix type, on the other hand, is fast for arithmetic operations but slow for column slicing and for changing the sparsity structure. So it really depends on what you want to do. Usually, you have to construct the matrix somehow and later want to use it for computation. The first task involves changing the matrix structure, and the second involves arithmetic. So what do you do? Use two different sparse matrix types: first one that is optimized for construction, then convert it once to a type that is optimized for arithmetic! You just have to know which is which. Fortunately, the SciPy documentation is excellent. Each sparse matrix type has its own doc page describing its advantages, disadvantages, and intended usage. Check out this link to get started.
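As a minimal sketch of that two-step workflow (the matrix contents here are just a toy example of mine): build the matrix in the LIL format, convert it once to CSR, and then do the arithmetic with the CSR version.

import numpy as np
import scipy.sparse

n = 10000

# Construction phase: lil_matrix is cheap to modify element by element.
M = scipy.sparse.lil_matrix((n, n))
M.setdiag(np.random.random(n))  # fill the main diagonal
M[0, n - 1] = 1.0               # adding a single off-diagonal entry is cheap here

# Computation phase: convert once to CSR, which is fast for arithmetic.
M_csr = M.tocsr()
result = M_csr.dot(M_csr)       # sparse matrix-matrix product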