How To Create Differentiation Matrices
Suppose we have a function 𝑓 given on an equidistant grid
and we want to compute the derivative of that function on the same grid. Last week I wrote some posts on how to do that with the Fast Fourier Transform. Today, I will do the same with matrices and interpolating polynomials instead.
In order to derive a scheme for numerical differentiation, we will use the following approach. To compute the derivative at some grid point 𝑥_𝑘, we use the point itself and a few neighboring points, say, one or two on each side.
Then we find the unique polynomial through those points using Lagrange’s method that I described in this article. Having the interpolating polynomial, we can simply differentiate it analytically and substitute our grid points in the end. That’s the basic idea; now here is the actual implementation using Python’s computer algebra system sympy. First, let’s define our symbols:
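A minimal sketch of those definitions, assuming the names x (the evaluation point), x_k (the grid point), h (the grid spacing) and a symbolic function f:

import sympy as sp

# Evaluation point, grid point and grid spacing (assumed names)
x, x_k, h = sp.symbols('x x_k h')
f = sp.Function('f')

# Stencil: the grid point itself plus one neighbor on each side
offsets = [-1, 0, 1]
points = [x_k + off * h for off in offsets]
values = [f(x_k + off * h) for off in offsets]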
so that we have the stencil points 𝑥_𝑘 − ℎ, 𝑥_𝑘 and 𝑥_𝑘 + ℎ, or, in terms of function values, 𝑓(𝑥_𝑘 − ℎ), 𝑓(𝑥_𝑘) and 𝑓(𝑥_𝑘 + ℎ).
Now we make a little function to do the polynomial interpolation using Lagrange’s method described in my recent article:
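A sketch of such a function, using the assumed name lagrange_interpolation and the symbols from above:

def lagrange_interpolation(xs, fs, x):
    """Unique polynomial through the points (xs[i], fs[i]), as a function of x."""
    p = sp.S.Zero
    for i, x_i in enumerate(xs):
        # Lagrange basis polynomial: equals 1 at x_i and 0 at all other nodes
        basis = sp.S.One
        for j, x_j in enumerate(xs):
            if j != i:
                basis *= (x - x_j) / (x_i - x_j)
        p += fs[i] * basis
    return sp.expand(p)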
In our case, we get our interpolant 𝑝(𝑥) by calling that function with the stencil points and function values from above. We then differentiate the interpolant analytically, evaluate it at an arbitrary grid point by substituting 𝑥_𝑘 for 𝑥, and write 𝑓_{𝑘−1}, 𝑓_𝑘, 𝑓_{𝑘+1} for the function values at the grid points.
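A minimal sketch of these steps, with the symbols and the lagrange_interpolation helper assumed above:

p = lagrange_interpolation(points, values, x)   # interpolant through the stencil
dp = sp.diff(p, x)                              # differentiate analytically
dp_k = sp.simplify(dp.subs(x, x_k))             # evaluate at the grid point x = x_k

For the three-point stencil, sympy reduces dp_k to (𝑓(𝑥_𝑘 + ℎ) − 𝑓(𝑥_𝑘 − ℎ)) / (2ℎ).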
So in this approximation, the derivative on the grid is the familiar central difference 𝑓′_𝑘 ≈ (𝑓_{𝑘+1} − 𝑓_{𝑘−1}) / (2Δ𝑥).
If you carry out the same procedure as above, just replacing
offsets = [-1, 0, 1]
by
offsets = [-2, -1, 0, 1, 2]
then the output of the intermediate steps is a bit messy, but the end result for the derivative on the grid is 𝑓′_𝑘 ≈ (𝑓_{𝑘−2} − 8𝑓_{𝑘−1} + 8𝑓_{𝑘+1} − 𝑓_{𝑘+2}) / (12Δ𝑥).
Using 5 points for interpolation, we have a polynomial of degree 4 as interpolant, so our accuracy for the discretized derivative will also be of order O(Δ𝑥⁴).
As a next step, we collect the function values at all grid points in a column vector 𝐟 = (𝑓_1, 𝑓_2, …, 𝑓_𝑁)ᵀ, and our goal is to write the corresponding vector 𝐟′ of derivative values in terms of a matrix 𝐷, which describes the differentiation, times the vector of function values: 𝐟′ = 𝐷𝐟.
Ignoring the values in the first two and the last two rows for a moment, this reads (writing out only the interior rows, which follow directly from the five-point formula above):
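$$
\mathbf{f}' = D\,\mathbf{f}, \qquad
D = \frac{1}{12\,\Delta x}
\begin{pmatrix}
\ddots & \ddots & \ddots & \ddots & \ddots & & \\
1 & -8 & 0 & 8 & -1 & & \\
 & 1 & -8 & 0 & 8 & -1 & \\
 & & 1 & -8 & 0 & 8 & -1 \\
 & & & \ddots & \ddots & \ddots & \ddots
\end{pmatrix}
$$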
The only tricky point is what to do with the first and last rows: the non-zero entries shift to the left as we go up a row in the matrix, and at some point they would start to drop out of the matrix on the left. There are two options to handle this. The first is to compute corresponding coefficients which are not symmetric like the ones above, but use one-sided offsets. The other, simpler option is to impose periodic boundary conditions. If we say that the function is periodic such that 𝑓_{𝑘+𝑁} = 𝑓_𝑘
for every 𝑘, then the coefficients simply “wrap around” at the edges of the matrix.
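Written out for, say, seven grid points (reconstructed from the five-point coefficients above; blank entries are zero), the matrix becomes

$$
D = \frac{1}{12\,\Delta x}
\begin{pmatrix}
 0 &  8 & -1 &    &    &  1 & -8 \\
-8 &  0 &  8 & -1 &    &    &  1 \\
 1 & -8 &  0 &  8 & -1 &    &    \\
   &  1 & -8 &  0 &  8 & -1 &    \\
   &    &  1 & -8 &  0 &  8 & -1 \\
-1 &    &    &  1 & -8 &  0 &  8 \\
 8 & -1 &    &    &  1 & -8 &  0
\end{pmatrix}
$$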
What we get is a so-called circulant Toeplitz matrix: the values along each diagonal are constant, the diagonals wrap around, and most of the entries are zero. This is an ideal situation for sparse matrix libraries, which give you a tremendous speed and memory advantage compared to a naive dense approach.
By the way, the restriction to periodic functions sounds like a severe limitation. In surprisingly many physical cases, however, this assumption is fulfilled, at least approximately.
Let’s now turn to the construction of the differentiation matrices in Python code. We’ll use the sparse matrix subpackage of scipy to efficiently store and apply the matrices. We construct the circulant matrix with the function sparse_circulant, which gets the first column as an argument. The function diff_matrix is responsible for creating that first column.
import numpy as np
from scipy.sparse import spdiags
def sparse_circulant(column):
    """Create a circulant matrix in sparse matrix form based on the first column."""
    nrows = len(column)
    e = np.ones(nrows)
    diags_data = []
    diags_offsets = []
    # Build the lower triangle of the matrix
    for off, c in enumerate(column):
        if c != 0:
            diags_data.append(c * e)
            diags_offsets.append(-off)
    matrix = spdiags(diags_data, diags_offsets, nrows, nrows)
    # The matrix is skew-symmetric (central first-derivative stencils are
    # antisymmetric), so we get the upper triangle easily:
    return matrix - matrix.T
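For example, the first column (0, −1/2, 0, 1/2) of the second-order central scheme yields the periodic wrap-around pattern:

print(sparse_circulant(np.array([0, -0.5, 0, 0.5])).toarray())
# [[ 0.   0.5  0.  -0.5]
#  [-0.5  0.   0.5  0. ]
#  [ 0.  -0.5  0.   0.5]
#  [ 0.5  0.  -0.5  0. ]]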
def diff_matrix(deriv, coefs, offsets, npoints, h):
    """Construct the differentiation matrix."""
    column = np.zeros(npoints)
    # Place each stencil coefficient in the first column of the circulant;
    # positive offsets wrap around to the end of the column.
    for off, coef in zip(offsets, coefs):
        if off > 0:
            column[npoints - off] = coef
        else:
            column[-off] = coef
    return sparse_circulant(column) / h**deriv
Let’s check that this is correct by applying it to a small grid with grid spacing 1:
# Change numpy output options for nice printing of semibig matrix
np.set_printoptions(edgeitems=30, linewidth=100000,
formatter=dict(float=lambda x: "%8.3f" % x))
N = 8 # number of grid points
h = 1 # grid spacing
diff_matrix(1, [1./12, -8./12, 8./12, -1./12], [-2, -1, 1, 2], N, h).toarray()
array([[ 0.000, 0.667, -0.083, 0.000, 0.000, 0.000, 0.083, -0.667],
[ -0.667, 0.000, 0.667, -0.083, 0.000, 0.000, 0.000, 0.083],
[ 0.083, -0.667, 0.000, 0.667, -0.083, 0.000, 0.000, 0.000],
[ 0.000, 0.083, -0.667, 0.000, 0.667, -0.083, 0.000, 0.000],
[ 0.000, 0.000, 0.083, -0.667, 0.000, 0.667, -0.083, 0.000],
[ 0.000, 0.000, 0.000, 0.083, -0.667, 0.000, 0.667, -0.083],
[ -0.083, 0.000, 0.000, 0.000, 0.083, -0.667, 0.000, 0.667],
[ 0.667, -0.083, 0.000, 0.000, 0.000, 0.083, -0.667, 0.000]])
That looks ok. And now we apply it to a bigger and finer grid. As a use case, we will compute the maximum error of the derivative of the function 𝑓(𝑥) = exp(sin 𝑥) on the periodic interval (0, 2𝜋], depending on the size 𝑁 of the grid:
Ns = np.logspace(1, 4, 20, dtype=int)

def calc_errors(coefs, offsets):
    errs = []
    deriv = 1  # first derivative
    for N in Ns:
        # Set up grid:
        h = 2*np.pi / N
        x = np.arange(1, N+1) * h
        f = np.exp(np.sin(x))
        # The exact derivative:
        f1_ex = np.cos(x)*f
        # The approximate derivative:
        D = diff_matrix(deriv, coefs, offsets, N, h)
        f1 = D.dot(f)
        # Max. error on domain:
        err = np.max(np.abs(f1 - f1_ex))
        errs.append(err)
    return errs
import matplotlib.pyplot as plt
plt.loglog(Ns, calc_errors(coefs=[-0.5, 0.5], offsets=[-1, 1]), 'o', label=r'$O(\Delta x^2)$')
plt.loglog(Ns, calc_errors(coefs=[1/12, -8/12, 8/12, -1/12], offsets=[-2, -1, 1, 2]), 'o', label=r'$O(\Delta x^4)$')
plt.grid()
plt.xlabel('$N$')
plt.ylabel('Error')
plt.legend()
As expected, the scheme with coefficients (−1/2, 1/2) has a slope of −2 in the double-log plot, so the error behaves like O(Δ𝑥²).
The scheme with the coefficients (1/12, −8/12, 8/12, −1/12), however, has a slope of −4, so the error behaves like O(Δ𝑥⁴). In order to achieve the same accuracy, one therefore needs far fewer points with the higher-order scheme. Taking more and more coefficients, one can do better and better, and the sparse matrix slowly begins to turn into a dense matrix. But the penalty of a denser matrix in the matrix multiplication is beaten by the advantage that one needs fewer grid points to achieve the same accuracy.
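As an illustration, here is a sketch using the standard sixth-order central-difference coefficients; the resulting error curve should have slope −6 until rounding errors take over:

# Sixth-order central scheme (standard textbook coefficients)
plt.loglog(Ns,
           calc_errors(coefs=[-1/60, 3/20, -3/4, 3/4, -3/20, 1/60],
                       offsets=[-3, -2, -1, 1, 2, 3]),
           'o', label=r'$O(\Delta x^6)$')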