Numerical Derivatives Done Right

Photo by Joshua Sortino on Unsplash

Recently I have found this article giving a nice first introduction of how to do differentiation numerically. But can you do it with more accuracy and in higher dimensions? Texts on the more general case are hardly ever found on the internet, although it's actually quite easy. So I decided to go beyond the introduction level and write about how it is done in general. This is not just some academic exercise. No! The simple formulas at the introductory level almost never suffice to solve even simple practicle problems. That's what I will write about in the following section. In the second section, I show how to do better and in the last section I show sample Python code for solving the general case.

The easy cases and what's problematic about them

The "easy cases" for first and second derivatives like in most tutorials are usual derived from geometrical intuition. The introductory formulas read

$$ f'_n = \frac{f_{n+1}- f_{n-1}}{2h} $$

for the first derivative \(f'\) evaluated at \(x_n\). \(h\) is the spacing between two grid points on the \(x\)-axis and \(f_n\) is shorthand for the value of the function \(f\) evaluated at the point \(x_n\).

For the second derivative we are taught that it is approximately

$$ f''_n = \frac{f_{n+1}-2f_n + f_{n-1}}{h^2} $$

There are two things which are problematic about these formulas. First, although their accuracy gets better for finer grids (i.e. smaller \(h\)), sometimes you are restricted to coarse grids, so the approximation is pretty bad. It would be nice to have formulas which an inherently more accurate. Second, what do you do at the "boundary"? Your discrete function values \(f_n\) are only given at a finite number of grids points, say from \(x_0\) to \(x_N\). But if you try to calculate the derivative at \(x_0\), the formula above tells you that you need \(f_{-1}\) to do that. But you don't have \(f_{-1}\), because that is outside of your grid! As you can see, in all but the most simple practical cases, you have to go beyond these absolute basic formulas! And I will show you next how to do that.

How to derive formulas systematically

In the following, we want to calculate the \(m-\)th derivative at some grid point \(x_n\). The grid is such that at \(x_n\) there are at least \(p\) further points to the left and at least \(q\) points to the right available for approximating derivatives. Note that is not a restriction, because \(p\) and \(q\) could be zero. If I say \(p=q=1\), I would make use of the function values \(f_{n-1}\), \(f_n\) and \(f_{n+1}\), like in the easy cases above. If I say \(p=0\), \(q=4\), I would use \(f_{n}\), \(f_{n+1}\), \(f_{n+2}\), \(f_{n+3}\) and \(f_{n+4}\). You get the idea.

What we are trying to find is a linear combination of the discrete function values that approximates the \(m-\)th derivative \(f^{(m)}\) at \(x_n\) as good as possible:

$$ f^{(m)}_n \approx \sum_{k=-p}^q c_k f_{n+k} $$

The \(c_k\) are called finite difference coefficients. Note that we have \(f_{n+k} = f(x_n+k h)\) in the ansatz. This is already a hint what to do next. We simply insert the Taylor expansion

$$ f_{n+k} = f(x_n+kh) = f_n + f'_n h + \frac{1}{2!}f''_n + \ldots\\ = \sum_{i=0}^\infty \frac{f_n^{(i)}}{i!} (kh)^i $$

(where I denoted \(f_n\) by the zeroth derivative \(f^{(0)}\)) into the ansatz:

$$ f^{(m)}_n \approx \sum_{k=-p}^q c_k \sum_{i=0}^\infty \frac{f_n^{(i)}}{i!} (kh)^i\\ = \underbrace{ f_n^{(0)} \sum_{k=-p}^q c_k}_{i=0} + \underbrace{ h f_n^{(1)} \sum_{k=-p}^q c_k k }_{i=1} + \underbrace{ \frac{h^2}{2!} f^{(2)}\sum_{k=-p}^q c_k k^2 }_{i=2} + \ldots $$

To make this an equation we would have to choose the \(c_k\) such that all terms for \(i\ne m\) vanish and only the one with \(i=m\) is 1. In other words:

$$ \frac{h^i}{i!}\sum_{k=-p}^q c_k k^i = \left\{ \begin{matrix} 1 & \quad \mbox{if } i= m \\ 0 & \quad \mbox{else} \end{matrix} \right. $$

If you consider the \(c_k\) as components of a column vector and look closely, this is just a linear equation system:

$$ \left( \begin{matrix} 1 & 1 & 1 & \dots & 1 & 1 \\ -p & -p+1 & -p+2 & \dots & q-1 & q \\ (-p)^2 & (-p+1)^2 & (-p+2)^2 & \dots & (q-1)^2 & q^2 \\ \dots & \dots & \dots & \dots & \dots & \dots \\ \dots & \dots & \dots & \dots & \dots & \dots \\ (-p)^{p+q} & (-p+1)^{p+q} & (-p+2)^{p+q} & \dots & (q-1)^{p+q} & q^{p+q} \\ \end{matrix} \right) \left( \begin{matrix} c_{-p} \\ c_{-p+1}\\ c_{-p+2}\\ \dots \\ c_{q-1}\\ c_q \end{matrix} \right) = \frac{m!}{h^m} \left( \begin{matrix} \delta_{0, m} \\ \delta_{1, m}\\ \delta_{2, m}\\ \dots \\ \delta_{p+q-1, m}\\ \delta_{p+q, m} \end{matrix} \right) $$

where I the \(\delta\) on the right hand side are Kronecker deltas:

$$ \delta_{ij} = \left\{ \begin{matrix} 1&\quad \mbox{if } i=j\\ 0&\quad \mbox{else }\\ \end{matrix} \right. $$

Solving this equation system, gives us the \(c_k\) and therefore allows us to compute the \(m-\)th derivative. If you want higher accuracy, just use higher values of \(p\) and \(q\) (in other words use more grids points) and solve the corresponding equation system again.

Application with Python

python

import math
from sympy import Matrix, linsolve

def calc_coefficients(deriv, p, q):
    num_coefs = p + q + 1
    
    # The RHS of the equation system
    b = [0] * num_coefs
    b[deriv] = math.factorial(deriv)
    b = Matrix(b)
    
    # The LHS of the equation system
    
    A = [[1] * num_coefs] # only first line
    offsets = list(range(-p, q + 1))
    for i in range(1, len(offsets)):
        A.append([j**i for j in offsets])
    A = Matrix(A)
                   
    coefs = linsolve((A, b)) # note the additional braces!!
    return coefs    
    

import math
from sympy import Matrix, linsolve

def calc_coefficients(deriv, p, q):
    num_coefs = p + q + 1
    
    # The RHS of the equation system
    b = [0] * num_coefs
    b[deriv] = math.factorial(deriv)
    b = Matrix(b)
    
    # The LHS of the equation system
    
    A = [[1] * num_coefs] # only first line
    offsets = list(range(-p, q + 1))
    for i in range(1, len(offsets)):
        A.append([j**i for j in offsets])
    A = Matrix(A)
                   
    coefs = linsolve((A, b)) # note the additional braces!!
    return coefs  

calc_coefficients(2, 1, 1)
$$ \left\{\left( 1, \ -2, \ 1\right)\right\} $$

calc_coefficients(1, 1, 1)
$$ \left\{\left( - \frac{1}{2}, \ 0, \ \frac{1}{2}\right)\right\} $$

calc_coefficients(2, 6, 0)
$$ \left\{\left( \frac{137}{180}, \ - \frac{27}{5}, \ \frac{33}{2}, \ - \frac{254}{9}, \ \frac{117}{4}, \ - \frac{87}{5}, \ \frac{203}{45}\right)\right\} $$

calc_coefficients(4, 10, 10)
$$ \left\{\left( \frac{514639}{257297040000}, \ - \frac{11419}{231567336}, \ \frac{487121}{823350528}, \ - \frac{809}{175032}, \ \frac{1933049}{72648576}, \ - \frac{9587629}{78828750}, \ \frac{1888949}{4036032}, \ - \frac{1827209}{1135134}, \ \frac{1650809}{310464}, \ - \frac{698249}{58212}, \ \frac{307869749}{19440000}, \ - \frac{698249}{58212}, \ \frac{1650809}{310464}, \ - \frac{1827209}{1135134}, \ \frac{1888949}{4036032}, \ - \frac{9587629}{78828750}, \ \frac{1933049}{72648576}, \ - \frac{809}{175032}, \ \frac{487121}{823350528}, \ - \frac{11419}{231567336}, \ \frac{514639}{257297040000}\right)\right\} $$