Polynomials Are Not What You Think They Are
At school, and also often as undergraduates at college, we learn that polynomials are things of the form
where both the coefficients a_n and the “unknown” x are numbers. But actually, that is not quite accurate. To be precise, this notion is actually called a polynomial function , but not a polynomial! In fact, the definition of polynomials is different. And if I tell you what it really is, you will wonder: what’s the difference? At first glance, the difference is only small. But at a second glance, it’s a huge conceptual difference. One, which opens up the possibility for fancy topics like field extensions and Galois theory. In this post, we will talk about what polynomials really are and also how you can work with them in Python.
So, here is the actual definition of a polynomial. A polynomial is any expression of the form
where the coefficients a_i are all elements of some ring A and x is some symbol .
Note that x is not required to be an element of A, nor does it have to be an element of any other ring. At this point, it is just a symbol.
To be even more precise, A is required to be a commutative ring with a unity element. What does that mean? Well, a ring is a set with two operations. The one operation “+” behaves like a regular addition, and the other operation “*” behaves like a multiplication. Like for ordinary addition, addition in general rings is associative (a+(b+c)=(a+b)+c), commutative (a+b=b+a), there is a neutral element (a+0=a) and each element has a negative (a+(−a)=0). For multiplication, rings are just required to be associative. Multiplication is not required to be commutative, but some rings are. Then they are called commutative rings. Also, rings in general do not have to have inverses and not even a neutral element 1. But some rings do have a neutral element with respect to multiplication (a⋅1=a). Those rings are called ring with unity. And if they have inverses, too, then these rings are called a field.
These were a lot of buzz words. So here are a few examples. The real numbers are a field because they obey the rules for general rings, but also have a 1, their multiplication is commutative and they have inverses a→1/a, a≠0. Integer numbers, on the other hand, are not fields, because there is no inverse integer of, say, the number 2. But they form a commutative ring with unity. Another example are square matrices. They form a non-commutative ring (ab≠ba) with unity.
Ok, so in our definition of polynomials we require the coefficients to be elements of some commutative ring with unity. But some given polynomial is not an element of that ring. It is just some expression. However, if you substitute a value for x, then it becomes an element of… well.. depends. It could be that A is the ring of real matrices and you could choose x from something completely different, like the ring (even field) of complex numbers. That would make the substitute value an element of the complex matrices. But the point is: the polynomial itself is the thing before being substituted, so it is something very own.
The set of all polynomials with coefficients from the ring A is denoted A[x]. At this point, this is just a set. But we can define operations on that set. Specifically, we can define an addition of polynomials a(x) and b(x) :
so we just add the coefficients term by term.
For multiplication we also define what we are used to from our high school experience with polynomials:
The coefficient for the term with x^k is in general
With these two operations defined, the set A[x] of polynomials on A forms a ring all by itself!! What’s intriguing is that this ring has a lot of common with integer numbers. That’s something quite unexpected, isn’t it?
For example, for integer numbers a and b we can always find unique integers q and r, so that division splits into a unique whole part and a unique remainder:
with 0≤r<b. This holds for integer numbers, but for real numbers is makes no sense.
But like for integer numbers, the equivalent of that theorem holds for polynomials. So if a(x) and b(x) are polynomials from A[x], then again there are unique polynomials q(x) and r(x) with
where either r(x)=0 is the zero polynomial or the degree (highest non-zero power of x) is less than the degree of b(x).
That’s the mathematical justification for the polynomial division we learned at school. But this theorem says, that it not only works for what we were used to consider polynomials. Remember, A can be any ring. So we can do polynomial division for polynomials over, say, the integers modulo 3!
That’s all nice and well, but you wonder, what’s the big deal about polynomials. The thing is, they automatically arise when you want to extend a given field. Before we look at a concrete historical example, lets consider some the some ring A (could be anything, not only numbers) and enlarge that ring by the number π. But the extended result must still be a ring, so any combination of elements from addition and multiplication must still be part of the ring. Any combination no matter how often means:
So any element of the extended ring can be written in the form above. In fact, this is just the result of substituting π for the symbol in a polynomial expression. In other words: polynomials generate field extensions. And to which field you extend depends on what you substitute.
For instance, consider the real numbers. Of course, there is no real solution to
But as we all now, there are two complex solutions ±i. So the field of real numbers (remember, it is also a ring) is not sufficient to solve this equation. What mathematicians did was to enlarge the field of real numbers by one single element: the square root of -1, called i. So any element of this extended field must have the form
where the a_i are now real numbers. But we had defined our symbol i by
so this connects our symbol with our underlying field. The choice of this connection was of course done in a way to solve a problem for us: solving the quadratics. So in our expression above we can replace all higher powers of i by either -1, 1, i or −i. This means that any such number can be written as a simple expression of the form
where x and y are real numbers. And these are, of course, all the complex numbers.
So much for the background. Now, how can you handle polynomials in Python. And for concreteness, let’s find the factor q(x) and remainder r(x) for the following polynomials over Z_n, the ring of integers modulo 3:
And here is, how we can solve this problem with Python’s sympy package:
from sympy import symbols, div, Poly
from sympy.polys.domains import ZZ
# Define the variable
x = symbols('x')
# Define the polynomials over the integers
a = Poly(x**4 + 2*x**3 + 2*x**2 + x + 1, domain=ZZ)
b = Poly(x**2 + x + 1, domain=ZZ)
# Perform the division
quotient, remainder = div(a, b, domain=ZZ)
# Convert the coefficients of quotient and remainder polynomials to modulo 3
quotient_mod = Poly([coef % 3 for coef in quotient.all_coeffs()], x, domain=ZZ)
remainder_mod = Poly([coef % 3 for coef in remainder.all_coeffs()], x, domain=ZZ)
Here, we first defined our symbol and then the polynomials in Z[x], not over Z_3[x]. Then we did polynomial division using the
div
function, and after that we converted the result back to Z_3[x].
By now, we have scratched the surface of the mathematical concept of polynomials, and more importantly, how they aren’t exactly what we thought they were back in our early education years. Through the lens of abstract algebra, we dove into some intricacies of these mathematical entities and how they form the backbone of more advanced topics like field extensions.
We also explored how these abstract concepts can be put into action using Python, demonstrating how these theories are not just abstract but have practical applications in computing and coding.
Understanding the true nature of polynomials is not only an intellectual exercise but also a journey into the elegance and depth of mathematics. It showcases the interconnectedness of different branches of mathematics and gives us a glimpse of the unifying structures that underpin them.
For those eager to learn more, I recommend delving into abstract algebra textbooks, which provide comprehensive coverage of these topics. My favorite one is Charles Pinter’s “ A Book of Abstract Algebra ”. Easy to read yet rigorous, and it covers all the interesting topics from most graduate courses.