A Primer in Graph Theory with Python

Photo by Clint Adair on Unsplash

Graph theory is a branch of mathematics and computer science that deals with the study of graphs, which are a collection of vertices and edges that connect them. Graphs are used to model relationships between objects or entities in a system, and are widely used in various fields including computer science, operations research, social sciences, and more. In this article, we will introduce some of the common types of graphs, such as circular graphs, utility graphs, and complete graphs. We will also discuss subgraphs, the degree of a vertex, adjacency matrices, cycles, and planarity. We will use Python’s networkx library, a powerful tool for working with graphs and creating graph diagrams, to illustrate and analyze these concepts.

If you haven’t already done so, install networkx by submitting

pip install networkx

to follow along this tutorial.

So let’s introduce a few common graphs. Circular graphs are a type of graph that can be drawn on a circle. They are often used to represent cyclic or periodic data. In a circular graph, each vertex is evenly spaced around the circle, and the edges connect adjacent vertices.

Here’s an example of a circular graph with 6 nodes and 6 edges, created using networkx and its the nx.cycle_graph() function to create the graph object. We then use the nx.draw_circular() function to draw the graph in a circular layout.

import networkx as nx
import matplotlib.pyplot as plt

# Create a circular graph object
G = nx.cycle_graph(6)

# Draw the circular graph
nx.draw_circular(G, with_labels=True)
plt.show()
The circular graph C5

Another common type of graph is the so-called utility graph. Utility graphs are often used to model situations where one set of objects provides a utility or benefit to the other set of objects. In a utility graph, each node represents an object from one set, and each edge represents a connection to an object in the other set.

Here’s an example of a utility graph with 5 nodes in each set:

import networkx as nx
import matplotlib.pyplot as plt

# Create a utility graph object
G = nx.Graph()

# Add nodes to the graph
G.add_nodes_from([1,2,3,4,5], bipartite=0)
G.add_nodes_from([6,7,8,9,10], bipartite=1)

# Add edges to the graph
for k in range(1, 6):
    for j in range(6, 11):
        G.add_edge(k, j)


# Draw the utility graph
pos = nx.bipartite_layout(G, [1,2,3,4,5])
nx.draw(G, pos, with_labels=True)
plt.show()
The utility graph U5,5

Here, we have used the bipartite option while adding nodes. The bipartite option in networkx is used to specify the set to which a node belongs in a bipartite graph. A bipartite graph is a type of graph where the nodes can be divided into two sets such that each edge connects a node in one set to a node in the other set. In other words, there are no edges between nodes in the same set.

When creating a bipartite graph in networkx, you can use the bipartite option to specify which set a node belongs to. This is useful when you want to perform bipartite graph algorithms on the graph, such as finding maximum matchings or performing network analysis on bipartite graphs.

As another example of a very common type of graph, consider the so-called complete graph. It is called complete because every pair of distinct nodes is connected by an edge. In other words, a complete graph is a graph where each node is directly connected to every other node in the graph.

A complete graph with n nodes is often denoted as Kn, where n is the number of nodes. The total number of edges in a complete graph with n nodes is n(n−1)/2.

Since it is so common, networkx provides a convenience function to generate this type of graph:

import networkx as nx
import matplotlib.pyplot as plt

# Create a complete graph object with 7 nodes
G = nx.complete_graph(7)

# Draw the complete graph
nx.draw_circular(G, with_labels=True)
plt.show()
The complete graph K7

Let’s assert that the number of edges is really n(n−1)/2. We get the edges by

G.edges
EdgeView([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)])

and the nodes by

G.nodes
NodeView((0, 1, 2, 3, 4, 5, 6))

so our assertion is

n = len(G.nodes)
len(G.edges) == n * (n-1) / 2
True

The complement of a graph is another graph that contains all the nodes that are not present in the original graph, but does not contain any of the edges that are present in the original graph. In other words, if G is a graph, then the complement of G is a graph that has the same set of nodes as G, but contains all the edges that are not present in G. The complement of a graph is often denoted as G′.

Here’s an example:

import networkx as nx
import matplotlib.pyplot as plt

# Create a graph object with 5 nodes and 3 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5])
G.add_edges_from([(1,2),(2,3),(4,5)])

# Create the complement graph
G_comp = nx.complement(G)

# Draw both graphs
plt.subplot(121)
nx.draw_circular(G, with_labels=True)
plt.title('Original Graph')
plt.subplot(122)
nx.draw_circular(G_comp, with_labels=True)
plt.title('Complement Graph')
plt.show()

A subgraph is a graph that is formed by taking a subset of the vertices and edges of an original graph. In other words, if G is a graph, then any graph H that is formed by taking a subset of the vertices and edges of G is a subgraph of G. A subgraph can be formed by removing one or more vertices or edges from the original graph, or by selecting a subset of the vertices and edges from the original graph.

Again, an example should make things more clear:

import networkx as nx
import matplotlib.pyplot as plt

# Create a graph object with 6 nodes and 7 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6])
G.add_edges_from([(1,2),(2,3),(3,4),(4,5),(5,6),(6,1),(2,4)])

# Create a subgraph by selecting a subset of the edges
H = G.edge_subgraph([(1,2),(2,3),(3,4),(4,5)])

# Draw both graphs
plt.subplot(121)
nx.draw_circular(G, with_labels=True)
plt.title('Original Graph')
plt.subplot(122)
nx.draw_circular(H, with_labels=True)
plt.title('Subgraph')
plt.show()

Here, we have created a subgraph by selecting a subset of the edges from the original graph using the edge_subgraph() function. The resulting subgraph contains only the nodes and edges that were selected from the original graph. We can see that in the subgraph diagram, only the selected edges and their corresponding nodes are present, while the other edges and nodes are absent.

There are some metrics that allow for a better analysis of graphs. One of the most important metrics is the degree. The degree of a vertex is the number of edges that are connected to the vertex. In other words, the degree of a vertex is the number of other vertices that are directly connected to that vertex.

In networkx, we can calculate the degree of a vertex using the degree() function.

import networkx as nx
import matplotlib.pyplot as plt

# Create a graph object with 6 nodes and 7 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6])
G.add_edges_from([(1,2),(2,3),(3,4),(4,5),(5,6),(6,1),(2,4)])

# Calculate the degree of each vertex
degrees = dict(G.degree())

# Print the degree of each vertex
for vertex, degree in degrees.items():
    print("Vertex", vertex, "has degree", degree)

# Draw the graph
nx.draw(G, with_labels=True)
plt.show()
Vertex 1 has degree 2
Vertex 2 has degree 3
Vertex 3 has degree 2
Vertex 4 has degree 3
Vertex 5 has degree 2
Vertex 6 has degree 2

As nice as a graphical representation is, sometimes we want something that is better suited for computation. Enter the adjacency matrix. The adjacency matrix is a square matrix that represents the connections between the nodes of a graph. In an adjacency matrix, the rows and columns of the matrix represent the nodes of the graph, and the entries in the matrix indicate whether there is an edge between each pair of nodes.

If G is a graph with n nodes, the adjacency matrix A of G is an n x n matrix such that A[i,j] is 1 if there is an edge from node i to node j, and 0 otherwise. If the graph is undirected, the adjacency matrix is symmetric, meaning that A[i,j] = A[j,i] for all i and j.

The adjacency matrix is a useful tool for analyzing the properties of a graph, as it allows us to quickly determine which nodes are connected to each other, and how many edges are present in the graph. The adjacency matrix can also be used to calculate various metrics of the graph, such as the degree sequence, the number of cycles, and the eigenvalues of the graph.

In networkx, we can generate the adjacency matrix of a graph using the nx.to_numpy_array() function. Here's an example of how to generate the adjacency matrix for a graph with 4 nodes and 4 edges:

import networkx as nx
import numpy as np

# Create a graph object with 4 nodes and 4 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4])
G.add_edges_from([(1,2),(2,3),(3,4),(4,1)])

# Generate the adjacency matrix
adj_mat = nx.to_numpy_array(G)

nx.draw_circular(G, with_labels=True)

# Print the adjacency matrix
print(adj_mat)
[[0. 1. 0. 1.]
 [1. 0. 1. 0.]
 [0. 1. 0. 1.]
 [1. 0. 1. 0.]]

From the output, we can see that the adjacency matrix is a 4 x 4 matrix, where each entry represents the presence or absence of an edge between two nodes. For example, the entry in the first row and second column is 1, indicating that there is an edge between nodes 1 and 2. And since we have an indirected graph, the matrix is symmetric.

A cycle in a graph is a path of edges that starts and ends at the same vertex, and that does not repeat any vertex or edge (except for the starting and ending vertex, which are the same). In other words, a cycle is a closed path that visits a sequence of nodes without visiting the same node twice, except for the starting and ending node.

Cycles are an important concept in graph theory, as they are used to study the properties of graphs and to develop algorithms for graph problems. For example, cycles can be used to detect whether a graph is bipartite or not, and to find the shortest path between two nodes in a graph.

In a directed graph, a cycle is called a directed cycle or a circuit, and it is defined as a path of directed edges that starts and ends at the same vertex, and that does not repeat any vertex or edge (except for the starting and ending vertex, which are the same).

Cycles can be found using various algorithms, such as depth-first search or breadth-first search. In networkx, we can find cycles in a graph using the nx.find_cycle function.

import networkx as nx

# Create a graph object with 6 nodes and 7 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6])
G.add_edges_from([(1,2),(2,3),(3,4),(4,5),(5,6),(6,1),(2,4)])

# Find cycles in the graph
cycles = nx.cycle_basis(G)

nx.draw_circular(G, with_labels=True)

# Print the cycles
print(cycles)
[[2, 4, 5, 6, 1], [2, 3, 4]]

The cycle_basis() function returns a list of cycles in the graph, where each cycle is represented as a list of nodes. From the output, we can see that there are two cycles in the graph: [2, 4, 5, 6, 1] and [2, 3, 4]. These cycles represent closed paths in the graph that start and end at the same vertex, and that do not repeat any vertex or edge (except for the starting and ending vertex).

A planar graph is a graph that can be drawn on a two-dimensional plane without any of its edges crossing. In other words, a planar graph is a graph that can be represented by a diagram that can be drawn on a flat surface without any edges crossing each other. It’s not trivial to decide if a given graph is planar or not. For example, would you say that the following graph is planar or not?

import networkx as nx
import matplotlib.pyplot as plt

# Create a graph object with 15 nodes
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])

# Add the edges
G.add_edges_from([(1,2),(1,3),(1,4),(1,5),(1,6),
                  (1,7),(1,8),(1,9),(1,10),(1,11),
                  (1,12),(1,13),(1,14),(1,15),(2,3),
                  (4,5),(6,7),(8,9),(10,11),(12,13),(14,15)])

nx.draw_random(G)
plt.show()

In networkx, we can check whether a graph is planar using the nx.check_planarity() function:

import networkx as nx

K5 = nx.complete_graph(5)

# Check if the graphs are planar
G_is_planar, _ = nx.check_planarity(G)
K5_is_planar, _ = nx.check_planarity(K5)

print(f"G is planar: {G_is_planar}")
print(f"K5 is planar: {K5_is_planar}")
G is planar: True
K5 is planar: False

We see that our graph GG indeed is planar, but the fully connected graph with only 5 nodes is not.

If a given graph is planar, then we can also make networkx draw it without edge crossings:

nx.draw_planar(G, with_labels=True)

In conclusion, graphs are an important mathematical and computer science tool used to model relationships between entities in a system. In this article, we have explored several common types of graphs, including circular graphs, utility graphs, and complete graphs. We have also discussed subgraphs, the degree of a vertex, the adjacency matrix, cycles, and planarity of graphs. These concepts are essential in graph theory and are used in a wide range of applications, from operations research and social sciences to computer science and network analysis. By using powerful tools like Python’s networkx library, we can easily create, visualize, and analyze graphs, enabling us to gain insights into complex systems.