A Primer in Graph Theory with Python
Graph theory is a branch of mathematics and computer science that deals with the study of graphs, which are a collection of vertices and edges that connect them. Graphs are used to model relationships between objects or entities in a system, and are widely used in various fields including computer science, operations research, social sciences, and more. In this article, we will introduce some of the common types of graphs, such as circular graphs, utility graphs, and complete graphs. We will also discuss subgraphs, the degree of a vertex, adjacency matrices, cycles, and planarity. We will use Python’s networkx library, a powerful tool for working with graphs and creating graph diagrams, to illustrate and analyze these concepts.
If you haven’t already done so, install networkx by submitting
pip install networkx
to follow along this tutorial.
So let’s introduce a few common graphs. Circular graphs are a type of graph that can be drawn on a circle. They are often used to represent cyclic or periodic data. In a circular graph, each vertex is evenly spaced around the circle, and the edges connect adjacent vertices.
Here’s an example of a circular graph with 6 nodes and 6 edges,
created using networkx and its the
nx.cycle_graph()
function to
create the graph object. We then use the
nx.draw_circular()
function
to draw the graph in a circular layout.
import networkx as nx
import matplotlib.pyplot as plt
# Create a circular graph object
G = nx.cycle_graph(6)
# Draw the circular graph
nx.draw_circular(G, with_labels=True)
plt.show()
Another common type of graph is the so-called utility graph. Utility graphs are often used to model situations where one set of objects provides a utility or benefit to the other set of objects. In a utility graph, each node represents an object from one set, and each edge represents a connection to an object in the other set.
Here’s an example of a utility graph with 5 nodes in each set:
import networkx as nx
import matplotlib.pyplot as plt
# Create a utility graph object
G = nx.Graph()
# Add nodes to the graph
G.add_nodes_from([1,2,3,4,5], bipartite=0)
G.add_nodes_from([6,7,8,9,10], bipartite=1)
# Add edges to the graph
for k in range(1, 6):
for j in range(6, 11):
G.add_edge(k, j)
# Draw the utility graph
pos = nx.bipartite_layout(G, [1,2,3,4,5])
nx.draw(G, pos, with_labels=True)
plt.show()
Here, we have used the
bipartite
option while
adding nodes. The bipartite option in networkx is used to
specify the set to which a node belongs in a bipartite graph. A
bipartite graph is a type of graph where the nodes can be
divided into two sets such that each edge connects a node in one
set to a node in the other set. In other words, there are no
edges between nodes in the same set.
When creating a bipartite graph in networkx, you can use the
bipartite
option to
specify which set a node belongs to. This is useful when you
want to perform bipartite graph algorithms on the graph, such as
finding maximum matchings or performing network analysis on
bipartite graphs.
As another example of a very common type of graph, consider the so-called complete graph. It is called complete because every pair of distinct nodes is connected by an edge. In other words, a complete graph is a graph where each node is directly connected to every other node in the graph.
A complete graph with n nodes is often denoted as Kn, where n is the number of nodes. The total number of edges in a complete graph with n nodes is n(n−1)/2.
Since it is so common, networkx provides a convenience function to generate this type of graph:
import networkx as nx
import matplotlib.pyplot as plt
# Create a complete graph object with 7 nodes
G = nx.complete_graph(7)
# Draw the complete graph
nx.draw_circular(G, with_labels=True)
plt.show()
Let’s assert that the number of edges is really n(n−1)/2. We get the edges by
G.edges
EdgeView([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)])
and the nodes by
G.nodes
NodeView((0, 1, 2, 3, 4, 5, 6))
so our assertion is
n = len(G.nodes)
len(G.edges) == n * (n-1) / 2
True
The complement of a graph is another graph that contains all the nodes that are not present in the original graph, but does not contain any of the edges that are present in the original graph. In other words, if G is a graph, then the complement of G is a graph that has the same set of nodes as G, but contains all the edges that are not present in G. The complement of a graph is often denoted as G′.
Here’s an example:
import networkx as nx
import matplotlib.pyplot as plt
# Create a graph object with 5 nodes and 3 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5])
G.add_edges_from([(1,2),(2,3),(4,5)])
# Create the complement graph
G_comp = nx.complement(G)
# Draw both graphs
plt.subplot(121)
nx.draw_circular(G, with_labels=True)
plt.title('Original Graph')
plt.subplot(122)
nx.draw_circular(G_comp, with_labels=True)
plt.title('Complement Graph')
plt.show()
A subgraph is a graph that is formed by taking a subset of the vertices and edges of an original graph. In other words, if G is a graph, then any graph H that is formed by taking a subset of the vertices and edges of G is a subgraph of G. A subgraph can be formed by removing one or more vertices or edges from the original graph, or by selecting a subset of the vertices and edges from the original graph.
Again, an example should make things more clear:
import networkx as nx
import matplotlib.pyplot as plt
# Create a graph object with 6 nodes and 7 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6])
G.add_edges_from([(1,2),(2,3),(3,4),(4,5),(5,6),(6,1),(2,4)])
# Create a subgraph by selecting a subset of the edges
H = G.edge_subgraph([(1,2),(2,3),(3,4),(4,5)])
# Draw both graphs
plt.subplot(121)
nx.draw_circular(G, with_labels=True)
plt.title('Original Graph')
plt.subplot(122)
nx.draw_circular(H, with_labels=True)
plt.title('Subgraph')
plt.show()
Here, we have created a subgraph by selecting a subset of the
edges from the original graph using the
edge_subgraph()
function.
The resulting subgraph contains only the nodes and edges that
were selected from the original graph. We can see that in the
subgraph diagram, only the selected edges and their
corresponding nodes are present, while the other edges and nodes
are absent.
There are some metrics that allow for a better analysis of graphs. One of the most important metrics is the degree. The degree of a vertex is the number of edges that are connected to the vertex. In other words, the degree of a vertex is the number of other vertices that are directly connected to that vertex.
In networkx, we can calculate the degree of a vertex using the degree() function.
import networkx as nx
import matplotlib.pyplot as plt
# Create a graph object with 6 nodes and 7 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6])
G.add_edges_from([(1,2),(2,3),(3,4),(4,5),(5,6),(6,1),(2,4)])
# Calculate the degree of each vertex
degrees = dict(G.degree())
# Print the degree of each vertex
for vertex, degree in degrees.items():
print("Vertex", vertex, "has degree", degree)
# Draw the graph
nx.draw(G, with_labels=True)
plt.show()
Vertex 1 has degree 2
Vertex 2 has degree 3
Vertex 3 has degree 2
Vertex 4 has degree 3
Vertex 5 has degree 2
Vertex 6 has degree 2
As nice as a graphical representation is, sometimes we want something that is better suited for computation. Enter the adjacency matrix. The adjacency matrix is a square matrix that represents the connections between the nodes of a graph. In an adjacency matrix, the rows and columns of the matrix represent the nodes of the graph, and the entries in the matrix indicate whether there is an edge between each pair of nodes.
If G is a graph with n nodes, the adjacency matrix A of G is an n x n matrix such that A[i,j] is 1 if there is an edge from node i to node j, and 0 otherwise. If the graph is undirected, the adjacency matrix is symmetric, meaning that A[i,j] = A[j,i] for all i and j.
The adjacency matrix is a useful tool for analyzing the properties of a graph, as it allows us to quickly determine which nodes are connected to each other, and how many edges are present in the graph. The adjacency matrix can also be used to calculate various metrics of the graph, such as the degree sequence, the number of cycles, and the eigenvalues of the graph.
In networkx, we can generate the adjacency matrix of a graph
using the
nx.to_numpy_array()
function. Here's an example of how to generate the
adjacency matrix for a graph with 4 nodes and 4 edges:
import networkx as nx
import numpy as np
# Create a graph object with 4 nodes and 4 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4])
G.add_edges_from([(1,2),(2,3),(3,4),(4,1)])
# Generate the adjacency matrix
adj_mat = nx.to_numpy_array(G)
nx.draw_circular(G, with_labels=True)
# Print the adjacency matrix
print(adj_mat)
[[0. 1. 0. 1.]
[1. 0. 1. 0.]
[0. 1. 0. 1.]
[1. 0. 1. 0.]]
From the output, we can see that the adjacency matrix is a 4 x 4 matrix, where each entry represents the presence or absence of an edge between two nodes. For example, the entry in the first row and second column is 1, indicating that there is an edge between nodes 1 and 2. And since we have an indirected graph, the matrix is symmetric.
A cycle in a graph is a path of edges that starts and ends at the same vertex, and that does not repeat any vertex or edge (except for the starting and ending vertex, which are the same). In other words, a cycle is a closed path that visits a sequence of nodes without visiting the same node twice, except for the starting and ending node.
Cycles are an important concept in graph theory, as they are used to study the properties of graphs and to develop algorithms for graph problems. For example, cycles can be used to detect whether a graph is bipartite or not, and to find the shortest path between two nodes in a graph.
In a directed graph, a cycle is called a directed cycle or a circuit, and it is defined as a path of directed edges that starts and ends at the same vertex, and that does not repeat any vertex or edge (except for the starting and ending vertex, which are the same).
Cycles can be found using various algorithms, such as
depth-first search or breadth-first search. In networkx, we can
find cycles in a graph using the
nx.find_cycle
function.
import networkx as nx
# Create a graph object with 6 nodes and 7 edges
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6])
G.add_edges_from([(1,2),(2,3),(3,4),(4,5),(5,6),(6,1),(2,4)])
# Find cycles in the graph
cycles = nx.cycle_basis(G)
nx.draw_circular(G, with_labels=True)
# Print the cycles
print(cycles)
[[2, 4, 5, 6, 1], [2, 3, 4]]
The cycle_basis() function returns a list of cycles in the graph, where each cycle is represented as a list of nodes. From the output, we can see that there are two cycles in the graph: [2, 4, 5, 6, 1] and [2, 3, 4]. These cycles represent closed paths in the graph that start and end at the same vertex, and that do not repeat any vertex or edge (except for the starting and ending vertex).
A planar graph is a graph that can be drawn on a two-dimensional plane without any of its edges crossing. In other words, a planar graph is a graph that can be represented by a diagram that can be drawn on a flat surface without any edges crossing each other. It’s not trivial to decide if a given graph is planar or not. For example, would you say that the following graph is planar or not?
import networkx as nx
import matplotlib.pyplot as plt
# Create a graph object with 15 nodes
G = nx.Graph()
G.add_nodes_from([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])
# Add the edges
G.add_edges_from([(1,2),(1,3),(1,4),(1,5),(1,6),
(1,7),(1,8),(1,9),(1,10),(1,11),
(1,12),(1,13),(1,14),(1,15),(2,3),
(4,5),(6,7),(8,9),(10,11),(12,13),(14,15)])
nx.draw_random(G)
plt.show()
In networkx, we can check whether a graph is planar using the
nx.check_planarity()
function:
import networkx as nx
K5 = nx.complete_graph(5)
# Check if the graphs are planar
G_is_planar, _ = nx.check_planarity(G)
K5_is_planar, _ = nx.check_planarity(K5)
print(f"G is planar: {G_is_planar}")
print(f"K5 is planar: {K5_is_planar}")
G is planar: True
K5 is planar: False
We see that our graph GG indeed is planar, but the fully connected graph with only 5 nodes is not.
If a given graph is planar, then we can also make networkx draw it without edge crossings:
nx.draw_planar(G, with_labels=True)
In conclusion, graphs are an important mathematical and computer science tool used to model relationships between entities in a system. In this article, we have explored several common types of graphs, including circular graphs, utility graphs, and complete graphs. We have also discussed subgraphs, the degree of a vertex, the adjacency matrix, cycles, and planarity of graphs. These concepts are essential in graph theory and are used in a wide range of applications, from operations research and social sciences to computer science and network analysis. By using powerful tools like Python’s networkx library, we can easily create, visualize, and analyze graphs, enabling us to gain insights into complex systems.