Getting Started with Graphs and NetworkX in Python

Photo by Clint Adair on Unsplash

Graphs are a fundamental mathematical concept used to represent relationships and connections between entities. They have a wide range of applications across various fields, including social sciences, computer science, biology, and finance. Graphs can be used to model complex systems, analyze data, and make predictions. One popular tool for working with graphs in Python is the networkx package. This package provides a flexible and intuitive API for creating, manipulating, and analyzing complex networks or graphs. In this article, we will cover the basic definition of graphs, and how to create and visualize graphs using networkx. Furthermore, as an example of graph analysis algorithms, we will find the shortest path inside a graph between two different points.

What are Graphs?

Mathematical graphs are mathematical structures that consist of nodes (also called vertices) and edges. The nodes represent the entities in the graph, and the edges represent the relationships or connections between the entities. Graphs can be used to model various real-world problems, such as social networks, transportation systems, and computer networks, to name a few examples.

Directed graphs, also called digraphs, are graphs where the edges have a direction. This means that the edges only go from one node to another in a specific direction. In contrast, undirected graphs do not have a direction, and the edges can go from one node to another in any direction.

Multigraphs, finally, are graphs that can have multiple edges between the same two nodes.

Graphs in Python: Introducing networkx

NetworkX is a Python package for creating, manipulating, and analyzing complex networks or graphs. It provides a simple and flexible API for constructing graphs and performing various graph-related operations.

To use NetworkX, you need to install it using pip . You can do this by typing the following command in your terminal or command prompt:

pip install networkx

Once you have installed NetworkX, you can start using it in a Python script or a Jupyter notebook. To get started, you need to import the networkx and matplotlib packages, which are used for plotting and visualizing graphs. Here’s an example of how to create a simple graph using NetworkX and plot it using matplotlib:

In this example, we first import the networkx and matplotlib packages using the import statement. Then, we create an empty graph using the Graph class of NetworkX. We add two nodes with IDs 1 and 2 using the add_node() function, and an edge between them using the add_edge() function. Finally, we use the draw() function of NetworkX to plot the graph with labels and the show() function of matplotlib to actually display it.

As node names, we could have chosen anything. Numbers, strings, Python objects, whatever! As you can see, Graph creates an undirected graph. We can also use directed graphs, which are called DiGraph in networkx . Let's define another simple graph, but this time a directed one. Adding nodes one by one is tedious. But fortunately, networkx offers the add_notes_from() method which can be passes a list or iterable. The same applies for adding edges:

We can adjust the plot as we like using options dictionaries. For example, we can make this a bit more schematic by drawing everything in black:

The main purpose of networkx is not drawing such simple graphs, of course. Instead, the real power of networkx is to analyze graphs. For instance, we can easily find the shortest path between two given nodes. To illustrate this, we’ll create some bigger, random graph.

import random

# Create a random graph of 100 nodes:

num_nodes = 100

G = nx.Graph()
G.add_nodes_from(range(num_nodes))

# Give each node an edge to two other nodes, randomly chosen
for i in range(num_nodes):
    G.add_edge(i, random.randint(0, num_nodes-1))
    G.add_edge(i, random.randint(0, num_nodes-1))

# We are looking for the shortest distance from node 0 to node 99.
# Paint node 0 red and node 99 green.

color_map = ["gray"] * num_nodes
color_map[0] = "red"
color_map[-1] = "green"
    
nx.draw(G, node_color=color_map, with_labels=True)

This code example creates a random graph with 100 nodes and edges between them. It uses the random module from Python’s standard library to randomly add edges to each node, by choosing two other nodes randomly.

Then, the colors of nodes are assigned based on whether they are the starting node (red) or the ending node (green). This is done by creating a color_map list, where each element corresponds to a node in the graph. The first element of the color_map list is set to “red”, which corresponds to the starting node, and the last element is set to “green”, which corresponds to the ending node.

Finally, the draw() function of NetworkX is used to plot the graph with the assigned colors. The node_color parameter is set to the color_map list, which maps each node to its corresponding color.

Next, let’s compute the shortest path between the red and the green nodes:

Here, the shortest path between the starting (red) and ending (green) nodes in the random graph created in the previous example is computed using the shortest_path() function of NetworkX. The function takes as input the graph and the IDs of the starting and ending nodes. The result is stored in the variable path, which is a list of nodes that form the shortest path between the two nodes.

Let’s paint the shortest path red, i.e. all edges that are on the shortest path shall be drawn red, all others black:

edges_on_path = [(path[i], path[i+1]) for i in range(len(path)-1)]
edge_colors = []
for edge in G.edges:
    if edge in edges_on_path \
            or tuple(reversed(edge)) in edges_on_path:
        edge_colors.append("red")
    else:
        edge_colors.append("black")
        
nx.draw_spring(G, node_color=color_map, edge_color=edge_colors, with_labels=True)

The edges that are on the shortest path are stored in the edges_on_path list. This is done by iterating over the nodes in the path list and adding the edges between adjacent nodes to the edges_on_path list.

Then, the edge_colors list is created based on whether the edge is on the shortest path or not. If the edge is on the shortest path, its color is set to "red". Otherwise, its color is set to "black".

Finally, the draw_spring() function of NetworkX is used to plot the graph with the assigned node and edge colors. The node_color parameter is set to the color_map list, which maps each node to its corresponding color. The edge_color parameter is set to the edge_colors list, which maps each edge to its corresponding color.

You may wonder, why I used nx.draw in the first example but nx.draw_spring in the other example. Well, nx.draw() is just basic function in NetworkX for drawing a graph with a set of default parameters. It draws the graph using a simple algorithm that places the nodes in a circle and positions the edges accordingly. It can be used to create a basic visualization of the graph, but it may not be suitable for more complex graphs. nx.draw_spring() is a function in NetworkX that draws a graph using a force-directed layout algorithm. This algorithm simulates a physical system where the nodes repel each other and the edges act as springs that pull the nodes closer together. The resulting layout often tends to be more aesthetically pleasing and easier to interpret than the layout generated by nx.draw() . Other draw functions in NetworkX include:

Each of these functions has its own set of parameters for adjusting the layout and appearance of the graph. For example, you can adjust the node size, node color, edge color, and edge width using the options dictionaries in each of these functions. So, you have a lot of options, and you have to play a little bit to find out which option works best for your specific use case.

Conclusion

In conclusion, graphs are a powerful mathematical tool for modeling relationships and connections between entities in various fields. By using the networkx package in Python, you can easily create and analyze complex graphs, including directed and undirected graphs, multigraphs, and more. In this article, we have barely scratched the surface. We hope this article has whetted your appetite for learning more about graphs and networkx and has set you on track to apply them to your projects.