📚 The CoCalc Library - books, templates and other resources
License: OTHER
Erdos-Renyi Graphs
Code examples from Think Complexity, 2nd edition.
Copyright 2016 Allen Downey, MIT License
Directed graph
The first example is a directed graph that represents a social network with three nodes.
Here's how we add edges between nodes.
And here's how to draw the graph.
Exercise: Add another node and a few more edges and draw the graph again.
Undirected graph
The second example is an undirected graph that represents cities and the driving times between them.
positions is a dictionary that maps from each city to its coordinates.
We can use the keys in pos to add nodes to the graph.
drive_times is a dictionary that maps from pairs of cities to the driving times between them.
We can use the keys from drive_times to add edges to the graph.
Now we can draw the graph using positions to indicate the positions of the nodes, and drive_times to label the edges.
Exercise: Add another city and at least one edge.
Complete graph
To make a complete graph, we use a generator function that iterates through all pairs of nodes.
make_complete_graph makes a Graph with the given number of nodes and edges between all pairs of nodes.
Here's a complete graph with 10 nodes:
And here's what it looks like.
The neighbors method the neighbors for a given node.
Exercise: Make and draw complete directed graph with 5 nodes.
Random graphs
Next we'll make a random graph where the probability of an edge between each pair of nodes is .
The helper function flip returns True with probability p and False with probability 1-p
random_pairs is a generator function that enumerates all possible pairs of nodes and yields each one with probability p
make_random_graph makes an ER graph where the probability of an edge between each pair of nodes is p.
Here's an example with n=10 and p=0.3
And here's what it looks like:
Connectivity
To check whether a graph is connected, we'll start by finding all nodes that can be reached, starting with a given node:
In the complete graph, starting from node 0, we can reach all nodes:
In the random graph we generated, we can also reach all nodes (but that's not always true):
We can use reachable_nodes to check whether a graph is connected:
Again, the complete graph is connected:
But if we generate a random graph with a low value of p, it's not:
Exercise: What do you think it means for a directed graph to be connected? Write a function that checks whether a directed graph is connected.
Probability of connectivity
Now let's estimare the probability that a randomly-generated ER graph is connected.
This function takes n and p, generates iters graphs, and returns the fraction of them that are connected.
With n=10 and p=0.23, the probability of being connected is about 33%.
According to Erdos and Renyi, the critical value of p for n=10 is about 0.23.
So let's plot the probability of connectivity for a range of values for p
I'll estimate the probabilities with iters=1000
And then plot them, adding a vertical line at the computed critical value
We can run the same analysis for a few more values of n.
As n increases, the critical value gets smaller and the transition gets more abrupt.
Exercises
Exercise: In Chapter 2 we analyzed the performance of reachable_nodes and classified it in , where is the number of nodes and is the number of edges. Continuing the analysis, what is the order of growth for is_connected?
Exercise: In my implementation of reachable_nodes, you might be bothered by the apparent inefficiency of adding all neighbors to the stack without checking whether they are already in seen. Write a version of this function that checks the neighbors before adding them to the stack. Does this "optimization" change the order of growth? Does it make the function faster?
Exercise: There are actually two kinds of ER graphs. The one we generated in the chapter, , is characterized by two parameters, the number of nodes and the probability of an edge between nodes.
An alternative definition, denoted , is also characterized by two parameters: the number of nodes, , and the number of edges, . Under this definition, the number of edges is fixed, but their location is random.
Repeat the experiments we did in this chapter using this alternative definition. Here are a few suggestions for how to proceed:
Write a function called
m_pairsthat takes a list of nodes and the number of edges, , and returns a random selection of edges. A simple way to do that is to generate a list of all possible edges and userandom.sample.Write a function called
make_m_graphthat takes and and returns a random graph with nodes and edges.Make a version of
prob_connectedthat usesmake_m_graphinstead ofmake_random_graph.Compute the probability of connectivity for a range of values of .
How do the results of this experiment compare to the results using the first type of ER graph?