📚 The CoCalc Library - books, templates and other resources
License: OTHER
Erdos-Renyi Graphs
Code examples from Think Complexity, 2nd edition.
Copyright 2016 Allen Downey, MIT License
Directed graph
The first example is a directed graph that represents a social network with three nodes.
Here's how we add edges between nodes.
And here's how to draw the graph.
Exercise: Add another node and a few more edges and draw the graph again.
Undirected graph
The second example is an undirected graph that represents cities and the driving times between them.
positions
is a dictionary that maps from each city to its coordinates.
We can use the keys in pos
to add nodes to the graph.
drive_times
is a dictionary that maps from pairs of cities to the driving times between them.
We can use the keys from drive_times
to add edges to the graph.
Now we can draw the graph using positions
to indicate the positions of the nodes, and drive_times
to label the edges.
Exercise: Add another city and at least one edge.
Complete graph
To make a complete graph, we use a generator function that iterates through all pairs of nodes.
make_complete_graph
makes a Graph
with the given number of nodes and edges between all pairs of nodes.
Here's a complete graph with 10 nodes:
And here's what it looks like.
The neighbors
method the neighbors for a given node.
Exercise: Make and draw complete directed graph with 5 nodes.
Random graphs
Next we'll make a random graph where the probability of an edge between each pair of nodes is .
The helper function flip
returns True with probability p
and False with probability 1-p
random_pairs
is a generator function that enumerates all possible pairs of nodes and yields each one with probability p
make_random_graph
makes an ER graph where the probability of an edge between each pair of nodes is p
.
Here's an example with n=10
and p=0.3
And here's what it looks like:
Connectivity
To check whether a graph is connected, we'll start by finding all nodes that can be reached, starting with a given node:
In the complete graph, starting from node 0, we can reach all nodes:
In the random graph we generated, we can also reach all nodes (but that's not always true):
We can use reachable_nodes
to check whether a graph is connected:
Again, the complete graph is connected:
But if we generate a random graph with a low value of p
, it's not:
Exercise: What do you think it means for a directed graph to be connected? Write a function that checks whether a directed graph is connected.
Probability of connectivity
Now let's estimare the probability that a randomly-generated ER graph is connected.
This function takes n
and p
, generates iters
graphs, and returns the fraction of them that are connected.
With n=10
and p=0.23
, the probability of being connected is about 33%.
According to Erdos and Renyi, the critical value of p
for n=10
is about 0.23.
So let's plot the probability of connectivity for a range of values for p
I'll estimate the probabilities with iters=1000
And then plot them, adding a vertical line at the computed critical value
We can run the same analysis for a few more values of n
.
As n
increases, the critical value gets smaller and the transition gets more abrupt.
Exercises
Exercise: In Chapter 2 we analyzed the performance of reachable_nodes
and classified it in , where is the number of nodes and is the number of edges. Continuing the analysis, what is the order of growth for is_connected
?
Exercise: In my implementation of reachable_nodes
, you might be bothered by the apparent inefficiency of adding all neighbors to the stack without checking whether they are already in seen
. Write a version of this function that checks the neighbors before adding them to the stack. Does this "optimization" change the order of growth? Does it make the function faster?
Exercise: There are actually two kinds of ER graphs. The one we generated in the chapter, , is characterized by two parameters, the number of nodes and the probability of an edge between nodes.
An alternative definition, denoted , is also characterized by two parameters: the number of nodes, , and the number of edges, . Under this definition, the number of edges is fixed, but their location is random.
Repeat the experiments we did in this chapter using this alternative definition. Here are a few suggestions for how to proceed:
Write a function called
m_pairs
that takes a list of nodes and the number of edges, , and returns a random selection of edges. A simple way to do that is to generate a list of all possible edges and userandom.sample
.Write a function called
make_m_graph
that takes and and returns a random graph with nodes and edges.Make a version of
prob_connected
that usesmake_m_graph
instead ofmake_random_graph
.Compute the probability of connectivity for a range of values of .
How do the results of this experiment compare to the results using the first type of ER graph?