Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
Download

📚 The CoCalc Library - books, templates and other resources

132923 views
License: OTHER
Kernel: Python 3

Small World Graphs

Code examples from Think Complexity, 2nd edition.

Copyright 2016 Allen Downey, MIT License

%matplotlib inline import matplotlib.pyplot as plt import networkx as nx import numpy as np import seaborn as sns from utils import decorate, savefig # I set the random seed so the notebook # produces the same results every time. np.random.seed(17) # TODO: remove this when NetworkX is fixed from warnings import simplefilter import matplotlib.cbook simplefilter("ignore", matplotlib.cbook.mplDeprecation)
# node colors for drawing networks colors = sns.color_palette('pastel', 5) #sns.palplot(colors) sns.set_palette(colors)

Regular ring lattice

To make a ring lattice, I'll start with a generator function that yields edges between each node and the next halfk neighbors.

def adjacent_edges(nodes, halfk): """Yields edges between each node and `halfk` neighbors. halfk: number of edges from each node """ n = len(nodes) for i, u in enumerate(nodes): for j in range(i+1, i+halfk+1): v = nodes[j % n] yield u, v

We can test it with 3 nodes and halfk=1

nodes = range(3) for edge in adjacent_edges(nodes, 1): print(edge)

Now we use adjacent_edges to write make_ring_lattice

def make_ring_lattice(n, k): """Makes a ring lattice with `n` nodes and degree `k`. Note: this only works correctly if k is even. n: number of nodes k: degree of each node """ G = nx.Graph() nodes = range(n) G.add_nodes_from(nodes) G.add_edges_from(adjacent_edges(nodes, k//2)) return G

And we can test it out with n=10 and k=4

lattice = make_ring_lattice(10, 4)
nx.draw_circular(lattice, node_color='C0', node_size=1000, with_labels=True) savefig('figs/chap03-1')

Exercise: To see how this function fails when k is odd, run it again with k=3 or k=5.

WS graph

To make a WS graph, you start with a ring lattice and then rewire.

def make_ws_graph(n, k, p): """Makes a Watts-Strogatz graph. n: number of nodes k: degree of each node p: probability of rewiring an edge """ ws = make_ring_lattice(n, k) rewire(ws, p) return ws

Here's the function that does the rewiring

def rewire(G, p): """Rewires each edge with probability `p`. G: Graph p: float """ nodes = set(G) for u, v in G.edges(): if flip(p): choices = nodes - {u} - set(G[u]) new_v = np.random.choice(list(choices)) G.remove_edge(u, v) G.add_edge(u, new_v) def flip(p): """Returns True with probability `p`.""" return np.random.random() < p

Here's an example with p=0.2

ws = make_ws_graph(10, 4, 0.2) nx.draw_circular(ws, node_color='C1', node_size=1000, with_labels=True)

Just checking that we have the same number of edges we started with:

len(lattice.edges()), len(ws.edges())

Now I'll generate a plot that shows WS graphs for a few values of p

n = 10 k = 4 ns = 100 plt.subplot(1,3,1) ws = make_ws_graph(n, k, 0) nx.draw_circular(ws, node_size=ns) plt.axis('equal') plt.subplot(1,3,2) ws = make_ws_graph(n, k, 0.2) nx.draw_circular(ws, node_size=ns) plt.axis('equal') plt.subplot(1,3,3) ws = make_ws_graph(n, k, 1.0) nx.draw_circular(ws, node_size=ns) plt.axis('equal') #TODO: Set figure size savefig('figs/chap03-2')

Exercise: What is the order of growth of rewire?

# Solution goes here

Clustering

The following function computes the local clustering coefficient for a given node, u:

def node_clustering(G, u): """Computes local clustering coefficient for `u`. G: Graph u: node returns: float """ neighbors = G[u] k = len(neighbors) if k < 2: return np.nan possible = k * (k-1) / 2 exist = 0 for v, w in all_pairs(neighbors): if G.has_edge(v, w): exist +=1 return exist / possible def all_pairs(nodes): """Generates all pairs of nodes.""" for i, u in enumerate(nodes): for j, v in enumerate(nodes): if i < j: yield u, v

The network average clustering coefficient is just the mean of the local CCs.

def clustering_coefficient(G): """Average of the local clustering coefficients. G: Graph returns: float """ cu = [node_clustering(G, node) for node in G] return np.nanmean(cu)

In a ring lattice with k=4, the clustering coefficient for each node should be 0.5

lattice = make_ring_lattice(10, 4) node_clustering(lattice, 1)

And the network average should be 0.5

clustering_coefficient(lattice)

Correct.

%timeit clustering_coefficient(lattice)

Exercise: Write a version of node_clustering that replaces the for loop with a list comprehension. Is it faster?

# Solution goes here
%timeit clustering_coefficient(lattice)

Exercise: What is the order of growth of clustering_coefficient in terms of n, m, and k?

# Solution goes here

Path length

The following function computes path lengths between all pairs of nodes

def path_lengths(G): length_iter = nx.shortest_path_length(G) for source, dist_map in length_iter: for dest, dist in dist_map.items(): yield dist

The characteristic path length is the mean path length for all pairs.

def characteristic_path_length(G): return np.mean(list(path_lengths(G)))

On a complete graph, the average path length should be 1

complete = nx.complete_graph(10) characteristic_path_length(complete)

On a ring lattice with n=1000 and k=10, the mean is about 50

lattice = make_ring_lattice(1000, 10) characteristic_path_length(lattice)

Exercise: What is the mean path length in a ring lattice with n=10 and k=4?

# Solution goes here

The experiment

This function generates a WS graph with the given parameters and returns a pair of (mean path length, clustering coefficient):

def run_one_graph(n, k, p): """Makes a WS graph and computes its stats. n: number of nodes k: degree of each node p: probability of rewiring returns: tuple of (mean path length, clustering coefficient) """ ws = make_ws_graph(n, k, p) mpl = characteristic_path_length(ws) cc = clustering_coefficient(ws) print(mpl, cc) return mpl, cc

With n=1000 and k=10, it takes a few seconds on my computer:

%time run_one_graph(1000, 10, 0.01)

Now we'll run it with a range of values for p.

ps = np.logspace(-4, 0, 9) print(ps)

This function runs each value of p 20 times and returns a dictionary that maps from each p to a list of (mpl, cc) pairs.

def run_experiment(ps, n=1000, k=10, iters=20): """Computes stats for WS graphs with a range of `p`. ps: sequence of `p` to try n: number of nodes k: degree of each node iters: number of times to run for each `p` returns: """ res = [] for p in ps: print(p) t = [run_one_graph(n, k, p) for _ in range(iters)] means = np.array(t).mean(axis=0) print(means) res.append(means) return np.array(res)

Here are the raw results. Warning: this takes a few minutes to run.

%time res = run_experiment(ps)
res

Let's get the results into a form that's easy to plot.

L, C = np.transpose(res)
L
C

And normalize them so they both start at 1.0

L /= L[0] C /= C[0]

Here's the plot that replicates Watts and Strogatz's Figure 2.

plt.plot(ps, C, 's-', linewidth=1, label='C(p) / C(0)') plt.plot(ps, L, 'o-', linewidth=1, label='L(p) / L(0)') decorate(xlabel='Rewiring probability (p)', xscale='log', title='Normalized clustering coefficient and path length', xlim=[0.00009, 1.1], ylim=[-0.01, 1.01]) savefig('figs/chap03-3')

Now let's see how the shortest path algorithm works. We'll start with BFS, which is the basis for Dijkstra's algorithm.

Here's our old friend, the ring lattice:

lattice = make_ring_lattice(10, 4)
nx.draw_circular(lattice, node_color='C2', node_size=1000, with_labels=True)

And here's my implementation of BFS using a deque.

from collections import deque def reachable_nodes_bfs(G, start): """Finds reachable nodes by BFS. G: graph start: node to start at returns: set of reachable nodes """ seen = set() queue = deque([start]) while queue: node = queue.popleft() if node not in seen: seen.add(node) queue.extend(G.neighbors(node)) return seen

It works:

reachable_nodes_bfs(lattice, 0)

Here's a version that's a little faster, but maybe less readable.

def reachable_nodes_bfs(G, start): """Finds reachable nodes by BFS. G: graph start: node to start at returns: set of reachable nodes """ seen = set() queue = deque([start]) while queue: node = queue.popleft() if node not in seen: seen.add(node) neighbors = set(G[node]) - seen queue.extend(neighbors) return seen

It works, too.

reachable_nodes_bfs(lattice, 0)

Dijkstra's algorithm

Now we're ready for Dijkstra's algorithm, at least for graphs where all the edges have the same weight/length.

def shortest_path_dijkstra(G, source): """Finds shortest paths from `source` to all other nodes. G: graph source: node to start at returns: make from node to path length """ dist = {source: 0} queue = deque([source]) while queue: node = queue.popleft() new_dist = dist[node] + 1 neighbors = set(G[node]).difference(dist) for n in neighbors: dist[n] = new_dist queue.extend(neighbors) return dist

Again, we'll test it on a ring lattice.

lattice = make_ring_lattice(10, 4)
nx.draw_circular(lattice, node_color='C3', node_size=1000, with_labels=True)

Here's my implementation:

d1 = shortest_path_dijkstra(lattice, 0) d1

And here's the result from NetworkX:

d2 = nx.shortest_path_length(lattice, 0) d2

They are the same:

d1 == d2

Exercise: In a ring lattice with n=1000 and k=10, which node is farthest from 0 and how far is it? Use shortest_path_dijkstra to check your answer.

Note: the maximum distance between two nodes is the diameter of the graph.

# Solution goes here

Exercises

Exercise: In a ring lattice, every node has the same number of neighbors. The number of neighbors is called the degree of the node, and a graph where all nodes have the same degree is called a regular graph.

All ring lattices are regular, but not all regular graphs are ring lattices. In particular, if k is odd, we can't construct a ring lattice, but we might be able to construct a regular graph.

Write a function called make_regular_graph that takes n and k and returns a regular graph that contains n nodes, where every node has k neighbors. If it's not possible to make a regular graph with the given values of n and k, the function should raise a ValueError.

# Solution goes here
# Solution goes here
# Solution goes here
# Solution goes here

Exercise: My implementation of reachable_nodes_bfs is efficient in the sense that it is in O(n+m)O(n + m), but it incurs a lot of overhead adding nodes to the queue and removing them. NetworkX provides a simple, fast implementation of BFS, available from the NetworkX repository on GitHub.

Here is a version I modified to return a set of nodes:

def plain_bfs(G, start): """A fast BFS node generator""" seen = set() nextlevel = {start} while nextlevel: thislevel = nextlevel nextlevel = set() for v in thislevel: if v not in seen: seen.add(v) nextlevel.update(G[v]) return seen

Compare this function to reachable_nodes_bfs and see which is faster. Then see if you can modify this function to implement a faster version of shortest_path_dijkstra

# Solution goes here
# Solution goes here
# Solution goes here
# Solution goes here
# Solution goes here
# Solution goes here
# Solution goes here
# Solution goes here

Exercise: The following implementation of a BFS contains two performance errors. What are they? What is the actual order of growth for this algorithm?

def bfs(G, start): """Breadth-first search on a graph, starting at top_node.""" visited = set() queue = [start] while len(queue): curr_node = queue.pop(0) # Dequeue visited.add(curr_node) # Enqueue non-visited and non-enqueued children queue.extend(c for c in G[curr_node] if c not in visited and c not in queue) return visited
# Solution goes here

Exercise: In the book, I claimed that Dijkstra's algorithm does not work unless it uses BFS. Write a version of shortest_path_dijkstra that uses DFS and test it on a few examples to see what goes wrong.

# Solution goes here