Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
Download

Jupyter notebook M725F17/05 - Graph Coloring/Graph Coloring.ipynb

1516 views
Kernel: Python 2 (SageMath)

Graph Coloring

by Nathan Albin <[email protected]>
created 2/8/2017, updated 2/12/2017

The graph coloring problem

Graph coloring is a famous type of problem in graph theory. The most basic problem of this type is the following. Suppose you are given a network G=(V,E)G=(V,E) with vertices VV and edges EE along with kk different colors. Is it possible to assign a color to each vertex in such a way that no two vertices connected by an edge have the same color. For example, a triangle (the complete graph on 3 vertices, K3K_3) can be colored with k3k\ge 3 but not with k<3k<3. The smallest number of colors required to color a given graph GG is called its chromatic number, χ(G)\chi(G). So χ(K3)=3\chi(K_3)=3. Also, a graph GG is bipartite if and only if χ(G)=2\chi(G)=2. The famous Four Color Theorem states that every planar graph can be colored with 44 colors.

One obvious upper bound on the chromatic number of a general graph is χ(G)V\chi(G) \le |V|, just color each vertex a unique color.

A slightly better bound is χ(G)maxvVdeg(v)+1, \chi(G) \le \max_{v\in V}\deg(v)+1, that is, the chromatic number of a graph is no larger than 1 plus the maximum degree of any vertex in GG. One way to prove this is via a simple algorithm.

Basic algorithm:

Whenever you must choose a color for a vertex that is still blank, simply pick a color that is different from any colors already assigned to the neighbors.

To begin, below we implement this basic algorithm. Read the code below line by line and make sure you understand what is happening. If you are unsure, don't hesitate to insert a new cell from the toolbar under the word JUPYTER above, and print out some of the objects that are being called.

%matplotlib inline import matplotlib.pyplot as plt import networkx as nx
# Create a random graph. The "seed" ensures reproducibility, which is handy when # first testing code. Change the seed to change the graph. seed = 818829 n = 10 p = 0.3 G = nx.gnp_random_graph(n,p,seed) # draw the graph pos = nx.spring_layout(G) plt.figure(figsize=(5,5)) nx.draw(G,pos,node_color='white',width=2) # get the maximum vertex degree max_deg = max( G.degree().values() ) print 'maximum degree = {}'.format(max_deg)
maximum degree = 5
<matplotlib.figure.Figure at 0x7f059c082d50>
# Create the set of possible colors. num_colors = max_deg+1 colors = set(xrange(num_colors)) # We'll store the color of each node as an integer property called 'color'. Start by initializing # all the colors to -1 (meaning not yet colored) for v in G.nodes_iter(): G.node[v]['color'] = -1 # loop over all nodes for v in G.nodes_iter(): print 'processing node {}'.format(v) # copy the possible colors possible = colors.copy() # find colors of neighbors nbrs = set( [ G.node[w]['color'] for w in G.neighbors_iter(v) if G.node[w]['color'] <> -1 ] ) # remove neighbor colors from the set of possible colors possible -= nbrs # pick a color for the current node c = next(iter(possible)) print ' neighbor colors: ', nbrs print ' possible colors: ', possible print ' selected color: ', c #update the color G.node[v]['color'] = c
processing node 0 neighbor colors: set([]) possible colors: set([0, 1, 2, 3, 4, 5]) selected color: 0 processing node 1 neighbor colors: set([0]) possible colors: set([1, 2, 3, 4, 5]) selected color: 1 processing node 2 neighbor colors: set([1]) possible colors: set([0, 2, 3, 4, 5]) selected color: 0 processing node 3 neighbor colors: set([0]) possible colors: set([1, 2, 3, 4, 5]) selected color: 1 processing node 4 neighbor colors: set([1]) possible colors: set([0, 2, 3, 4, 5]) selected color: 0 processing node 5 neighbor colors: set([0]) possible colors: set([1, 2, 3, 4, 5]) selected color: 1 processing node 6 neighbor colors: set([1]) possible colors: set([0, 2, 3, 4, 5]) selected color: 0 processing node 7 neighbor colors: set([0]) possible colors: set([1, 2, 3, 4, 5]) selected color: 1 processing node 8 neighbor colors: set([1]) possible colors: set([0, 2, 3, 4, 5]) selected color: 0 processing node 9 neighbor colors: set([0]) possible colors: set([1, 2, 3, 4, 5]) selected color: 1 processing node 10 neighbor colors: set([0, 1]) possible colors: set([2, 3, 4, 5]) selected color: 2
plt.figure(figsize=(5,5)) nx.draw(G,pos,node_color=[d['color'] for v,d in G.nodes_iter(data=True)],width=2,cmap=plt.cm.Set3)
<matplotlib.figure.Figure at 0x7f05937edc90>
print G.nodes(data=True)
[(0, {'color': 0}), (1, {'color': 0}), (2, {'color': 1}), (3, {'color': 1}), (4, {'color': 2}), (5, {'color': 2}), (6, {'color': 3}), (7, {'color': 1}), (8, {'color': 1}), (9, {'color': 0})]
Exercise 1
In the preceding code, notice that I used the Python object called a set. In a sentence or two, explain what properties of a set make it more beneficial than a list for the task at hand. Type your answer where indicated below. (Hint: If you don't know, try asking Google.)
Answer

It is easier to implement set where we need to check uniqness of elements and we need to chack a new element is in the set or not. It is more suitable for set operations, for example union, intersection etc.

A little better visualization

For some graphs, it can be a little tricky to see whether or not the coloring is correct. Here's another way to layout node positions to help with the visualization. With this layout, you only need to verify that there aren't any vertical edges.

def draw_coloring(G,num_colors): '''Visualizes a graph coloring. Inputs: G : a NetworkX graph object with the 'color' property set to an integer between 0 and num_colors-1 on each node num_colors : the number of colors in the coloring ''' # keep track of the number of nodes for each color counts = { j:0 for j in xrange(num_colors) } # initialize an empty position dictionary pos = {} # loop over the nodes for v in G.nodes_iter(): # determine the position from the color c = G.node[v]['color'] pos[v] = (c,counts[c]) # update the color count counts[c] += 1 # draw the graph plt.figure(figsize=(10,5)) plt.subplot(1,2,1) nx.draw(G,node_color=[d['color'] for v,d in G.nodes_iter(data=True)],width=1,edge_color='gray',cmap=plt.cm.Set3) plt.subplot(1,2,2) nx.draw(G,pos,node_color=[d['color'] for v,d in G.nodes_iter(data=True)],width=1,edge_color='gray',cmap=plt.cm.Set3) # visualize the coloring draw_coloring(G,num_colors)
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-14-a4862ca21c66> in <module>() 34 35 # visualize the coloring ---> 36 draw_coloring(G,num_colors) <ipython-input-14-a4862ca21c66> in draw_coloring(G, num_colors) 20 21 # determine the position from the color ---> 22 c = G.node[v]['color'] 23 pos[v] = (c,counts[c]) 24 KeyError: 'color'

Improving the Bound

It's not too hard to see that the maximum degree bound is overly pessimistic. Here's an example.

# create a wheel graph n = 11 G = nx.cycle_graph(n-1) pos = nx.circular_layout(G) G.add_node(n-1) G.add_edges_from([(n-1,i) for i in xrange(n-1)]) pos[n-1] = (0,0) # draw graph plt.figure(figsize=(5,5)) nx.draw(G,pos,node_color='white',width=2)
<matplotlib.figure.Figure at 0x7f05915ae650>

The graph above has chromatic number 3, which is much smaller than 11, the bound given by the maximum degree. To improve the estimate of χ(G)\chi(G), let's start by sorting the nodes according to degree. The first step is to create a list with entries (deg(a),a).

unsorted_degrees = [ (G.degree(v),v) for v in G.nodes_iter()] print unsorted_degrees
[(3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (10, 10)]

Now, let's sort from smallest degree to largest. (Note that this is why we chose the ordering (deg(v),v) rather than (v,deg(v)).) The sorting algorithm compares the first entries of each tuple, then the second, and so on.

sorted_degrees = sorted(unsorted_degrees) print sorted_degrees
[(3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (10, 10)]

If we now color the nodes in this order, we might expect to do better than the max-degree-plus-one bound. By starting with small-degree nodes first, we hope to minimize the number of color conflicts as we choose the color of later nodes. The following code shows the first few steps.

# initialize node colors to -1 (currently uncolored) for v in G.nodes_iter(): G.node[v]['color'] = -1 # look at the first node in the sorted list # (The double-underscore '__' is just a dummy variable. # We needed the degrees to sort the nodes, but now we can # discard that information.) __,v = sorted_degrees[0] print 'processing node {}'.format(v) # form a list of the colors already assigned to neighbors of this node. # (of course, there won't be any.) nbrs = set( [ G.node[w]['color'] for w in G.neighbors_iter(v) if G.node[w]['color'] != -1] ) print ' neighbor colors: ', nbrs # find the first color 0, 1, 2, ... not present in the current list new_color = 0 while new_color in nbrs: new_color += 1 print ' new node color: ', new_color # color the new node G.node[v]['color'] = new_color # look at the next node in the list __,v = sorted_degrees[1] print 'processing node {}'.format(v) # form a list of the colors already assigned to neighbors of this node. nbrs = set( [ G.node[w]['color'] for w in G.neighbors_iter(v) if G.node[w]['color'] != -1] ) print ' neighbor colors: ', nbrs # find the first color 0, 1, 2, ... not present in the current list new_color = 0 while new_color in nbrs: new_color += 1 print ' new node color: ', new_color # color the new node G.node[v]['color'] = new_color
processing node 0 neighbor colors: set([]) new node color: 0 processing node 1 neighbor colors: set([0]) new node color: 1
Exercise 2
Complete the following function to compute a graph coloring. Be sure to read the function's docstring so you know what the function is supposed to do.
def color_graph(G): '''Finds a coloring of the graph G. Inputs: G : a NetworkX graph Output: num_colors : an integer representing the number of colors used Side effects: To aid in debugging, the function will output some information as it processes the network. For a real application, you'd probably want to remove this, or make it optional. When this function completes, each node of G will have its 'color' attribute set to an integer between 0 and num_colors-1, indicating this nodes color in the coloring. ''' # initialize node colors to -1 (currently uncolored) for v in G.nodes_iter(): G.node[v]['color'] = -1 # initialize number of colors num_colors = 0 # sort nodes by degree unsorted_degrees = [ (G.degree(v),v) for v in G.nodes_iter()] sorted_degrees = sorted(unsorted_degrees) # loop over nodes for __,v in sorted_degrees: print 'processing node {}'.format(v) # form a list of the colors already assigned to neighbors of this node. nbrs = set( [ G.node[w]['color'] for w in G.neighbors_iter(v) if G.node[w]['color'] <> -1 ] ) print ' neighbor colors: ', nbrs # find the first color 0, 1, 2, ... not present in the current list new_color = 0 while new_color in nbrs: new_color += 1 print ' new node color: ', new_color # color the new node c = next(iter(possible)) # update the number of colors num_colors = max(num_colors,new_color+1) return num_colors

When you're finished, your code should be able to color the wheel with 3 colors.

num_colors = color_graph(G) draw_coloring(G,num_colors)
processing node 0 neighbor colors: set([0, 1]) new node color: 2 processing node 1 neighbor colors: set([0, 1]) new node color: 2 processing node 2 neighbor colors: set([0, 1]) new node color: 2 processing node 3 neighbor colors: set([0, 1]) new node color: 2 processing node 4 neighbor colors: set([0, 1]) new node color: 2 processing node 5 neighbor colors: set([0, 1]) new node color: 2 processing node 6 neighbor colors: set([0, 1]) new node color: 2 processing node 7 neighbor colors: set([0, 1]) new node color: 2 processing node 8 neighbor colors: set([0, 1]) new node color: 2 processing node 9 neighbor colors: set([0, 1]) new node color: 2 processing node 10 neighbor colors: set([0, 1]) new node color: 2
<matplotlib.figure.Figure at 0x7f05935d1c50>

Some tests

Your code will be compared with a correct solution to the following graphs. Here's how you can try to check your code yourself.

  1. Save your notebook!

  2. In 'Kernel' menu above, select 'Restart Kernel' and click the red 'Restart button'.
    Why? The kernel is a persistent Python interpreter that serves the Jupyter notebook. Every time you run a Jupyter cell, the notebook server passes the Python commands to the interpreter, receives any response sent back, and formats it for your web browser. Every Python object or function you define is managed in memory by this Python interpreter. That's important to keep in mind because it means that Python variables and functions will persist in kernel memory even if you erase them (intentionally or accidentally) from the notebook. This can cause your code to work correctly in your current session, but fail to work later when a "fresh" kernel is started. By restarting the kernel right now, you can check that your code is not making use of kernel objects that are no longer defined in the notebook. If, after the restart, something goes wrong, read the error messages that appear. They will tell you what is missing from your notebook.

  3. In the 'Cell' menu above, select 'Run All'. This will run all of your notebook's code, starting from the top, in your fresh new Python kernel. Check the output throughout the notebook, including in the tests below, to see if there are any errors. Also, make sure the results of the following tests look reasonable.

# graph parameters for the tests graph_params = ( (8,1,1), (5,0.5,29), (10,0.3,818), (20,0.7,8817) ) # run the tests for test_num, (n,p,seed) in enumerate(graph_params): print '*** Running Test {} ***'.format(test_num+1) print # create the graph G = nx.gnp_random_graph(n,p,seed) num_colors = color_graph(G) print 'num_colors = ', num_colors draw_coloring(G,num_colors) print print '*** Done. ***' print
*** Running Test 1 ***
--------------------------------------------------------------------------- NameError Traceback (most recent call last) <ipython-input-11-f049af473890> in <module>() 11 G = nx.gnp_random_graph(n,p,seed) 12 ---> 13 num_colors = color_graph(G) 14 print 'num_colors = ', num_colors 15 draw_coloring(G,num_colors) NameError: name 'color_graph' is not defined

Wilf's Theorem

We end by making a connection with the spectral properties of the adjacency matrix. Recall that if μ1\mu_1 is the largest eigenvalue of the adjacency matrix AA, then degavgμ1. {\rm deg}_{avg}\leq \mu_1. By modifying the basic algorithm slightly Wilf's Theorem shows that the largest eigenvalue bounds the chromatic number: χ(G)μ1. \chi(G)\leq \lceil \mu_1 \rceil. The notation μ1\lceil \mu_1 \rceil means "the ceiling function" of μ1\mu_1, namely the closest integer to μ1\mu_1 from above.

Here is the idea of the proof. Suppose we find a way of assigning the numbers {1,,N}\{1,\dots,N\} to the vertices, in such a way that for every x[N]x\in[N] ([N][N] stands for {1,,N}\{1,\dots,N\}) has at most κ\kappa neighbors with smaller label. Then we can run the basic algorithm with this ordering. Assign a color to node 1, than node 2, etc...then node jj has at most κ\kappa neighbors that have already been colored, so it is enough to have κ+1\kappa+1 colors.

The claim is that it is always possible to order the nodes as above with κ=μ1\kappa=\lfloor \mu_1 \rfloor ("the floor" of μ1\mu_1).

  • Since degavgμ1{\rm deg}_{avg}\leq \mu_1, there is at least one node x1Vx_1\in V such that deg(x1)μ1{\rm deg}(x_1)\le \mu_1.

  • Assign the label NN to x1x_1

  • Consider the subgraph induced by Vx1V\setminus{x_1} (remove x1x_1 from VV and remove all edges incident to x1x_1 from EE).

  • In the HW you are asked to prove the μ1\mu_1 also dominates the average degree of any induced subgraph of GG. So degavg(G(V{x1}))μ1. {\rm deg}_{avg}(G(V\setminus\{x_1\}))\le \mu_1.

  • In particular, one can pick a new node x2V{x1}x_2\in V\setminus\{x_1\} whose degree in G(V{x1})G(V\setminus\{x_1\}) is less or equal μ1\mu_1.

  • Assign the label N1N-1 to x2x_2, and repeat the process.

  • Note that the property of having at most μ1\mu_1 neighbors with lower label is preserved at each step. So this produces the ordering that we were looking for.

Hoffman's Bound

There is also a lower bound for the chromatic numeber due to Hoffman: χ(G)1μ1μN. \chi(G)\ge 1-\frac{\mu_1}{\mu_N}. The proof is more complicated and uses block partitioning of the adjacency matrix.

Note however, that when GG is bipartite, the lower bound is tight.

(Also note that we always have μN<0\mu_N<0 because Tr(A)=0{\rm Tr}(A)=0.)

Optional Challenge Exercise
Implement Wilf's algorithm by adapting the routines described in this notebook.