Jupyter notebook M725F17/05 - Graph Coloring/Graph Coloring.ipynb
Graph Coloring
created 2/8/2017, updated 2/12/2017
The graph coloring problem
Graph coloring is a famous type of problem in graph theory. The most basic problem of this type is the following. Suppose you are given a network with vertices and edges along with different colors. Is it possible to assign a color to each vertex in such a way that no two vertices connected by an edge have the same color. For example, a triangle (the complete graph on 3 vertices, ) can be colored with but not with . The smallest number of colors required to color a given graph is called its chromatic number, . So . Also, a graph is bipartite if and only if . The famous Four Color Theorem states that every planar graph can be colored with colors.
One obvious upper bound on the chromatic number of a general graph is , just color each vertex a unique color.
A slightly better bound is that is, the chromatic number of a graph is no larger than 1 plus the maximum degree of any vertex in . One way to prove this is via a simple algorithm.
Basic algorithm:
Whenever you must choose a color for a vertex that is still blank, simply pick a color that is different from any colors already assigned to the neighbors.
To begin, below we implement this basic algorithm. Read the code below line by line and make sure you understand what is happening. If you are unsure, don't hesitate to insert a new cell from the toolbar under the word JUPYTER above, and print out some of the objects that are being called.
It is easier to implement set where we need to check uniqness of elements and we need to chack a new element is in the set or not. It is more suitable for set operations, for example union, intersection etc.
A little better visualization
For some graphs, it can be a little tricky to see whether or not the coloring is correct. Here's another way to layout node positions to help with the visualization. With this layout, you only need to verify that there aren't any vertical edges.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-14-a4862ca21c66> in <module>()
34
35 # visualize the coloring
---> 36 draw_coloring(G,num_colors)
<ipython-input-14-a4862ca21c66> in draw_coloring(G, num_colors)
20
21 # determine the position from the color
---> 22 c = G.node[v]['color']
23 pos[v] = (c,counts[c])
24
KeyError: 'color'
Improving the Bound
It's not too hard to see that the maximum degree bound is overly pessimistic. Here's an example.
The graph above has chromatic number 3, which is much smaller than 11, the bound given by the maximum degree. To improve the estimate of , let's start by sorting the nodes according to degree. The first step is to create a list with entries (deg(a),a).
Now, let's sort from smallest degree to largest. (Note that this is why we chose the ordering (deg(v),v) rather than (v,deg(v)).) The sorting algorithm compares the first entries of each tuple, then the second, and so on.
If we now color the nodes in this order, we might expect to do better than the max-degree-plus-one bound. By starting with small-degree nodes first, we hope to minimize the number of color conflicts as we choose the color of later nodes. The following code shows the first few steps.
When you're finished, your code should be able to color the wheel with 3 colors.
Some tests
Your code will be compared with a correct solution to the following graphs. Here's how you can try to check your code yourself.
Save your notebook!
In 'Kernel' menu above, select 'Restart Kernel' and click the red 'Restart button'.
Why? The kernel is a persistent Python interpreter that serves the Jupyter notebook. Every time you run a Jupyter cell, the notebook server passes the Python commands to the interpreter, receives any response sent back, and formats it for your web browser. Every Python object or function you define is managed in memory by this Python interpreter. That's important to keep in mind because it means that Python variables and functions will persist in kernel memory even if you erase them (intentionally or accidentally) from the notebook. This can cause your code to work correctly in your current session, but fail to work later when a "fresh" kernel is started. By restarting the kernel right now, you can check that your code is not making use of kernel objects that are no longer defined in the notebook. If, after the restart, something goes wrong, read the error messages that appear. They will tell you what is missing from your notebook.In the 'Cell' menu above, select 'Run All'. This will run all of your notebook's code, starting from the top, in your fresh new Python kernel. Check the output throughout the notebook, including in the tests below, to see if there are any errors. Also, make sure the results of the following tests look reasonable.
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-11-f049af473890> in <module>()
11 G = nx.gnp_random_graph(n,p,seed)
12
---> 13 num_colors = color_graph(G)
14 print 'num_colors = ', num_colors
15 draw_coloring(G,num_colors)
NameError: name 'color_graph' is not defined
Wilf's Theorem
We end by making a connection with the spectral properties of the adjacency matrix. Recall that if is the largest eigenvalue of the adjacency matrix , then By modifying the basic algorithm slightly Wilf's Theorem shows that the largest eigenvalue bounds the chromatic number: The notation means "the ceiling function" of , namely the closest integer to from above.
Here is the idea of the proof. Suppose we find a way of assigning the numbers to the vertices, in such a way that for every ( stands for ) has at most neighbors with smaller label. Then we can run the basic algorithm with this ordering. Assign a color to node 1, than node 2, etc...then node has at most neighbors that have already been colored, so it is enough to have colors.
The claim is that it is always possible to order the nodes as above with ("the floor" of ).
Since , there is at least one node such that .
Assign the label to
Consider the subgraph induced by (remove from and remove all edges incident to from ).
In the HW you are asked to prove the also dominates the average degree of any induced subgraph of . So
In particular, one can pick a new node whose degree in is less or equal .
Assign the label to , and repeat the process.
Note that the property of having at most neighbors with lower label is preserved at each step. So this produces the ordering that we were looking for.
Hoffman's Bound
There is also a lower bound for the chromatic numeber due to Hoffman: The proof is more complicated and uses block partitioning of the adjacency matrix.
Note however, that when is bipartite, the lower bound is tight.
(Also note that we always have because .)