% LaTeX source for ``Think Complexity, 2nd edition''
% Copyright (c) 2016 Allen B. Downey.

% Permission is granted to copy, distribute, transmit and adapt
% this work under a Creative Commons
% Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
% https://creativecommons.org/licenses/by-nc-sa/4.0/

% If you are interested in distributing a commercial version of this
% work, please contact Allen Downey.

% The LaTeX source for this book is available from
% https://github.com/AllenDowney/ThinkComplexity2

% TODO: Fix PDFLATEX options.

\documentclass[12pt]{book}

\title{Think Complexity}
\author{Allen B. Downey}

\newcommand{\thetitle}{Think Complexity}
\newcommand{\thesubtitle}{Exploring Complexity Science in Python}
\newcommand{\theauthors}{Allen B. Downey}
\newcommand{\theversion}{2.6.2}

%%%% Both LATEX and PLASTEX

\usepackage{graphicx}
\usepackage{hevea}
\usepackage{makeidx}
\usepackage{setspace}

\makeindex

% automatically index glossary terms
\newcommand{\term}[1]{%
\item[#1:]\index{#1}}

\usepackage{amsmath}
\usepackage{amsthm}

% format end of chapter exercises
\newtheoremstyle{exercise}
{12pt}       % space above
{12pt}       % space below
{}           % body font
{}           % indent amount
{\bfseries}  % head font
{}           % punctuation
{12pt}       % head space
{}           % custom head
\theoremstyle{exercise}
\newtheorem{exercise}{Exercise}[chapter]

\usepackage{afterpage}

\newcommand\blankpage{%
\null
\thispagestyle{empty}%
\addtocounter{page}{-1}%
\newpage}

\newif\ifplastex
\plastexfalse

%%%% PLASTEX ONLY
\ifplastex

\usepackage{localdef}

\usepackage{url}

\newcount\anchorcnt
\newcommand*{\Anchor}[1]{%
\@bsphack%
\Hy@GlobalStepCount\anchorcnt%
\edef\@currentHref{anchor.\the\anchorcnt}%
\Hy@raisedlink{\hyper@anchorstart{\@currentHref}\hyper@anchorend}%
\M@gettitle{}\label{#1}%
\@esphack%
}

% code listing environments:
% we don't need these for plastex because they get replaced
% by
% preprocess.py
%\newenvironment{code}{\begin{code}}{\end{code}}
%\newenvironment{stdout}{\begin{code}}{\end{code}}

% inline syntax formatting
\newcommand{\py}{\verb}%}

%%%% LATEX ONLY
\else

\input{latexonly}

\newcommand{\myrightarrow}{\ensuremath \rightarrow}

\fi

%%%% END OF PREAMBLE
\begin{document}

\frontmatter

%%%% PLASTEX ONLY
\ifplastex

\maketitle

%%%% LATEX ONLY
\else

\begin{latexonly}

% half title --- --- --- --- --- --- --- --- --- --- ---

\thispagestyle{empty}

\begin{flushright}
\vspace*{2.0in}

\begin{spacing}{3}
{\huge \thetitle}
\end{spacing}

\vspace{0.25in}

Version \theversion

\vfill

\end{flushright}

% verso --- --- --- --- --- --- --- --- --- --- --- --- ---

\afterpage{\blankpage}

%\clearemptydoublepage
%\pagebreak
%\thispagestyle{empty}
%\vspace*{6in}

% title page --- --- --- --- --- --- --- --- --- --- --- ---

\pagebreak
\thispagestyle{empty}

\begin{flushright}
\vspace*{2.0in}

\begin{spacing}{3}
{\huge \thetitle}
\end{spacing}

\vspace{0.25in}

Version \theversion

\vspace{1in}

{\Large
\theauthors \\
}

\vspace{0.5in}

{\Large Green Tea Press}

{\small Needham, Massachusetts}

%\includegraphics[width=1in]{figs/logo1.eps}
\vfill

\end{flushright}

% copyright --- --- --- --- --- --- --- --- --- --- --- --- ---

\pagebreak
\thispagestyle{empty}

Copyright \copyright ~2016 \theauthors.

\vspace{0.2in}

\begin{flushleft}
Green Tea Press \\
9 Washburn Ave \\
Needham MA 02492
\end{flushleft}

Permission is granted to copy, distribute, transmit and adapt
this work under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License:
\url{https://thinkcomplex.com/license}.

If you are interested in distributing a commercial version of this
work, please contact the author.

The \LaTeX\
source for this book is available from

\begin{code}
https://github.com/AllenDowney/ThinkComplexity2
\end{code}

% table of contents --- --- --- --- --- --- --- --- ---

\cleardoublepage
\setcounter{tocdepth}{1}
\tableofcontents

\end{latexonly}


% HTML title page --- --- --- --- --- --- --- --- --- --- ---

\begin{htmlonly}

\vspace{1em}

{\Large \thetitle}

{\large \theauthors}

Version \theversion

\vspace{1em}

Copyright \copyright ~2016 \theauthors.

Permission is granted to copy, distribute, and/or modify this work
under the terms of the Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License, which is
available at \url{https://thinkcomplex.com/license}.

\vspace{1em}

\setcounter{chapter}{-1}

\end{htmlonly}

% END OF THE PART WE SKIP FOR PLASTEX
\fi

\chapter{Preface}
\label{preface}

Complexity science is an interdisciplinary
field --- at the intersection of mathematics, computer science and
natural science --- that focuses on {\bf complex systems},
which are systems with many interacting components.

One of the core tools of complexity science is discrete models,
including networks and graphs, cellular automatons, and agent-based
simulations.  These tools are useful in the natural and social sciences,
and sometimes in arts and humanities.

For an overview of complexity science, see
\url{https://thinkcomplex.com/complex}.

\index{complexity science}
\index{complex systems}

Why should you learn about complexity science?  Here are a few reasons:

\begin{itemize}

\item Complexity science is useful, especially for explaining why natural and social systems behave the way they do.  Since Newton, math-based physics has focused on systems with small numbers of components and simple interactions.
These models are effective for some applications, like celestial mechanics, and less useful for others, like economics.  Complexity science provides a diverse and adaptable modeling toolkit.

\item Many of the central results of complexity science are surprising; a recurring theme of this book is that simple models can produce complicated behavior, with the corollary that we can sometimes explain complicated behavior in the real world using simple models.

\item As I explain in Chapter~\ref{overview}, complexity science is at the center of a slow shift in the practice of science and a change in what we consider science to be.

\item Studying complexity science provides an opportunity to learn about diverse physical and social systems, to develop and apply programming skills, and to think about fundamental questions in the philosophy of science.

\end{itemize}

By reading this book and working on the exercises you will have a chance to explore topics and ideas you might not encounter otherwise, practice programming in Python, and learn more about data structures and algorithms.

Features of this book include:

\begin{description}

\item[Technical details] Most books about complexity science
are written for a popular audience.  They leave out
technical details, which is frustrating for people who can handle
them.
This book presents the code, the math, and the explanations
you need to understand how the models work.

\item[Further reading] Throughout the book, I include pointers to
further reading, including original papers (most of which are
available electronically) and related articles from Wikipedia and
other sources.

\item[Jupyter notebooks] For each chapter I provide a Jupyter notebook
that includes the code from the chapter, additional examples, and
animations that let you see the models in action.

\item[Exercises and solutions] At the end of each chapter I suggest
exercises you might want to work on, with solutions.

\end{description}

For most of the links in this book I use URL redirection.  This mechanism has the drawback of hiding the link destination, but it makes the URLs shorter and less obtrusive.  Also, and more importantly, it allows me to update the links without updating the book.  If you find a broken link, please let me know and I will change the redirection.


\section{Who is this book for?}

The examples and supporting code for this book are in Python.  You
should know core Python and be familiar with its object-oriented features,
specifically using and defining classes.

\index{Python}

If you are not already familiar with Python, you might want to start
with {\it Think Python}, which is appropriate for people who have never programmed before.  If you have programming experience in another language, there are many good Python books to choose from, as well as online resources.

I use NumPy, SciPy, and NetworkX throughout the book.  If you are familiar with these libraries already, that's great, but I will also explain them when they appear.

\index{NumPy}
\index{SciPy}
\index{NetworkX}

I assume that the reader knows some mathematics: I use logarithms in several places, and vectors in one example.
But that's about it.


\section{Changes from the first edition}

For the second edition, I added two chapters, one on evolution, the other on the evolution of cooperation.

In the first edition, each chapter presented background on a topic and suggested experiments the reader could run.  For the second edition, I have done those experiments.  Each chapter presents the implementation and results as a worked example, then suggests additional experiments for the reader.

For the second edition, I replaced some of my own code with standard libraries like NumPy and NetworkX.  The result is more concise and more efficient, and it gives readers a chance to learn these libraries.

Also, the Jupyter notebooks are new.  For every chapter there are two notebooks: one contains the code from the chapter, explanatory text, and exercises; the other contains solutions to the exercises.

\index{Jupyter}

Finally, all supporting software has been updated to Python 3 (but most of it runs unmodified in Python 2).


\section{Using the code}
\label{code}

All code used in this book is available from a Git repository on GitHub:
\url{https://thinkcomplex.com/repo}.
If you are not familiar with Git, it is a
version control system that allows you to keep track of the files that
make up a project.  A collection of files under Git's control is
called a ``repository''.  GitHub is a hosting service that provides
storage for Git repositories and a convenient web interface.

\index{repository}
\index{Git}
\index{GitHub}

The GitHub homepage for my repository provides several ways to
work with the code:

\begin{itemize}

\item You can create a copy of my repository by pressing the {\sf
Fork} button in the upper right.  If you don't already have a GitHub
account, you'll need to create one.
After forking, you'll have your
own repository on GitHub that you can use to keep track of code you
write while working on this book.  Then you can clone the repo,
which means that you copy the files to your computer.

\index{fork}

\item Or you can clone my repository without forking; that is, you can
make a copy of my repo on your computer.  You don't need a GitHub
account to do this, but you won't be able to write your changes back
to GitHub.

\index{clone}

\item If you don't want to use Git at all, you can download the files
in a Zip file using the green button that says ``Clone or download''.

\end{itemize}

I developed this book using Anaconda from Continuum Analytics, which
is a free Python distribution that includes all the packages you'll
need to run the code (and lots more).  I found Anaconda easy to
install.  By default it does a user-level installation, not
system-level, so you don't need administrative privileges.  And it
supports both Python 2 and Python 3.  You can download Anaconda from
\url{https://continuum.io/downloads}.

\index{Anaconda}

The repository includes both Python scripts and Jupyter
notebooks.
If you have not used Jupyter before, you can read about
it at \url{https://jupyter.org}.

\index{Jupyter}

There are three ways you can work with the Jupyter notebooks:

\begin{description}

\item[Run Jupyter on your computer]

If you installed Anaconda, you can install Jupyter by running the following command in a terminal or Command Window:

\begin{verbatim}
$ conda install jupyter
\end{verbatim}

Before you launch Jupyter, you should \py{cd} into the directory that contains the code:

\begin{verbatim}
$ cd ThinkComplexity2/code
\end{verbatim}

And then start the Jupyter server:

\begin{verbatim}
$ jupyter notebook
\end{verbatim}

When you start the server, it should launch your default web browser
or create a new tab in an open browser window.  Then you can open
and run the notebooks.

\item[Run Jupyter on Binder]

Binder is a service that runs Jupyter in a virtual machine.  If you
follow this link, \url{https://thinkcomplex.com/binder},
you should get a Jupyter home page with the notebooks for this book
and the supporting data and scripts.

\index{Binder}

You can run the scripts and modify them to run your own code, but the
virtual machine you run them in is temporary.
If you leave it idle, the virtual machine disappears along with any changes you made.

\item[View notebooks on GitHub]

GitHub provides a view of the notebooks you can
use to read the notebooks and see the results I
generated, but you won't be able to modify or run the code.

\end{description}

Good luck, and have fun!


\begin{flushleft}
Allen B. Downey \\

Professor of Computer Science \\

Olin College of Engineering \\

Needham, MA
\end{flushleft}


\section*{Contributor List}

\index{contributors}

If you have a suggestion or correction, please send email to
{\tt downey@allendowney.com}.
If I make a change based on your
feedback, I will add you to the contributor list
(unless you ask to be omitted).
\index{contributors}

Let me know what version of the book you are working with, and
what format.  If you include at least part of the sentence the
error appears in, that makes it easy for me to search.  Page and
section numbers are fine, too, but not quite as easy to work with.
Thanks!

\small

\begin{itemize}

\item John Harley, Jeff Stanton, Colden Rouleau and
Keerthik Omanakuttan are Computational Modeling students who
pointed out typos.

\item Jose Oscar Mur-Miranda found several typos.

\item Phillip Loh, Corey Dolphin, Noam Rubin and Julian Ceipek
found typos and made helpful suggestions.

%\item I am grateful to the program committee that read and selected
%the case studies included in this book:
%Sarah Spence Adams,
%John Geddes,
%Stephen Holt,
%Vincent Manno,
%Robert Martello,
%Amon Millner,
%Jos\'{e} Oscar Mur-Miranda,
%Mark Somerville, and
%Ursula Wolz.

\item Sebastian Sch\"{o}ner sent two pages of corrections!

\item Philipp Marek sent a number of corrections.

\item Jason Woodard co-taught Complexity Science with me at Olin College, introduced me to NK models, and made many helpful suggestions and corrections.

\item Davi Post sent several corrections and suggestions.

\item Graham Taylor sent a pull request on GitHub that fixed many typos.

% ENDCONTRIB

\end{itemize}

I would especially like to thank the technical reviewers, Vincent Knight and Eric Ma, who made many helpful suggestions, and the copy editor, Charles Roumeliotis, who caught many errors and inconsistencies.

Other people who reported errors include
Richard Hollands,
Muhammad Najmi bin Ahmad Zabidi,
Alex Hantman, and
Jonathan Harford.


\normalsize


\mainmatter

\chapter{Complexity Science}
\label{overview}

Complexity science is
relatively new; it became recognizable as a
field, and was given a name, in the 1980s.  But its newness is not because it
applies the tools of science to a new subject, but because it uses
different tools, allows different kinds of work, and ultimately
changes what we mean by ``science''.

\index{complexity science}

To demonstrate the difference, I'll start with an example of classical
science: suppose someone asks you why planetary orbits are
elliptical.  You might invoke Newton's law of universal
gravitation and use it to write a differential equation that describes
planetary motion.  Then you can solve the differential equation and
show that the solution is an ellipse.  QED!

\index{gravitation}
\index{planetary motion}

Most people find this kind of explanation satisfying.  It includes a
mathematical derivation --- so it has some of the rigor of a proof --- and
it explains a specific observation, elliptical orbits, by appealing to
a general principle, gravitation.

\index{proof}
\index{natural law}

Let me contrast that with a different kind of explanation.  Suppose
you move to a city like Detroit that is racially segregated, and you
want to know why it's like that.  If you do some research, you might
find a paper by Thomas Schelling called ``Dynamic Models of
Segregation'', which proposes a simple model of racial segregation:

\index{segregation}
\index{Detroit}

Here is my description of the model, from Chapter~\ref{agent-based}:

\begin{quote}
The Schelling model of the city is an array of cells where each cell
represents a house.  The houses are occupied by two kinds of
``agents'', labeled red and blue, in roughly equal numbers.  About
10\% of the houses are empty.

\index{Schelling, Thomas}
\index{agent}

At any point in time, an agent might be happy or unhappy, depending on
the other agents in the neighborhood.
In one version of the model,
agents are happy if they have at least two neighbors like themselves,
and unhappy if they have one or zero.

\index{agent-based model}

The simulation proceeds by choosing an agent at random and checking
to see whether it is happy.  If so, nothing happens; if not,
the agent chooses one of the unoccupied cells at
random and moves.
\end{quote}

If you start with a simulated city that is entirely unsegregated and
run the model for a short time, clusters of similar agents appear.  As
time passes, the clusters grow and coalesce until there are a small
number of large clusters and most agents live in homogeneous
neighborhoods.

\index{segregation}

The degree of segregation in the model is surprising, and it suggests
an explanation of segregation in real cities.  Maybe Detroit is
segregated because people prefer not to be greatly outnumbered and
will move if the composition of their neighborhoods makes them
unhappy.

\index{racism}
\index{xenophobia}

Is this explanation satisfying in the same way as the explanation of
planetary motion?  Many people would say not, but why?

Most obviously, the Schelling model is highly abstract, which is to
say not realistic.  So you might be tempted to say that people are more complicated than planets.  But that can't be right.  After all, some planets have people on them, so they have to be more complicated than people.

\index{abstract model}

Both systems are complicated, and both models are based on
simplifications.  For example, in the model of planetary motion we
include forces between the planet and its sun, and ignore interactions
between planets.
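To see just how little the Schelling model includes, here is a minimal sketch of the dynamics described in the quote above.  This is my own toy illustration, not the implementation the book develops in Chapter~\ref{agent-based}; to keep it short, it puts the agents on a hypothetical one-dimensional ring where each agent considers its four nearest cells.

```python
import random

def make_city(n=100, empty_frac=0.1, seed=1):
    """Return a ring of cells, each 'R', 'B', or None (empty)."""
    random.seed(seed)
    return [None if random.random() < empty_frac else random.choice('RB')
            for _ in range(n)]

def neighbors(cells, i):
    """The four nearest cells on the ring (two on each side)."""
    n = len(cells)
    return [cells[(i + d) % n] for d in (-2, -1, 1, 2)]

def is_happy(cells, i):
    """Happy means at least two of the nearest cells hold a like agent."""
    return neighbors(cells, i).count(cells[i]) >= 2

def step(cells):
    """Choose a random agent; if it is unhappy, move it to a random empty cell."""
    agents = [i for i, c in enumerate(cells) if c is not None]
    i = random.choice(agents)
    if not is_happy(cells, i):
        empties = [j for j, c in enumerate(cells) if c is None]
        if empties:
            j = random.choice(empties)
            cells[j], cells[i] = cells[i], None

def segregation(cells):
    """Mean fraction of occupied neighboring cells that match each agent."""
    fracs = []
    for i, c in enumerate(cells):
        if c is None:
            continue
        occupied = [x for x in neighbors(cells, i) if x is not None]
        if occupied:
            fracs.append(occupied.count(c) / len(occupied))
    return sum(fracs) / len(fracs)

city = make_city()
for _ in range(10000):
    step(city)
```

Running \py{step} many times typically raises the mean fraction of like neighbors, which is the clustering described above.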
In Schelling's model, we include individual decisions
based on local information, and ignore every other aspect of human behavior.

\index{simplification}

But there are differences of degree.
For planetary motion, we can defend
the model by showing that the forces we ignore are smaller than the
ones we include.  And we can extend the model to include other
interactions and show that the effect is small.  For Schelling's model
it is harder to justify the simplifications.

\index{justification}

Another difference is that Schelling's model doesn't appeal to any
physical laws, and it uses only simple computation, not mathematical
derivation.  Models like Schelling's don't look like classical
science, and many people find them less compelling, at least at first.
But as I will try to demonstrate, these models do useful work,
including prediction, explanation, and design.  One of the goals of
this book is to explain how.

\index{modeling}


\section{The changing criteria of science}

Complexity science is not just a different set of models; it is also a
gradual shift in the criteria models are judged by, and in the kinds
of models that are considered acceptable.

\index{complexity science}

For example, classical models tend to be law-based, expressed in the
form of equations, and solved by mathematical derivation.  Models that
fall under the umbrella of complexity are often rule-based,
expressed as computations, and simulated rather than analyzed.

Not everyone finds these models satisfactory.  For example, in
{\em Sync}, Steven Strogatz writes about his model of spontaneous
synchronization in some species of fireflies.
He presents a
simulation that demonstrates the phenomenon, but then writes:

\index{Strogatz, Steven}
\index{Sync@{\it Sync}}
\index{fireflies}
\index{synchronization}

\begin{quote}
I repeated the simulation dozens of times, for other random
initial conditions and for other numbers of oscillators.  Sync
every time.  [...] The challenge now was to prove it.  Only an
ironclad proof would demonstrate, in a way that no computer ever
could, that sync was inevitable; and the best kind of proof would
clarify {\em why} it was inevitable.
\end{quote}

Strogatz is a mathematician, so his enthusiasm for proofs is
understandable, but his proof doesn't address what is, to me, the most
interesting part of the phenomenon.  In order to prove that ``sync was
inevitable'', Strogatz makes several simplifying assumptions, in
particular that each firefly can see all the others.

\index{proof}

In my opinion, it is more interesting to explain how an entire valley
of fireflies can synchronize {\em despite the fact that they cannot
all see each other}.  How this kind of global behavior emerges from
local interactions is the subject of Chapter~\ref{agent-based}.
Explanations of these phenomena often use agent-based models, which
explore (in ways that would be difficult or impossible with
mathematical analysis) the conditions that allow or prevent
synchronization.

I am a computer scientist, so my enthusiasm for computational models
is probably no surprise.  I don't mean to say that Strogatz is wrong,
but rather that people have different opinions about what questions to
ask and what tools to use to answer them.
These opinions are based
on value judgments, so there is no reason to expect agreement.

\index{computational model}

Nevertheless, there is rough consensus among scientists
about which models are considered good science, and which others
are fringe science, pseudoscience, or not science at all.

\index{fringe science}
\index{pseudoscience}

A central thesis of this book is that the
criteria this consensus is based on change over time, and that
the emergence of complexity science reflects a gradual shift in
these criteria.


\section{The axes of scientific models}

I have described classical models as based on physical laws, expressed
in the form of equations, and solved by mathematical analysis;
conversely, models of complex systems are often based on simple
rules and implemented as computations.

\index{criteria for models}

We can think of this trend as a shift over time along two axes:

\begin{description}

\item[Equation-based \myrightarrow~simulation-based] \quad

\item[Analysis \myrightarrow~computation] \quad

\end{description}

Complexity science is different in several other ways.
I present them
here so you know what's coming, but some of them might not make sense
until you have seen the examples later in the book.

\begin{description}

\item[Continuous \myrightarrow~discrete] Classical models tend to be
based on continuous mathematics, like calculus; models of complex
systems are often based on discrete mathematics, including graphs and
cellular automatons.

\index{continuous}
\index{discrete}

\item[Linear \myrightarrow~nonlinear] Classical models are often
linear, or use linear approximations to nonlinear systems;
complexity science is more friendly to nonlinear models.

\index{linear}
\index{nonlinear}

\item[Deterministic \myrightarrow~stochastic] Classical models are
usually deterministic, which may reflect underlying philosophical
determinism, discussed in Chapter~\ref{automatons}; complex models
often include randomness.

\index{deterministic}
\index{stochastic}

\item[Abstract \myrightarrow~detailed] In classical models, planets are
point masses, planes are frictionless, and cows are
spherical (see \url{https://thinkcomplex.com/cow}).
Simplifications like these are often necessary for analysis,
but computational models can be more realistic.

\index{spherical cow}
\index{cow, spherical}

\item[One, two \myrightarrow~many] Classical models are often limited to
small numbers of components.  For example, in celestial mechanics the
two-body problem can be solved analytically; the three-body problem
cannot.
Complexity science often works with large numbers of components and larger numbers of interactions.

\index{one, two, many}

\item[Homogeneous \myrightarrow~heterogeneous] In classical models, the
components and interactions tend to be identical; complex models more often
include heterogeneity.

\index{homogeneous}
\index{composite}

\end{description}

These are generalizations, so we should not take them too seriously.
And I don't mean to deprecate classical science.  A more complicated
model is not necessarily better; in fact, it is usually worse.

And I don't mean to say that these changes are abrupt or complete.
Rather, there is a gradual migration in the frontier of what is
considered acceptable, respectable work.  Some tools that used to be
regarded with suspicion are now common, and some models that were
widely accepted are now subject to scrutiny.

For example, when Appel and Haken proved the four-color theorem in
1976, they used a computer to enumerate 1,936 special cases that were,
in some sense, lemmas of their proof.  At the time, many
mathematicians did not consider the theorem truly proved.  Now
computer-assisted proofs are common and generally (but not
universally) accepted.

\index{Appel, Kenneth}
\index{Haken, Wolfgang}
\index{four-color theorem}

Conversely, a substantial body of economic analysis is based on a
model of human behavior called ``Economic man'', or, with tongue in
cheek, {\it Homo economicus}.  Research based on this model was
highly regarded for several decades, especially if it involved
mathematical virtuosity.
More recently, this model has been treated with
skepticism, and models that include imperfect information and
bounded rationality are hot topics.

\index{economic man}
\index{Homo economicus}
\index{economics}


\section{Different models for different purposes}

Complex models are often appropriate for different purposes and
interpretations:

\index{complex model}

\begin{description}

\item[Predictive \myrightarrow~explanatory] Schelling's model
of segregation might shed light on a complex social phenomenon, but
it is not useful for prediction.  On the other hand, a simple model
of celestial mechanics can predict solar eclipses, down to the second,
years in the future.

\index{predictive model}
\index{explanatory model}

\item[Realism \myrightarrow~instrumentalism] Classical models lend
themselves to a realist interpretation; for example, most people
accept that electrons are real things that exist.  Instrumentalism
is the view that models can be useful even if the entities they
postulate don't exist.  George Box wrote what might be the motto of
instrumentalism: ``All models are wrong, but some are useful.''

\index{realism}
\index{instrumentalism}

\item[Reductionism \myrightarrow~holism] Reductionism is the view that
the behavior of a system can be explained by understanding its
components.  For example, the periodic table of the elements is a
triumph of reductionism, because it explains the chemical behavior
of elements with a model of electrons in atoms.
Holism
is the view that some phenomena that appear at the system level do
not exist at the level of components, and cannot be explained in
component-level terms.

\index{reductionism}
\index{holism}

\end{description}

We get back to explanatory models in Chapter~\ref{scale-free},
instrumentalism in Chapter~\ref{lifechap}, and holism in Chapter~\ref{soc}.


\section{Complexity engineering}

I have been talking about complex systems in the context of science,
but complexity is also a cause, and effect, of changes in engineering
and the design of social systems:

\index{engineering}

\begin{description}

\item[Centralized \myrightarrow~decentralized] Centralized systems are
conceptually simple and easier to analyze, but decentralized systems
can be more robust.  For example, in the World Wide Web clients send
requests to centralized servers; if the servers are down, the
service is unavailable.  In peer-to-peer networks, every node is
both a client and a server.  To take down the service, you have to
take down {\em every} node.

\index{centralized}
\index{decentralized}
\index{client-server architecture}
\index{peer-to-peer architecture}

\item[One-to-many \myrightarrow~many-to-many] In many communication
systems, broadcast services are being augmented, and sometimes
replaced, by services that allow users to communicate with each
other and create, share, and modify content.

\index{broadcast service}

\item[Top-down \myrightarrow~bottom-up] In social, political and
economic systems, many activities that would normally be centrally
organized now operate as grassroots movements.
Even armies, which
are the canonical example of hierarchical structure, are moving
toward devolved command and control.

\index{top-down}
\index{bottom-up}
\index{grassroots}

\item[Analysis \myrightarrow~computation] In classical engineering,
the space of feasible designs is limited by our capability for
analysis.  For example, designing the Eiffel Tower was possible
because Gustave Eiffel developed novel analytic techniques, in
particular for dealing with wind load.  Now tools for computer-aided
design and analysis make it possible to build almost anything that
can be imagined.  Frank Gehry's Guggenheim Museum Bilbao is my
favorite example.

\index{analysis}
\index{computation}
\index{Eiffel Tower}
\index{Eiffel, Gustave}
\index{Gehry, Frank}
\index{Guggenheim Museum Bilbao}

\item[Isolation \myrightarrow~interaction] In classical engineering,
the complexity of large systems is managed by isolating components
and minimizing interactions.  This is still an important engineering
principle; nevertheless, the availability of computation makes
it increasingly feasible to design systems with complex interactions
between components.

\index{isolation}
\index{interaction}

\item[Design \myrightarrow~search] Engineering is sometimes described
as a search for solutions in a landscape of possible designs.
Increasingly, the search process can be automated.  For example,
genetic algorithms explore large design spaces and discover
solutions human engineers would not imagine (or like).
The ultimate951genetic algorithm, evolution, notoriously generates designs that952violate the rules of human engineering.953954\index{design}955\index{search}956957\end{description}958959960\section{Complexity thinking}961962We are getting farther afield now, but the shifts I am postulating963in the criteria of scientific modeling are related to 20th century964developments in logic and epistemology.965966\index{logic}967\index{epistemology}968969\begin{description}970971\item[Aristotelian logic \myrightarrow~many-valued logic] In972traditional logic, any proposition is either true or false. This973system lends itself to math-like proofs, but fails (in dramatic974ways) for many real-world applications. Alternatives include975many-valued logic, fuzzy logic, and other systems designed to handle976indeterminacy, vagueness, and uncertainty. Bart977Kosko discusses some of these systems in {\em Fuzzy Thinking}.978979\index{Aristotelian logic}980\index{many-valued logic}981\index{Kosko, Bart}982\index{Fuzzy Thinking@{\it Fuzzy Thinking}}983\index{uncertainty}984985\item[Frequentist probability \myrightarrow~Bayesianism] Bayesian986probability has been around for centuries, but was not widely used987until recently, facilitated by the availability of cheap computation988and the reluctant acceptance of subjectivity989in probabilistic claims. Sharon Bertsch McGrayne presents this990history in {\em The Theory That Would Not Die}.991992\index{frequentist}993\index{Bayesian}994\index{McGrayne, Sharon Bertsch}995\index{Theory That Would Not Die, The@{\it The Theory That Would Not Die}}996997\item[Objective \myrightarrow~subjective] The Enlightenment, and998philosophic modernism, are based on belief in objective truth, that999is, truths that are independent of the people that hold them. 
20th1000century developments including quantum mechanics, G\"{o}del's1001Incompleteness Theorem, and Kuhn's study of the history of science1002called attention to seemingly unavoidable subjectivity in1003even ``hard sciences'' and mathematics. Rebecca Goldstein presents1004the historical context of G\"{o}del's proof in {\it Incompleteness}.10051006\index{objective}1007\index{subjective}1008\index{Kuhn, Thomas}1009\index{Godel's Incompleteness Theorem@G\"{o}del's Incompleteness Theorem}1010\index{incompleteness}1011\index{Goldstein, Rebecca}1012\index{Incompleteness@{\it Incompleteness}}10131014\item[Physical law \myrightarrow~theory \myrightarrow~model] Some1015people distinguish between laws, theories, and models. Calling1016something a ``law'' implies that it is objectively true and1017immutable; ``theory'' suggests that it is subject to revision; and1018``model'' concedes that it is a subjective choice based on1019simplifications and approximations.10201021\index{physical law}1022\index{theory}1023\index{model}10241025I think they are all the same thing. Some concepts that are called1026laws are really definitions; others are, in effect, the1027assertion that a certain model predicts or explains the behavior of a system1028particularly well. We come back to the nature of physical laws in1029Section~\ref{model1}, Section~\ref{model3} and Section~\ref{model2}.10301031%TODO: Check how these refs look in the O'Reilly version.10321033\item[Determinism \myrightarrow~indeterminism] Determinism is the view1034that all events are caused, inevitably, by prior events. Forms of1035indeterminism include randomness, probabilistic causation, and1036fundamental uncertainty. We come back to this1037topic in Section~\ref{determinism} and Section~\ref{freewill}10381039\index{determinism}1040\index{indeterminism}1041\index{free will}10421043\end{description}10441045These trends are not universal or complete, but the center of1046opinion is shifting along these axes. 
As evidence, consider the
reaction to Thomas Kuhn's {\em The Structure of Scientific
Revolutions}, which was reviled when it was published and is
now considered almost uncontroversial.

\index{Kuhn, Thomas}
\index{Structure of Scientific Revolutions@{\it The Structure of Scientific Revolutions}}

These trends are both cause and effect of complexity science.  For
example, highly abstracted models are more acceptable now because of
the diminished expectation that there should be a unique, correct model
for every system.  Conversely, developments in complex systems
challenge determinism and the related concept of physical law.

This chapter is an overview of the themes coming up in the book, but
not all of it will make sense before you see the examples.  When you
get to the end of the book, you might find it helpful to read this
chapter again.


\chapter{Graphs}
\label{graphs}

\newcommand{\Erdos}{Erd\H{o}s}
\newcommand{\Renyi}{R\'{e}nyi}

The next three chapters are about systems made up of components and
connections between components.  For example, in a social network, the
components are people and connections represent friendships, business
relationships, etc.  In an ecological food web, the components are
species and the connections represent predator-prey relationships.

In this chapter, I introduce NetworkX, a Python package for building
models of these systems.  We start with the \Erdos-\Renyi~model,
which has interesting mathematical properties.  In the next
chapter we move on to models that are more useful for explaining
real-world systems.

The code for this chapter is in {\tt chap02.ipynb} in the repository
for this book.
More information about working with the code is
in Section~\ref{code}.


\section{What is a graph?}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap02-1.pdf}}
\caption{A directed graph that represents a social network.}
\label{chap02-1}
\end{figure}

To most people a ``graph'' is a visual representation of data, like
a bar chart or a plot of stock prices over time.  That's not what this
chapter is about.

\index{graph}

In this chapter, a {\bf graph} is a representation of
a system that contains discrete, interconnected elements.  The
elements are represented by {\bf nodes} --- also called {\bf vertices} ---
and the interconnections are represented by {\bf edges}.

\index{node}
\index{edge}
\index{vertex}

For example, you could represent a road map with a node for each
city and an edge for each road between cities.  Or you could
represent a social network using a node for each person, with an
edge between two people if they are friends.

\index{road network}
\index{social network}

In some graphs, edges have attributes like length, cost, or weight.
For example, in a road map, the length of an edge might represent
distance between cities or travel time.  In a
social network there might be different kinds of edges to represent
different kinds of relationships: friends, business associates, etc.

\index{edge weight}
\index{weight}

Edges may be {\bf directed} or {\bf undirected}, depending on whether
the relationships they represent are asymmetric or symmetric.  In a
road map, you might represent a one-way street with a directed edge
and a two-way street with an undirected edge.  In some social
networks, like Facebook, friendship is symmetric: if $A$ is friends
with $B$ then $B$ is friends with $A$.
But on Twitter, for example,
the ``follows'' relationship is not symmetric; if $A$ follows $B$,
that doesn't imply that $B$ follows $A$.  So you might use undirected
edges to represent a Facebook network and directed edges for Twitter.

\index{directed graph}
\index{undirected graph}

Graphs have interesting mathematical properties, and
there is a branch of mathematics called {\bf graph theory}
that studies them.

\index{graph theory}

Graphs are also useful, because there are many real world
problems that can be solved using {\bf graph algorithms}.
For example, Dijkstra's shortest path algorithm is an efficient
way to find the shortest path from a node to all
other nodes in a graph.  A {\bf path} is a sequence of nodes
with an edge between each consecutive pair.

\index{graph algorithm}
\index{path}

Graphs are usually drawn with squares or circles for nodes and lines
for edges.  For example, the directed graph in Figure~\ref{chap02-1}
might represent three people who follow each other on Twitter.
The arrows indicate the direction of the relationships.
In this example, Alice and Bob follow each other, both follow
Chuck, and Chuck follows no one.

\index{representing graphs}

The undirected graph in Figure~\ref{chap02-2} shows four cities
in the northeast United States; the labels on the edges
indicate driving time in hours.
In this example the placement of the nodes corresponds
roughly to the geography of the cities, but in general the layout
of a graph is arbitrary.

\index{graph layout}


\section{NetworkX}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap02-2.pdf}}
\caption{An undirected graph that represents driving time between cities.}
\label{chap02-2}
\end{figure}

To represent graphs, we'll use a package called NetworkX,
which is the most commonly used network library in Python.
You can read more about it at \url{https://thinkcomplex.com/netx},
but I'll explain it as we go along.

\index{NetworkX}

We can create a directed graph by importing NetworkX (usually
imported as \py{nx}) and instantiating \py{nx.DiGraph}:

\begin{code}
import networkx as nx
G = nx.DiGraph()
\end{code}

At this point, \py{G} is a \py{DiGraph} object that contains no nodes
and no edges.
We can add nodes using the \py{add_node} method:

\begin{code}
G.add_node('Alice')
G.add_node('Bob')
G.add_node('Chuck')
\end{code}

Now we can use the \py{nodes} method to get a list of nodes:

\begin{code}
>>> list(G.nodes())
['Alice', 'Bob', 'Chuck']
\end{code}

The \py{nodes} method returns a \py{NodeView}, which can be used in a
for loop or, as in this example, used to make a list.

Adding edges works pretty much the same way:

\begin{code}
G.add_edge('Alice', 'Bob')
G.add_edge('Alice', 'Chuck')
G.add_edge('Bob', 'Alice')
G.add_edge('Bob', 'Chuck')
\end{code}

And we can use \py{edges} to get the list of edges:

\begin{code}
>>> list(G.edges())
[('Alice', 'Bob'), ('Alice', 'Chuck'),
 ('Bob', 'Alice'), ('Bob', 'Chuck')]
\end{code}

NetworkX provides several functions for drawing graphs;
\py{draw_circular} arranges the nodes in a circle and connects them
with edges:

\begin{code}
nx.draw_circular(G,
                 node_color=COLORS[0],
                 node_size=2000,
                 with_labels=True)
\end{code}

That's the code I use to generate Figure~\ref{chap02-1}.
The option \py{with_labels} causes the nodes to be labeled;
in the next example we'll see how to label the edges.

\index{Graph}
\index{node}
\index{edge}

To generate Figure~\ref{chap02-2}, I start with a dictionary
that maps from each city name to its approximate longitude
and latitude:

\begin{code}
positions = dict(Albany=(-74, 43),
                 Boston=(-71, 42),
                 NYC=(-74, 41),
                 Philly=(-75, 40))
\end{code}

Since this is an undirected graph, I instantiate \py{nx.Graph}:

\begin{code}
G = nx.Graph()
\end{code}

Then I can use \py{add_nodes_from} to iterate the keys of
\py{positions} and add them as nodes:

\begin{code}
G.add_nodes_from(positions)
\end{code}

Next I'll make a dictionary that maps from each edge to the
corresponding driving time:

\begin{code}
drive_times = {('Albany', 'Boston'): 3,
               ('Albany', 'NYC'): 4,
               ('Boston', 'NYC'): 4,
               ('NYC', 'Philly'): 2}
\end{code}

Now I can use \py{add_edges_from}, which iterates the keys of
\py{drive_times} and adds them as edges:

\begin{code}
G.add_edges_from(drive_times)
\end{code}

Instead of \py{draw_circular}, which arranges the nodes in
a circle, I'll use \py{draw}, which takes the position dictionary as
the second parameter:

\begin{code}
nx.draw(G, positions,
        node_color=COLORS[1],
        node_shape='s',
        node_size=2500,
        with_labels=True)
\end{code}

\py{draw} uses \py{positions} to determine the locations of the nodes.

To add the edge labels, we use \py{draw_networkx_edge_labels}:

\begin{code}
nx.draw_networkx_edge_labels(G, positions,
                             edge_labels=drive_times)
\end{code}

The \py{edge_labels} parameter expects a dictionary that maps from
each pair of nodes to a label; in this case, the labels are driving
times between cities.  And that's how I generated
Figure~\ref{chap02-2}.

In both of these examples, the nodes are strings, but in general they
can be any hashable type.

\index{hashable}


\section{Random graphs}
\label{randomgraphs}

A random graph is just what it sounds like: a graph with nodes and
edges generated at random.
Of course, there are many random processes that
can generate graphs, so there are many kinds of random graphs.

\index{random graph}

One of the more interesting kinds is the \Erdos-\Renyi~model, studied
by Paul \Erdos~and Alfr\'{e}d \Renyi~in the 1960s.

\index{Renyi, Alfred@\Renyi, Alfr\'{e}d}
\index{Erdos, Paul@\Erdos, Paul}

An \Erdos-\Renyi~graph (ER graph) is characterized by two parameters:
$n$ is the number of nodes and $p$ is the probability that there
is an edge between any two nodes.
See \url{https://thinkcomplex.com/er}.

\index{Erdos-Renyi model@\Erdos-\Renyi~model}

\Erdos~and \Renyi~studied the properties of these random graphs;
one of their surprising results is the existence of
abrupt changes in the properties of random graphs as
random edges are added.

\index{random edge}

One of the properties that displays this kind of transition is
connectivity.  An undirected graph is {\bf connected} if there is a
path from every node to every other node.

\index{connected graph}

In an ER graph, the probability that the graph is connected is very
low when $p$ is small and nearly 1 when $p$ is large.
Between these
two regimes, there is a rapid transition at a particular value of
$p$, denoted $p^*$.

%TODO: check these formulas in the O'Reilly version

\Erdos~and \Renyi~showed that this critical value is
$p^* = (\ln n) / n$, where $n$ is the number of nodes.
A random graph, $G(n, p)$, is unlikely to be connected
if $p < p^*$ and very likely to be connected if $p > p^*$.

\index{critical value}

To test this claim, we'll develop algorithms to generate random
graphs and check whether they are connected.


\section{Generating graphs}
\label{generating}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap02-3.pdf}}
\caption{A complete graph with 10 nodes.}
\label{chap02-3}
\end{figure}

I'll start by generating a {\bf complete} graph, which is a graph
where every node is connected to every other.

\index{complete graph}
\index{generator function}

Here's a generator function that takes a list of nodes and enumerates
all distinct pairs.
If you are not familiar with generator functions,
you can read about them at \url{https://thinkcomplex.com/gen}.

\begin{code}
def all_pairs(nodes):
    for i, u in enumerate(nodes):
        for j, v in enumerate(nodes):
            if i > j:
                yield u, v
\end{code}

We can use \py{all_pairs} to construct a complete graph:

\begin{code}
def make_complete_graph(n):
    G = nx.Graph()
    nodes = range(n)
    G.add_nodes_from(nodes)
    G.add_edges_from(all_pairs(nodes))
    return G
\end{code}

\py{make_complete_graph} takes the number of nodes, \py{n}, and
returns a new \py{Graph} with \py{n} nodes and edges between all
pairs of nodes.

The following code makes a complete graph with 10 nodes and draws it:

\begin{code}
complete = make_complete_graph(10)
nx.draw_circular(complete,
                 node_color=COLORS[2],
                 node_size=1000,
                 with_labels=True)
\end{code}

Figure~\ref{chap02-3} shows the result.
Soon we will modify this code to generate ER graphs, but first
we'll develop functions to check whether a graph is connected.


\section{Connected graphs}
\label{connected}

A graph is {\bf connected} if there is a path from every node to every
other node (see \url{https://thinkcomplex.com/conn}).
\index{connected graph}
\index{path}

For many applications involving graphs, it is useful to check whether
a graph is connected.  Fortunately, there is a simple algorithm that
does it.

You can start at any node and check whether you can reach all
other nodes.  If you can reach a node, $v$, you can reach any
of the {\bf neighbors} of $v$, which are the nodes connected to
$v$ by an edge.

\index{neighbor node}

The \py{Graph} class provides a method called \py{neighbors}
that returns the neighbors of a given node.
For
example, in the complete graph we generated in the previous section:

\begin{code}
>>> list(complete.neighbors(0))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
\end{code}

Suppose we start at node $s$.  We can mark $s$ as ``seen'' and mark
its neighbors.  Then we mark the neighbors' neighbors, and their
neighbors, and so on, until we can't reach any more nodes.  If all
nodes are seen, the graph is connected.

\index{reachable node}

Here's what that looks like in Python:

\begin{code}
def reachable_nodes(G, start):
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(G.neighbors(node))
    return seen
\end{code}

\py{reachable_nodes} takes a \py{Graph} and a starting node,
\py{start}, and returns the set of nodes that can be reached from
\py{start}.

\index{set}
\index{stack}

Initially the set, \py{seen}, is empty, and we create a
list called \py{stack} that keeps track of nodes we have
discovered but not yet processed.
Initially the stack contains
a single node, \py{start}.

\index{set}

Now, each time through the loop, we:

\begin{enumerate}

\item Remove one node from the stack.

\item If the node is already in \py{seen}, we go back
to Step 1.

\item Otherwise, we add the node to \py{seen} and add its
neighbors to the stack.

\end{enumerate}

When the stack is empty, we can't reach any more nodes, so we
break out of the loop and return \py{seen}.

\index{stack}

As an example, we can find all nodes in the complete graph that
are reachable from node 0:

\begin{code}
>>> reachable_nodes(complete, 0)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
\end{code}

Initially, the stack contains node 0 and \py{seen} is empty.
The first time through the loop, node 0 is added to \py{seen}
and all the other nodes are added to the stack (since they are all
neighbors of node 0).

The next time through the loop, \py{pop} returns the last element
in the stack, which is node 9.  So node 9 gets added to \py{seen}
and its neighbors get added to the stack.

Notice that the same node can appear more than once in the stack;
in fact, a node with $k$ neighbors will be added to the stack
$k$ times.  Later, we will look for ways to make this algorithm
more efficient.

We can use \py{reachable_nodes} to write \py{is_connected}:

\begin{code}
def is_connected(G):
    start = next(iter(G))
    reachable = reachable_nodes(G, start)
    return len(reachable) == len(G)
\end{code}

\py{is_connected} chooses a starting node by making a node iterator
and choosing the first element.
Then it uses \py{reachable_nodes} to get the set of nodes that can be
reached from \py{start}.
If the size of this set is the same as the size
of the graph, that means we can reach all nodes, which means the
graph is connected.

A complete graph is, not surprisingly, connected:

\begin{code}
>>> is_connected(complete)
True
\end{code}

In the next section we will generate ER graphs and check whether they
are connected.


\section{Generating ER graphs}
\label{flip}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap02-4.pdf}}
\caption{An ER graph with \py{n=10} and \py{p=0.3}.}
\label{chap02-4}
\end{figure}

The ER graph $G(n, p)$ contains $n$ nodes, and each pair of nodes is
connected by an edge with probability $p$.  Generating an ER graph is
similar to generating a complete graph.

\index{flip}
\index{generator function}

The following generator function enumerates all possible edges and
chooses which ones should be added to the graph:

\begin{code}
def random_pairs(nodes, p):
    for edge in all_pairs(nodes):
        if flip(p):
            yield edge
\end{code}

\py{random_pairs} uses \py{flip}:

\begin{code}
def flip(p):
    return np.random.random() < p
\end{code}

This is the first example we've seen that uses NumPy.  Following
convention, I import \py{numpy} as \py{np}.
NumPy provides a module named \py{random}, which provides a function
named \py{random}, which returns a number between 0 and 1, uniformly
distributed.

So \py{flip} returns \py{True} with the
given probability, \py{p}, and \py{False} with the complementary
probability, \py{1-p}.

\index{NumPy}

Finally, \py{make_random_graph} generates and returns the ER graph
$G(n, p)$:

\begin{code}
def make_random_graph(n, p):
    G = nx.Graph()
    nodes = range(n)
    G.add_nodes_from(nodes)
    G.add_edges_from(random_pairs(nodes, p))
    return G
\end{code}

\py{make_random_graph} is almost identical to
\py{make_complete_graph}; the only difference is that it uses
\py{random_pairs} instead of \py{all_pairs}.

Here's an example with \py{p=0.3}:

\begin{code}
random_graph = make_random_graph(10, 0.3)
\end{code}

Figure~\ref{chap02-4} shows the result.  This graph turns out to be
connected; in fact, most ER graphs with $n=10$ and
$p=0.3$ are connected.  In the next section, we'll see how many.


\section{Probability of connectivity}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap02-5.pdf}}
\caption{Probability of connectivity with $n=10$ and a range of $p$.
The vertical line shows the predicted critical value.}
\label{chap02-5}
\end{figure}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap02-6.pdf}}
\caption{Probability of connectivity for several values of $n$ and a range of $p$.}
\label{chap02-6}
\end{figure}

For given values of $n$ and $p$, we would like to know the probability
that $G(n, p)$ is connected.
We can estimate it by generating
a large number of random graphs and counting how many are connected.
Here's how:

\begin{code}
def prob_connected(n, p, iters=100):
    tf = [is_connected(make_random_graph(n, p))
          for i in range(iters)]
    return np.mean(tf)
\end{code}

The parameters \py{n} and \py{p} are passed along to
\py{make_random_graph}; \py{iters} is the number of random graphs we
generate.

This function uses a list comprehension; if you are not familiar with
this feature, you can read about it at
\url{https://thinkcomplex.com/comp}.

\index{list comprehension}

The result, \py{tf}, is a list of boolean values: \py{True} for each
graph that's connected and \py{False} for each one that's not.

\index{boolean}

\py{np.mean} is a NumPy function that computes the mean of this list,
treating \py{True} as 1 and \py{False} as 0.  The result is the
fraction of random graphs that are connected.

\index{NumPy}
\index{mean}
\index{probability of connectivity}

\begin{code}
>>> prob_connected(10, 0.23, iters=10000)
0.33
\end{code}

I chose $0.23$ because it is close to the critical value where the
probability of connectivity goes from near 0 to near 1.
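For $n=10$, we can compute this critical value directly from the
formula $p^* = (\ln n)/n$ introduced earlier in the chapter.  Here is
a quick sketch (mine, not code from the book):

```python
import numpy as np

# Critical value for ER connectivity: p* = ln(n) / n
n = 10
p_star = np.log(n) / n
print(round(p_star, 2))  # 0.23
```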
According to \Erdos~and \Renyi, $p^* = \ln n / n = 0.23$.

\index{critical value}

We can get a clearer view of the transition by estimating the
probability of connectivity for a range of values of $p$:

\begin{code}
n = 10
ps = np.logspace(-2.5, 0, 11)
ys = [prob_connected(n, p) for p in ps]
\end{code}

The NumPy function \py{logspace} returns an
array of 11 values from $10^{-2.5}$ to $10^0 = 1$, equally spaced
on a logarithmic scale.

\index{NumPy}
\index{logarithm}
\index{logspace}

For each value of \py{p} in the array, we compute the probability
that a graph with parameter \py{p} is connected and store the
results in \py{ys}.

Figure~\ref{chap02-5} shows the results, with
a vertical line at the computed critical value, $p^* = 0.23$.
As expected, the transition from 0 to 1
occurs near the critical value.

Figure~\ref{chap02-6} shows
similar results for larger values of $n$.  As $n$ increases, the
critical value gets smaller and the transition gets more abrupt.

These experimental results are consistent with the analytic results
\Erdos~and \Renyi~presented in their papers.


\section{Analysis of graph algorithms}
\label{graphanalysis}

Earlier in this chapter I presented an algorithm for checking whether
a graph is connected; in the next few chapters, we will see
other graph algorithms.
Along the way, we will analyze the
performance of those algorithms, figuring out how their run times
grow as the size of the graphs increases.

\index{analysis of algorithms}

If you are not already familiar with analysis of algorithms,
you might want to read Appendix B of {\it Think Python, 2nd Edition},
at \url{https://thinkcomplex.com/tp2}.

\newcommand{\V}{n}
\newcommand{\E}{m}

The order of growth for graph algorithms is usually expressed
as a function of $\V$, the number of vertices (nodes), and $\E$, the
number of edges.

\index{graph algorithm}

As an example, let's analyze \py{reachable_nodes} from
Section~\ref{connected}:

\begin{code}
def reachable_nodes(G, start):
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(G.neighbors(node))
    return seen
\end{code}

Each time through the loop, we pop a node off the stack; by default,
\py{pop} removes and returns the last element of a list, which is
a constant time operation.

\index{constant time}

Next we check whether the node is in \py{seen}, which is a set,
so checking membership is constant time.

If the node is not already in \py{seen}, we add it, which is
constant time, and then add the neighbors to the stack, which is
linear in the number of neighbors.

\index{linear time}

To express the run time in terms of $\V$ and $\E$, we can add up
the total number of times each node is added to \py{seen}
and \py{stack}.

Each node is only added to \py{seen} once, so the total number
of additions is $\V$.

But nodes might be added to \py{stack} many times, depending on
how many neighbors they have.  If a node has $k$ neighbors, it
is added to \py{stack} $k$ times.
Of course, if it has $k$ neighbors,
that means it is connected to $k$ edges.

So the total number of additions to \py{stack} is twice the number of
edges, $2\E$, because each edge contributes one addition at each of
its two endpoints.

\index{order of growth}

Therefore, the order of growth for this function is $O(\V + \E)$,
which is a convenient way to say that the run time grows in proportion
to either $\V$ or $\E$, whichever is bigger.

\index{breadth-first search}
\index{BFS}

If we know the relationship between $\V$ and $\E$, we can simplify
this expression.  For example, in a complete graph the number of edges
is $n(n-1)/2$, which is in $O(\V^2)$.  So for a complete graph,
\py{reachable_nodes} is quadratic in $\V$.
\index{quadratic}


\section{Exercises}

The code for this chapter is in \py{chap02.ipynb}, which is a
Jupyter notebook in the repository for this book.  For more
information about working with this code, see Section~\ref{code}.

\begin{exercise}
Launch \py{chap02.ipynb} and run the code.  There are a few short
exercises embedded in the notebook that you might want to try.
\end{exercise}

\begin{exercise}
In Section~\ref{graphanalysis} we analyzed the performance of
\py{reachable_nodes} and classified it in $O(n + m)$, where $n$ is the
number of nodes and $m$ is the number of edges.  Continuing the
analysis, what is the order of growth for \py{is_connected}?

\begin{code}
def is_connected(G):
    start = list(G)[0]
    reachable = reachable_nodes(G, start)
    return len(reachable) == len(G)
\end{code}

\end{exercise}

\begin{exercise}
In my implementation of \py{reachable_nodes}, you might be bothered by
the apparent inefficiency of adding {\em all} neighbors to the stack
without checking whether they are already in \py{seen}.  Write a
version of this function that checks the neighbors before adding them
to the stack.
Does this ``optimization'' change the order of growth?
Does it make the function faster?
\end{exercise}


\begin{exercise}

There are actually two kinds of ER graphs.  The one we generated in
this chapter, $G(n, p)$, is characterized by two parameters, the
number of nodes and the probability of an edge between nodes.

\index{Erdos-Renyi model@\Erdos-\Renyi~model}

An alternative definition, denoted $G(n, m)$, is also characterized by
two parameters: the number of nodes, $n$, and the number of edges,
$m$.  Under this definition, the number of edges is fixed, but their
location is random.

Repeat the experiments we did in this chapter using this alternative
definition.  Here are a few suggestions for how to proceed:

\begin{enumerate}

\item Write a function called \py{m_pairs} that takes a list of nodes
and the number of edges, $m$, and returns a random selection of $m$
edges.  A simple way to do that is to generate a list of all possible
edges and use \py{random.sample}.

\item Write a function called \py{make_m_graph} that takes $n$ and
$m$ and returns a random graph with $n$ nodes and $m$ edges.

\item Make a version of \py{prob_connected} that uses
\py{make_m_graph} instead of \py{make_random_graph}.

\item Compute the probability of connectivity for a range of values
of $m$.

\end{enumerate}

How do the results of this experiment compare to the results using the
first type of ER graph?

\end{exercise}


\chapter{Small World Graphs}

Many networks in the real world, including social networks, have
the ``small world property'', which is that the average distance
between nodes, measured in number of edges on the shortest path,
is much smaller than expected.

In this chapter, I present Stanley Milgram's famous Small World
Experiment, which was the first demonstration of
the small world property in a real social network.
Then we'll1879consider Watts-Strogatz graphs, which are intended as a model of1880small world graphs. I'll replicate the experiment Watts and Strogatz performed and explain what it is intended to show.18811882Along the way, we'll see two new graph algorithms: breadth-first1883search (BFS) and Dijkstra's algorithm for computing the shortest1884path between nodes in a graph.18851886The code for this chapter is in {\tt chap03.ipynb} in the repository1887for this book. More information about working with the code is1888in Section~\ref{code}.18891890\section{Stanley Milgram}18911892Stanley Milgram was an American social psychologist who conducted1893two of the most famous experiments in social science, the1894Milgram experiment, which studied people's obedience to authority1895(\url{https://thinkcomplex.com/milgram})1896and the Small World Experiment, which studied1897the structure of social networks1898(\url{https://thinkcomplex.com/small}).18991900\index{Milgram, Stanley}1901\index{small world experiment}19021903In the Small World Experiment, Milgram sent a package to several1904randomly-chosen people in Wichita, Kansas, with instructions asking1905them to forward an enclosed letter to a target person, identified by1906name and occupation, in Sharon, Massachusetts (which happens to be the town near1907Boston where I grew up). The subjects were told that they could mail1908the letter directly to the target person only if they knew him1909personally; otherwise they were instructed to send it, and the same1910instructions, to a relative or friend they thought would be more1911likely to know the target person.19121913\index{Kansas}1914\index{Wichita, Kansas}1915\index{Massachusetts}1916\index{Sharon, Massachusetts}19171918Many of the letters were never delivered, but for the ones that1919were the average path length --- the number of1920times the letters were forwarded --- was about six. 
This result
was taken to confirm previous observations (and speculations) that
the typical distance between any two people in a social network
is about ``six degrees of separation''.

\index{six degrees}

This conclusion is surprising because most people expect social
networks to be localized --- people tend to live near their
friends --- and in a graph with local connections, path lengths tend to
increase in proportion to geographical distance. For example, most of
my friends live nearby, so I would guess that the average distance
between nodes in a social network is about 50 miles. Wichita is about
1600 miles from Boston, so if Milgram's letters traversed typical
links in the social network, they should have taken 32 hops, not 6.

\index{hop}
\index{social network}
\index{local connection}


\section{Watts and Strogatz}
\label{watts}

In 1998 Duncan Watts and Steven Strogatz published a paper in {\em
Nature}, ``Collective dynamics of `small-world' networks'', that
proposed an explanation for the small world phenomenon. You can
download it from
\url{https://thinkcomplex.com/watts}.

\index{Watts, Duncan}
\index{Strogatz, Steven}
\index{small world network}

Watts and Strogatz start with two kinds of graph that were well
understood: random graphs and regular graphs. In a random graph, nodes
are connected at random. In a regular graph, every node has the
same number of neighbors.
They consider two
properties of these graphs, clustering and path length:

\index{random graph}
\index{regular graph}
\index{clustering}
\index{path length}

\begin{itemize}

\item Clustering is a measure of the ``cliquishness'' of the graph.
In a graph, a {\bf clique} is a subset of nodes that are
all connected to each other; in a social network, a clique is
a set of people who are all friends with each other.
Watts and Strogatz
defined a clustering coefficient that quantifies the likelihood
that two nodes that are connected to the same node are also
connected to each other.

\index{clique}

\item Path length is a measure of the average distance between
two nodes, which corresponds to the degrees of separation in
a social network.

\end{itemize}

Watts and Strogatz show that regular graphs
have high clustering and high path lengths, whereas
random graphs of the same size usually have low clustering
and low path lengths. So neither of these is a good model of
social networks, which combine high clustering with
short path lengths.

Their goal was to create a {\bf generative model} of a social
network. A generative model tries to explain a phenomenon by
modeling the process that builds or leads to the phenomenon.
Watts and Strogatz proposed this process for building
small-world graphs:
\index{generative model}

\begin{enumerate}

\item Start with a regular graph with $n$ nodes and each node
connected to $k$ neighbors.

\item Choose a subset of the edges and ``rewire'' them by
replacing them with random edges.
\index{rewire}

\end{enumerate}

The probability that an edge is rewired is a parameter, $p$,
that controls how random the graph is.
With $p=0$, the graph
is regular; with $p=1$ it is completely random.
\index{parameter}

Watts and Strogatz found that small values of $p$ yield graphs
with high clustering, like a regular graph, and low path
lengths, like a random graph.

In this chapter I replicate the Watts and Strogatz experiment
in the following steps:

\begin{enumerate}

\item We'll start by constructing a ring lattice, which is a kind
of regular graph.

\item Then we'll rewire it as Watts and Strogatz did.

\item We'll write a function to measure the degree of clustering
and use a NetworkX function to compute path lengths.

\item Then we'll compute the degree of clustering and path length for
a range of values of $p$.

\item Finally, I'll present Dijkstra's algorithm, which computes shortest paths efficiently.

\end{enumerate}


\section{Ring lattice}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap03-1.pdf}}
\caption{A ring lattice with $n=10$ and $k=4$.}
\label{chap03-1}
\end{figure}

A {\bf regular} graph is a graph where each node has the same number
of neighbors; the number of neighbors is also called the {\bf degree}
of the node.

\index{regular graph}
\index{ring lattice}

A ring lattice is a kind of regular graph, which Watts and Strogatz
use as the basis of their model.
In a ring lattice with $n$ nodes, the nodes can be arranged in a circle
with each node connected to the $k$ nearest neighbors.

For example, a ring lattice with $n=3$
and $k=2$ would contain the following edges: $(0, 1)$, $(1, 2)$, and
$(2, 0)$.
Notice that the edges ``wrap around'' from the
highest-numbered node back to 0.

More generally, we can enumerate the edges like this:

\begin{code}
def adjacent_edges(nodes, halfk):
    n = len(nodes)
    for i, u in enumerate(nodes):
        for j in range(i+1, i+halfk+1):
            v = nodes[j % n]
            yield u, v
\end{code}

\py{adjacent_edges} takes a list of nodes and a parameter,
\py{halfk}, which is half of $k$. It is a generator function that
yields one edge at a time. It uses the modulus operator, \verb"%",
to wrap around from the highest-numbered node to the lowest.

\index{generator function}

We can test it like this:

\begin{code}
>>> nodes = range(3)
>>> for edge in adjacent_edges(nodes, 1):
...     print(edge)
(0, 1)
(1, 2)
(2, 0)
\end{code}

Now we can use \py{adjacent_edges} to make a ring lattice:

\begin{code}
def make_ring_lattice(n, k):
    G = nx.Graph()
    nodes = range(n)
    G.add_nodes_from(nodes)
    G.add_edges_from(adjacent_edges(nodes, k//2))
    return G
\end{code}

Notice that \py{make_ring_lattice} uses floor division to compute
\py{halfk}, so it is only correct if \py{k} is even. If
\py{k} is odd, floor division rounds down, so the result is
a ring lattice with degree \py{k-1}.
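When \py{k} is even, we can check that every node really ends up with
degree $k$ by replaying the loop in \py{adjacent_edges} and counting
how often each node appears as an endpoint. Here is a quick check with
$n=10$ and $k=4$ (a standalone sketch; \py{Counter} is from the
standard library):

\begin{code}
from collections import Counter

n, halfk = 10, 2           # n=10, k=4
degrees = Counter()
for i in range(n):         # same loop structure as adjacent_edges
    for j in range(i+1, i+halfk+1):
        degrees[i] += 1
        degrees[j % n] += 1

# every one of the 10 nodes ends up with degree 4
\end{code}

Each node appears twice as the first endpoint and twice as the second,
for a total degree of 4.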
As one of the exercises at the
end of the chapter, you will generate regular graphs with odd
values of \py{k}.

\index{floor division}
\index{degree}

We can test \py{make_ring_lattice} like this:

\begin{code}
lattice = make_ring_lattice(10, 4)
\end{code}

Figure~\ref{chap03-1} shows the result.


\section{WS graphs}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap03-2.pdf}}
\caption{WS graphs with $n=20$, $k=4$, and $p=0$ (left), $p=0.2$ (middle),
and $p=1$ (right).}
\label{chap03-2}
\end{figure}

To make a Watts-Strogatz (WS) graph, we start with a ring lattice and
``rewire'' some of the edges. In their paper, Watts and Strogatz
consider the edges in a particular order and rewire each one with
probability $p$. If an edge is rewired, they leave the first node
unchanged and choose the second node at random. They don't allow self
loops or multiple edges; that is, you can't have an edge from a node to
itself, and you can't have more than one edge between the same two
nodes.

\index{Watts-Strogatz graph}
\index{rewire}

Here is my implementation of this process.

\begin{code}
def rewire(G, p):
    nodes = set(G)
    for u, v in G.edges():
        if flip(p):
            choices = nodes - {u} - set(G[u])
            new_v = np.random.choice(list(choices))
            G.remove_edge(u, v)
            G.add_edge(u, new_v)
\end{code}

The parameter \py{p} is the probability of rewiring an edge.
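Given \py{rewire}, building a WS graph takes just two steps: make a
ring lattice, then rewire it. Here is a sketch of \py{make_ws_graph},
the function the experiment later in this chapter uses:

\begin{code}
def make_ws_graph(n, k, p):
    """Makes a Watts-Strogatz graph: a ring lattice with n nodes
    and degree k, with each edge rewired with probability p."""
    ws = make_ring_lattice(n, k)
    rewire(ws, p)
    return ws
\end{code}

Because each rewiring removes one edge and adds one, the total number
of edges, $nk/2$, is unchanged. Now let's look more closely at how
\py{rewire} chooses the replacement edges.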
The
\py{for} loop enumerates the edges and uses \py{flip} (defined in Section~\ref{flip}) to choose which ones get rewired.

If we are rewiring an edge from node \py{u} to node \py{v}, we
have to choose a replacement for \py{v}, called \py{new_v}.

\begin{enumerate}

\item To compute the possible choices,
we start with \py{nodes}, which is a set,
and subtract off \py{u} and its neighbors, which avoids self
loops and multiple edges.

\item To choose \py{new_v}, we use the NumPy
function \py{choice}, which is in the module \py{random}.

\index{NumPy}
\index{random}
\index{choice}

\item Then we remove the existing edge from \py{u} to \py{v}, and

\item Add a new edge from \py{u} to \py{new_v}.

\end{enumerate}

As an aside, the expression \py{G[u]} returns a dictionary that contains the neighbors of \py{u} as keys. It is usually faster than using \py{G.neighbors}
(see \url{https://thinkcomplex.com/neigh}).

This function does not consider the edges in the order specified
by Watts and Strogatz, but that doesn't seem to affect
the results.

Figure~\ref{chap03-2} shows WS graphs with $n=20$, $k=4$, and
a range of values of $p$. When $p=0$, the graph is a ring lattice.
When $p=1$, it is completely random. As we'll see, the interesting
things happen in between.


\section{Clustering}
\label{clustering}

The next step is to compute the clustering coefficient, which
quantifies the tendency for the nodes to form cliques.
A {\bf clique} is a set of nodes that are completely connected;
that is, there are edges between all pairs of nodes in the set.

\index{clustering coefficient}
\index{network average clustering coefficient}

Suppose a particular node, $u$, has $k$ neighbors. If all of the
neighbors were connected to each other, there would be $k(k-1)/2$
edges among them.
The fraction of those edges that actually exist
is the local clustering coefficient for $u$, denoted $C_u$.

If we compute the average of $C_u$ over all nodes, we get the
``network average clustering coefficient'', denoted $\bar{C}$.

Here is a function that computes it.

\begin{code}
def node_clustering(G, u):
    neighbors = G[u]
    k = len(neighbors)
    if k < 2:
        return np.nan

    possible = k * (k-1) / 2
    exist = 0
    for v, w in all_pairs(neighbors):
        if G.has_edge(v, w):
            exist += 1
    return exist / possible
\end{code}

Again I use \py{G[u]}, which returns a dictionary with the neighbors of \py{u} as keys.

If a node has fewer than 2 neighbors, the clustering coefficient is undefined, so we return \py{np.nan}, which is a special value that indicates ``Not a Number''.

\index{NaN}
\index{Not a Number}

Otherwise we compute the number of possible edges among the neighbors, count the number of those edges that actually exist, and return the fraction that exist.

We can test the function like this:

\begin{code}
>>> lattice = make_ring_lattice(10, 4)
>>> node_clustering(lattice, 1)
0.5
\end{code}

In a ring lattice with $k=4$, the clustering coefficient for each node
is 0.5 (if you are not convinced, take another look at
Figure~\ref{chap03-1}).

Now we can compute the network average clustering coefficient like this:

\begin{code}
def clustering_coefficient(G):
    cu = [node_clustering(G, node) for node in G]
    return np.nanmean(cu)
\end{code}

The NumPy function \py{nanmean} computes the mean of the local clustering coefficients, ignoring any values that are \py{NaN}.

\index{NumPy}
\index{nanmean}

We can test \py{clustering_coefficient} like this:

\begin{code}
>>> clustering_coefficient(lattice)
0.5
\end{code}

In this graph, the local clustering coefficient for all nodes is 0.5,
so the average
across nodes is 0.5. Of course, we expect this value
to be different for WS graphs.


\section{Shortest path lengths}
\label{pathlength}

The next step is to compute the characteristic path length, $L$, which
is the average length of the shortest path between each pair of nodes.
To compute it, I'll start with a function provided by NetworkX,
\py{shortest_path_length}. I'll use it to replicate the Watts and
Strogatz experiment, then I'll explain how it works.

\index{characteristic path length}
\index{path length}
\index{shortest path}

Here's a function that takes a graph and returns a list of shortest
path lengths, one for each pair of nodes.

\begin{code}
def path_lengths(G):
    length_map = nx.shortest_path_length(G)
    lengths = [length_map[u][v] for u, v in all_pairs(G)]
    return lengths
\end{code}

The return value from \py{nx.shortest_path_length} is a dictionary
of dictionaries. The outer dictionary maps from each node, \py{u},
to a dictionary that maps from each node, \py{v}, to the length of
the shortest path from \py{u} to \py{v}.

With the list of lengths from \py{path_lengths}, we can compute $L$
like this:

\begin{code}
def characteristic_path_length(G):
    return np.mean(path_lengths(G))
\end{code}

And we can test it with a small ring lattice:

\begin{code}
>>> lattice = make_ring_lattice(3, 2)
>>> characteristic_path_length(lattice)
1.0
\end{code}

In this example, all 3 nodes are connected to each other, so the
mean path length is 1.


\section{The WS experiment}

\begin{figure}
\centerline{\includegraphics[width=3.5in]{figs/chap03-3.pdf}}
\caption{Clustering coefficient (C) and characteristic path length (L) for
WS graphs with $n=1000$, $k=10$, and a range of $p$.}
\label{chap03-3}
\end{figure}

Now we are ready to replicate the WS experiment, which shows that for
a range of
values of $p$, a WS graph has high clustering like a
regular graph and short path lengths like a random graph.

I'll start with \py{run_one_graph}, which takes \py{n}, \py{k}, and \py{p}; it
generates a WS graph with the given parameters and computes the
mean path length, \py{mpl}, and clustering coefficient, \py{cc}:

\begin{code}
def run_one_graph(n, k, p):
    ws = make_ws_graph(n, k, p)
    mpl = characteristic_path_length(ws)
    cc = clustering_coefficient(ws)
    return mpl, cc
\end{code}

Watts and Strogatz ran their experiment with \py{n=1000} and \py{k=10}.
With these parameters, \py{run_one_graph} takes a few seconds on
my computer; most of that time is spent computing the mean path length.

\index{NumPy}
\index{logspace}

Now we need to compute these values for a range of \py{p}. I'll use
the NumPy function \py{logspace} again to compute \py{ps}:

\begin{code}
ps = np.logspace(-4, 0, 9)
\end{code}

Here's the function that runs the experiment:

\begin{code}
def run_experiment(ps, n=1000, k=10, iters=20):
    res = []
    for p in ps:
        t = [run_one_graph(n, k, p) for _ in range(iters)]
        means = np.array(t).mean(axis=0)
        res.append(means)
    return np.array(res)
\end{code}

For each value of \py{p}, we generate 20 random graphs and average the
results. Since the return value from \py{run_one_graph} is a pair, \py{t} is a list of pairs. When we convert it to an array, we get one row for each iteration and columns for \py{L} and \py{C}.
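To see what this conversion does, here is the same computation with
made-up numbers standing in for two iterations:

\begin{code}
import numpy as np

t = [(3.2, 0.5), (3.4, 0.4)]   # two (mpl, cc) pairs; the values are made up
a = np.array(t)                # shape (2, 2): one row per iteration
a.mean(axis=0)                 # array([3.3 , 0.45])
\end{code}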
Calling \py{mean} with the option \py{axis=0} computes the mean of each column; the result is an array with two elements: the mean of \py{L} and the mean of \py{C}.

\index{NumPy}
\index{mean}

When the loop exits, \py{res} is a list of pairs, which we convert to a NumPy array with one row for each value of \py{p} and columns for \py{L} and \py{C}.

We can extract the columns like this:

\begin{code}
L, C = np.transpose(res)
\end{code}

In order to plot \py{L} and \py{C} on the same axes, we standardize them
by dividing through by the first element:

\begin{code}
L /= L[0]
C /= C[0]
\end{code}

Figure~\ref{chap03-3} shows the results. As $p$ increases, the mean
path length drops quickly, because even a small number of randomly
rewired edges provide shortcuts between regions of the graph that
are far apart in the lattice. On the other hand, removing local links
decreases the clustering coefficient much more slowly.

As a result, there is a wide range of $p$ where a WS graph has the
properties of a small world graph, high clustering and low path
lengths.

And that's why Watts and Strogatz propose WS graphs as a
model for real-world networks that exhibit the small world phenomenon.


\section{What kind of explanation is {\em that}?}

If you ask me why planetary orbits are elliptical,
I might start by modeling a planet and a star as point masses; I
would look up the law of universal gravitation at
\url{https://thinkcomplex.com/grav}
and use it to write a differential equation for the motion of
the planet. Then I would either derive the orbit equation or,
more likely, look it up at \url{https://thinkcomplex.com/orbit}.
With a little algebra, I could derive the conditions that
yield an elliptical orbit.
Then I would argue that the objects
we consider planets satisfy these conditions.

\index{orbit}
\index{ellipse}
\index{planetary motion}
\index{universal gravitation}

People, or at least scientists, are generally satisfied with
this kind of explanation. One of the reasons for its appeal
is that the assumptions and approximations in the model seem
reasonable. Planets and stars are not really point masses,
but the distances between them are so big that their actual
sizes are negligible. Planets in the same solar system can
affect each other's orbits, but the effect is usually small.
And we ignore relativistic effects, again on the assumption that
they are small.

\index{explanatory model}

This explanation is also appealing because it is equation-based.
We can express the orbit equation in a closed form, which means
that we can compute orbits efficiently. It also means that
we can derive general expressions for the orbital velocity,
orbital period, and other quantities.

\index{equation-based model}

Finally, I think this kind of explanation is appealing because
it has the form of a mathematical proof.
It is important to remember that the proof pertains to the
model and not the real world.
That is, we can prove that
an idealized model yields elliptical orbits, but
we can't prove that real orbits are ellipses (in
fact, they are not).
Nevertheless, the resemblance to a proof is appealing.

\index{mathematical proof}
\index{proof}

By comparison, Watts and Strogatz's explanation of the small
world phenomenon may seem less satisfying. First, the model
is more abstract, which is to say less realistic. Second,
the results are generated by simulation, not by mathematical
analysis.
Finally, the results seem less like a proof and
more like an example.

\index{abstract model}

Many of the models in this book are like the Watts and Strogatz model:
abstract, simulation-based and (at least superficially) less formal
than conventional mathematical models. One of the goals of this book
is to consider the questions these models raise:

\index{simulation-based model}

\begin{itemize}

\item What kind of work can these models do: are they predictive, or
explanatory, or both?

\item Are the explanations these models offer less satisfying than
explanations based on more traditional models? Why?

\item How should we characterize the differences between these and
more conventional models? Are they different in kind or only in
degree?

\end{itemize}

Over the course of the book I will offer my answers
to these questions, but they are tentative and sometimes
speculative. I encourage you to consider them skeptically
and reach your own conclusions.


\section{Breadth-First Search}
\label{bfs}

When we computed shortest paths, we used a function provided by
NetworkX, but I have not explained how it works. To do that, I'll
start with breadth-first search, which is the basis of Dijkstra's
algorithm for computing shortest paths.

\index{NetworkX}

In Section~\ref{connected} I presented \py{reachable_nodes}, which finds all
the nodes that can be reached from a given starting node:

\begin{code}
def reachable_nodes(G, start):
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(G.neighbors(node))
    return seen
\end{code}

I didn't say so at the time, but \py{reachable_nodes} performs a
depth-first search (DFS).
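To make the depth-first order visible, here is a variant of
\py{reachable_nodes} that records the order in which nodes are
visited. To keep the example self-contained, it uses a plain
dictionary of adjacency lists instead of a NetworkX graph; the graph
is a 4-node cycle:

\begin{code}
# adjacency lists for the cycle 0-1-3-2-0
G = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

def visit_order(G, start):
    """Like reachable_nodes, but returns nodes in the order visited."""
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in order:
            order.append(node)
            stack.extend(G[node])
    return order

visit_order(G, 0)   # [0, 2, 3, 1]: it goes deep along 0, 2, 3
                    # before backtracking to 1
\end{code}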
Now we'll modify it to perform breadth-first
search (BFS).

\index{depth-first search}
\index{breadth-first search}
\index{DFS}
\index{BFS}

To understand the difference, imagine you are exploring a castle.
You start in a room with three doors marked A, B, and C. You open
door C and discover another room, with doors marked D, E, and F.

Which door do you open next? If you are feeling adventurous, you might
want to go deeper into the castle and choose D, E, or F. That would be
a depth-first search.

But if you wanted to be more systematic, you might go back and explore
A and B before D, E, and F. That would be a breadth-first search.

In \py{reachable_nodes}, we use the list method \py{pop} to choose the next node to ``explore''. By default, \py{pop} returns the last element of the list, which is the last one we added. In the example, that would be door F.

\index{pop}

If we want to perform a BFS instead, the simplest solution is to
pop the first element of the list:

\begin{code}
node = stack.pop(0)
\end{code}

That works, but it is slow. In Python, popping the last element
of a list takes constant time, but popping the first element is linear
in the length of the list. In the worst case, the length of the
stack is $O(\V)$, which makes this implementation of BFS $O(\V\E)$,
which is much worse than what it should be, $O(\V + \E)$.

\index{stack}
\index{queue}
\index{deque}
\index{double-ended queue}

We can solve this problem with a double-ended queue, also known
as a {\bf deque}.
The important feature of a deque is that you
can add and remove elements from the beginning or end in constant time.
To see how it is implemented, see \url{https://thinkcomplex.com/deque}.

\index{collections module}

Python provides a \py{deque} in the \py{collections} module, so we can
import it like this:

\begin{code}
from collections import deque
\end{code}

And we can use it to write an efficient BFS:

\begin{code}
def reachable_nodes_bfs(G, start):
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(G.neighbors(node))
    return seen
\end{code}

The differences are:

\begin{itemize}

\item I replaced the list called \py{stack} with a deque called \py{queue}.

\item I replaced \py{pop} with \py{popleft}, which removes and returns
the leftmost element of the queue.

\end{itemize}

This version is back to being $O(\V + \E)$. Now we're ready to
find shortest paths.


\section{Dijkstra's algorithm}
\label{dijkstra}

Edsger W. Dijkstra was a Dutch computer scientist who invented an
efficient shortest-path algorithm (see
\url{https://thinkcomplex.com/dijk}).
He also invented the semaphore, which is a data structure used to coordinate programs that
communicate with each other (see
\url{https://thinkcomplex.com/sem} and Downey, {\em The Little Book of Semaphores}).

\index{Dijkstra, Edsger}
\index{Little Book of Semaphores@{\em The Little Book of Semaphores}}

Dijkstra is famous (and notorious) as the author of a series
of essays on computer science.
Some, like ``A Case against the GO TO Statement'',
had a profound effect on programming practice.
Others, like
``On the Cruelty of Really Teaching Computing Science'', are
entertaining in their cantankerousness, but less effective.

{\bf Dijkstra's algorithm} solves the ``single source shortest path
problem'', which means that it finds the minimum distance from a given
``source'' node to every other node in the graph (or at least every
connected node).

\index{shortest path}
\index{single source shortest path}
\index{Dijkstra's algorithm}

I'll present a simplified version of the algorithm that
considers all edges the same length.
The more general version
works with any non-negative edge lengths.

The simplified version is similar to the breadth-first search
in the previous section except that we replace the set called
\py{seen} with a dictionary called \py{dist}, which maps from each
node to its distance from the source:

\begin{code}
def shortest_path_dijkstra(G, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        new_dist = dist[node] + 1

        neighbors = set(G[node]).difference(dist)
        for n in neighbors:
            dist[n] = new_dist

        queue.extend(neighbors)
    return dist
\end{code}

Here's how it works:

\begin{itemize}

\item Initially, \py{queue} contains a single element, \py{source}, and \py{dist} maps from \py{source} to distance 0 (which is the distance from \py{source} to itself).

\item Each time through the loop, we use \py{popleft} to select the next node in the queue.

\item Next we find all neighbors of \py{node} that are not already in
\py{dist}.

\item Since the distance from \py{source} to \py{node} is \py{dist[node]}, the
distance to any of the undiscovered neighbors is \py{dist[node]+1}.

\item For each neighbor, we add an entry to \py{dist}, then we add
the neighbors to the queue.

\end{itemize}

This algorithm only works if we use BFS, not DFS. To see why, consider this:

\index{BFS}
\index{DFS}

\begin{enumerate}

\item The first time through the loop \py{node} is \py{source}, and \py{new_dist}
is 1. So the neighbors of \py{source} get distance 1 and they
go in the queue.

\item When we process the neighbors of \py{source}, all of {\em their}
neighbors get distance 2.
We know that none of them can have
distance 1, because if they did, we would have discovered
them during the first iteration.

\item Similarly, when we process the nodes with distance 2, we give
their neighbors distance 3. We know that none of them can
have distance 1 or 2, because if they did, we would have
discovered them during a previous iteration.

\end{enumerate}

And so on. If you are familiar with proof by induction, you
can see where this is going.

\index{proof by induction}

But this argument only works if we process all nodes with distance
1 before we start processing nodes with distance 2, and so on.
And that's exactly what BFS does.

In the exercises at the end of this chapter, you'll write a version
of Dijkstra's algorithm using DFS, so you'll have a chance to see
what goes wrong.


\section{Exercises}

\begin{exercise}

In a ring lattice, every node has the same number of neighbors. The
number of neighbors is called the {\bf degree} of the node, and a
graph where all nodes have the same degree is called a {\bf regular
graph}.

\index{ring lattice}

All ring lattices are regular, but not all regular graphs are ring
lattices. In particular, if \py{k} is odd, we can't construct a ring
lattice, but we might be able to construct a regular graph.

Write a function called \py{make_regular_graph} that takes \py{n} and \py{k}
and returns a regular graph that contains \py{n} nodes, where every node
has \py{k} neighbors. If it's not possible to make a regular graph with
the given values of \py{n} and \py{k}, the function should raise a
\py{ValueError}.

\end{exercise}


\begin{exercise}

My implementation of \py{reachable_nodes_bfs} is efficient in the sense
that it is in $O(\V + \E)$, but it incurs a lot of overhead adding nodes
to the queue and removing them.
NetworkX provides a simple, fast
implementation of BFS, available from the NetworkX repository on
GitHub at \url{https://thinkcomplex.com/connx}.

Here is a version I modified to return a set of nodes:

\begin{code}
def plain_bfs(G, start):
    seen = set()
    nextlevel = {start}
    while nextlevel:
        thislevel = nextlevel
        nextlevel = set()
        for v in thislevel:
            if v not in seen:
                seen.add(v)
                nextlevel.update(G[v])
    return seen
\end{code}

Compare this function to \py{reachable_nodes_bfs} and see which is
faster. Then see if you can modify this function to implement a
faster version of \py{shortest_path_dijkstra}.

\end{exercise}


\begin{exercise}

The following implementation of BFS
contains two performance errors. What are
they? What is the actual order of growth for this algorithm?

\index{BFS}
\index{order of growth}
\index{performance error}

\begin{code}
def bfs(G, start):
    visited = set()
    queue = [start]
    while len(queue):
        curr_node = queue.pop(0)    # Dequeue
        visited.add(curr_node)

        # Enqueue non-visited and non-enqueued children
        queue.extend(c for c in G[curr_node]
                     if c not in visited and c not in queue)
    return visited
\end{code}

\end{exercise}


\begin{exercise}
In Section~\ref{dijkstra}, I claimed that Dijkstra's algorithm does
not work unless it uses BFS.
Write a version of
\py{shortest_path_dijkstra} that uses DFS and test it on a few examples
to see what goes wrong.
\end{exercise}


\begin{exercise}

%TODO: Do this exercise (project idea)

A natural question about the Watts and Strogatz paper is
whether the small world phenomenon is specific to their
generative model or whether other similar models yield
the same qualitative result (high clustering and low path lengths).

\index{small world phenomenon}

To answer this question, choose a variation of the
Watts and Strogatz model and repeat the experiment.
There are two kinds of variation you might consider:

\begin{itemize}

\item Instead of starting with a regular graph, start with
another graph with high clustering. For example, you could
put nodes at random locations in a 2-D space
and connect each node to its nearest $k$ neighbors.

\item Experiment with different kinds of rewiring.

\end{itemize}

If a range of similar models yield similar behavior, we
say that the results of the paper are {\bf robust}.

\index{robust}

\end{exercise}


\begin{exercise}

Dijkstra's algorithm solves the ``single source shortest path''
problem, but to compute the characteristic path length of a graph,
we actually want to solve the ``all pairs shortest path'' problem.

\index{all pairs shortest path}

Of course, one option is to run Dijkstra's algorithm $n$ times,
once for each starting node. And for some applications, that's
probably good enough. But there are more efficient alternatives.

Find an algorithm for the all-pairs shortest path problem and
implement it. See
\url{https://thinkcomplex.com/short}.

Compare the run time of your implementation with running
Dijkstra's algorithm $n$ times. Which algorithm is better in
theory? Which is better in practice?
Which one does NetworkX
use?

% https://github.com/networkx/networkx/blob/master/networkx/algorithms/shortest_paths/unweighted.py

\end{exercise}



\chapter{Scale-free networks}
\label{scale-free}

\newcommand{\Barabasi}{Barab\'{a}si}

In this chapter, we'll work with data from an online social network, and use a
Watts-Strogatz graph to model it.  The WS model has characteristics of
a small world network, like the data, but it has low
variability in the number of neighbors from node to node,
unlike the data.

This discrepancy is the motivation for a network model developed
by \Barabasi~and Albert.  The BA model captures the observed variability
in the number of neighbors, and it has one of the small world
properties, short path lengths, but it does not have the high
clustering of a small world network.

The chapter ends with a discussion of WS and BA graphs as explanatory
models for small world networks.

The code for this chapter is in {\tt chap04.ipynb} in the repository
for this book.  More information about working with the code is
in Section~\ref{code}.


\section{Social network data}

Watts-Strogatz graphs are intended to model networks in the natural
and social sciences.  In their original paper, Watts and Strogatz
looked at the network of film actors (connected if they have appeared
in a movie together); the electrical power grid in the western United
States; and the network of neurons in the brain of the roundworm
{\it C. elegans}.  They found that all of these networks had the
high connectivity and low path lengths characteristic of small world
graphs.

\index{Watts-Strogatz graph}

In this section we'll perform the same analysis with a different
dataset, a set of Facebook users and their friends.
If you are not
familiar with Facebook, users who are connected to each other are
called ``friends'', regardless of the nature of their relationship in
the real world.

\index{Facebook data}
\index{SNAP}

I'll use data from the Stanford Network Analysis Project (SNAP), which
shares large datasets from online social networks and other sources.
Specifically, I'll use their Facebook data\footnote{J. McAuley and
J. Leskovec.  Learning to Discover Social Circles in Ego
Networks.  NIPS, 2012.}, which includes 4039 users and 88,234
friend relationships among them.  This dataset is in the repository
for this book, but it is also available from the SNAP website at
\url{https://thinkcomplex.com/snap}.

The data file contains one line per edge, with users identified by
integers from 0 to 4038.  Here's the code that reads the file:

\begin{code}
def read_graph(filename):
    G = nx.Graph()
    array = np.loadtxt(filename, dtype=int)
    G.add_edges_from(array)
    return G
\end{code}

NumPy provides a function called \py{loadtxt} that reads the
given file and returns the contents as a NumPy array.  The
parameter \py{dtype} indicates that the ``data type'' of the array
is \py{int}.

\index{NumPy}
\index{loadtxt}
\index{dtype}

Then we use \py{add_edges_from} to iterate the rows of the array
and make edges.  Here are the results:

\begin{code}
>>> fb = read_graph('facebook_combined.txt.gz')
>>> n = len(fb)
>>> m = len(fb.edges())
>>> n, m
(4039, 88234)
\end{code}

The node and edge counts are consistent with the documentation
of the dataset.

Now we can check whether this dataset has the characteristics of
a small world graph: high clustering and low path lengths.

In Section~\ref{clustering} we wrote a function to compute
the network average clustering coefficient.
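That computation can be sketched in a few lines using NetworkX's per-node \py{nx.clustering} function; this is a minimal illustration, not the book's implementation, and the helper name \py{network_average_clustering} is mine:

```python
import networkx as nx

def network_average_clustering(G):
    # Average the per-node clustering coefficients.
    # nx.clustering returns a dict mapping node -> coefficient.
    coeffs = nx.clustering(G)
    return sum(coeffs.values()) / len(coeffs)
```

For example, on the complete graph with four nodes, every node's clustering coefficient is 1, so this returns 1.0.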
NetworkX provides
a function called \py{average_clustering}, which does the same
thing a little faster.

\index{NetworkX}
\index{average clustering}

But for larger graphs, they are both too slow, taking time
proportional to $n k^2$, where $n$ is the number of nodes and
$k$ is the number of neighbors each node is connected to.

Fortunately, NetworkX provides a function that estimates the
clustering coefficient by random sampling.  You can invoke it like
this:

\begin{code}
from networkx.algorithms.approximation import average_clustering
average_clustering(G, trials=1000)
\end{code}

The following function does something similar for path lengths.

\begin{code}
def sample_path_lengths(G, nodes=None, trials=1000):
    if nodes is None:
        nodes = list(G)
    else:
        nodes = list(nodes)

    pairs = np.random.choice(nodes, (trials, 2))
    lengths = [nx.shortest_path_length(G, *pair)
               for pair in pairs]
    return lengths
\end{code}

\py{G} is a graph, \py{nodes} is the list of nodes to sample
from, and \py{trials} is the number of random paths to sample.
If \py{nodes} is \py{None}, we sample from the entire graph.

\index{NumPy}
\index{random}
\index{choice}

\py{pairs} is a NumPy array of randomly chosen nodes with
one row for each trial and two columns.

\index{list comprehension}

The list comprehension enumerates the rows in the array and
computes the shortest distance between each pair of nodes.
The result is a list of path lengths.

\py{estimate_path_length} generates a list of random path lengths and
returns their mean:

\begin{code}
def estimate_path_length(G, nodes=None, trials=1000):
    return np.mean(sample_path_lengths(G, nodes, trials))
\end{code}

I'll use \py{average_clustering} to compute $C$:

\begin{code}
C = average_clustering(fb)
\end{code}

And \py{estimate_path_length} to compute
$L$:

\begin{code}
L = estimate_path_length(fb)
\end{code}

The clustering coefficient is about 0.61, which is high,
as we expect if this network has the small world property.

And the average path length is 3.7, which is
quite short in a network of more than 4000 users.  It's a small
world after all.

Now let's see if we can construct a WS graph that has the same
characteristics as this network.


\section{WS Model}

\index{Watts-Strogatz graph}
\index{Facebook data}

In the Facebook dataset, the average number of edges per node is about
22.  Since each edge is connected to two nodes, the average degree
is twice the number of edges per node:

\begin{code}
>>> k = int(round(2*m/n))
>>> k
44
\end{code}

We can make a WS graph with \py{n=4039} and \py{k=44}.  When \py{p=0}, we
get a ring lattice.

\begin{code}
lattice = nx.watts_strogatz_graph(n, k, 0)
\end{code}

In this graph, clustering is high: \py{C} is 0.73, compared to 0.61
in the dataset.
But \py{L} is 46, much higher
than in the dataset!

With \py{p=1} we get a random graph:

\begin{code}
random_graph = nx.watts_strogatz_graph(n, k, 1)
\end{code}

In the random graph, \py{L} is 2.6, even shorter than
in the dataset (3.7), but \py{C} is only 0.011, so that's no good.

By trial and error, we find that when \py{p=0.05} we get a WS graph with
high clustering and low path length:

\begin{code}
ws = nx.watts_strogatz_graph(n, k, 0.05, seed=15)
\end{code}

In this graph \py{C} is 0.63, a bit higher than
in the dataset, and \py{L} is 3.2, a bit lower than in the dataset.
So this graph models the small world characteristics of the dataset
well.

So far, so good.


\section{Degree}
\label{degree}

\begin{figure}
\centerline{\includegraphics[width=5.5in]{figs/chap04-1.pdf}}
\caption{PMF of degree in the Facebook dataset and in the WS model.}
\label{chap04-1}
\end{figure}

If the WS graph is a good model for the Facebook network,
it should have the same average degree across nodes, and ideally the
same variance in degree.

\index{degree}

This function returns a list of degrees in a graph, one for each node:

\begin{code}
def degrees(G):
    return [G.degree(u) for u in G]
\end{code}

The mean degree in the model is 44, which is close to the mean degree
in the dataset, 43.7.

However, the standard deviation of degree in the model is 1.5, which
is not close to the standard deviation in the dataset, 52.4.  Oops.

What's the problem?
To get a better view, we have to look at the
{\bf distribution} of degrees, not just the mean and standard deviation.

\index{degree distribution}
\index{PMF object}
\index{probability mass function}

I'll represent the distribution of degrees with a \py{Pmf} object,
which is defined in the \py{thinkstats2} module.
\py{Pmf} stands for ``probability mass function'';
if you are not familiar with this concept,
you might want to read Chapter 3 of {\it Think Stats, 2nd edition}
at \url{https://thinkcomplex.com/ts2}.

Briefly, a \py{Pmf} maps from values to their probabilities.
A \py{Pmf} of degrees is a mapping from each possible degree, $d$, to the
fraction of nodes with degree $d$.

As an example, I'll construct a graph with nodes 1, 2, and 3 connected
to a central node, 0:

\begin{code}
G = nx.Graph()
G.add_edge(1, 0)
G.add_edge(2, 0)
G.add_edge(3, 0)
nx.draw(G)
\end{code}

Here's the list of degrees in this graph:

\begin{code}
>>> degrees(G)
[3, 1, 1, 1]
\end{code}

Node 0 has degree 3, the others have degree 1.  Now I can make
a \py{Pmf} that represents this degree distribution:

\begin{code}
>>> from thinkstats2 import Pmf
>>> Pmf(degrees(G))
Pmf({1: 0.75, 3: 0.25})
\end{code}

The result is a \py{Pmf} object that maps from each degree to a
fraction or probability.
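If \py{thinkstats2} is not at hand, the same mapping can be sketched with a normalized \py{Counter}; the helper name \py{degree_pmf} is mine, not part of \py{thinkstats2}:

```python
from collections import Counter

def degree_pmf(degrees):
    # Map each degree to the fraction of nodes with that degree.
    counts = Counter(degrees)
    n = len(degrees)
    return {d: count / n for d, count in counts.items()}
```

With the example graph above, \py{degree_pmf([3, 1, 1, 1])} returns the same mapping, \py{{3: 0.25, 1: 0.75}}.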
In this example, 75\% of the nodes have
degree 1 and 25\% have degree 3.

Now we can make a \py{Pmf} that contains node degrees from the
dataset, and compute the mean and standard deviation:

\begin{code}
>>> from thinkstats2 import Pmf
>>> pmf_fb = Pmf(degrees(fb))
>>> pmf_fb.Mean(), pmf_fb.Std()
(43.691, 52.414)
\end{code}

And the same for the WS model:

\begin{code}
>>> pmf_ws = Pmf(degrees(ws))
>>> pmf_ws.Mean(), pmf_ws.Std()
(44.000, 1.465)
\end{code}

We can use the \py{thinkplot} module to plot the results:

\begin{code}
thinkplot.Pdf(pmf_fb, label='Facebook')
thinkplot.Pdf(pmf_ws, label='WS graph')
\end{code}

Figure~\ref{chap04-1} shows the two distributions.  They are
very different.

\index{thinkplot module}

In the WS model, most users have about 44 friends; the minimum is 38
and the maximum is 50.  That's not much variation.
In the dataset, there are many users with only 1 or 2 friends,
but one has more than 1000!

\index{heavy-tailed distribution}

Distributions like this, with many small values and a few very large
values, are called {\bf heavy-tailed}.


\section{Heavy-tailed distributions}
\label{heavytail}

\begin{figure}
\centerline{\includegraphics[width=5.5in]{figs/chap04-2.pdf}}
\caption{PMF of degree in the Facebook dataset and in the WS model,
on a log-log scale.}
\label{chap04-2}
\end{figure}

Heavy-tailed distributions are a
common feature in many areas of complexity science and they will be a
recurring theme of this book.

We can get a clearer picture of a heavy-tailed distribution by
plotting it on a log-log axis, as shown in Figure~\ref{chap04-2}.
This transformation emphasizes the tail of the distribution; that
is, the probabilities of large values.

\index{logarithm}
\index{log-log axis}
\index{power law}

\newcommand{\PMF}{\mathrm{PMF}}
\newcommand{\CDF}{\mathrm{CDF}}
\newcommand{\CCDF}{\mathrm{CCDF}}

Under this transformation, the data fall approximately on a
straight line, which suggests that there is a {\bf power law}
relationship between the largest values in the distribution and their
probabilities.  Mathematically, a distribution obeys a power law if
%
\[ \PMF(k) \sim k^{-\alpha} \]
%
where $\PMF(k)$ is the fraction of nodes with degree $k$, $\alpha$
is a parameter, and the symbol $\sim$ indicates that the PMF is
asymptotic to $k^{-\alpha}$ as $k$ increases.

If we take the log of both sides, we get
%
\[ \log \PMF(k) \sim -\alpha \log k \]
%
So if a distribution follows a power law and we plot $\PMF(k)$ versus
$k$ on a log-log scale, we expect a straight line with slope
$-\alpha$, at least for large values of $k$.

All power law distributions are heavy-tailed, but there are other
heavy-tailed distributions that don't follow a power law.  We will
see more examples soon.

But first, we have a problem: the WS model has the high clustering
and low path length we see in the data, but the degree distribution
doesn't resemble the data at all.  This discrepancy is the motivation
for our next topic, the \Barabasi-Albert model.


\section{\Barabasi-Albert model}
\label{scale.free}

In 1999 \Barabasi~and Albert published a paper,
``Emergence of Scaling in Random Networks'', that characterizes the
structure of several real-world networks,
including graphs that represent the interconnectivity of movie actors,
web pages, and elements in the electrical power grid
in the western United States.
You can download the paper from
\url{https://thinkcomplex.com/barabasi}.

\index{Barabasi-Albert graph@\Barabasi-Albert graph}
\index{movie actor data}
\index{world-wide web data}
\index{electrical power grid data}

They measure the degree of each node and compute $\PMF(k)$, the
probability that a vertex has degree $k$.  Then they plot $\PMF(k)$
versus $k$ on a log-log scale.  The plots fit a
straight line, at least for large values of $k$,
so \Barabasi~and Albert conclude that these
distributions are heavy-tailed.

They also propose a model that generates graphs with the same
property.  The essential features of the model, which distinguish it
from the WS model, are:

\index{generative model}
\index{growth}

\begin{description}

\item[Growth:] Instead of starting with a fixed number of vertices,
the BA model starts with a small graph and adds vertices one at a time.

\item[Preferential attachment:] When a new edge is created, it is
more likely to connect to a vertex that already has a large number
of edges.  This ``rich get richer'' effect is characteristic of
the growth patterns of some real-world networks.

\index{preferential attachment}
\index{rich get richer}

\end{description}

Finally, they show that graphs generated by the \Barabasi-Albert (BA)
model have a degree distribution that obeys a power law.

\index{BA graph}

Graphs with this property are sometimes called {\bf scale-free networks},
for reasons I won't explain; if you are curious, you can read more
at \url{https://thinkcomplex.com/scale}.

\index{scale-free network}
\index{NetworkX}

NetworkX provides a function that generates BA graphs.
We will use
it first; then I'll show you how it works.

\begin{code}
ba = nx.barabasi_albert_graph(n=4039, k=22)
\end{code}

The parameters are \py{n}, the number of nodes to generate, and
\py{k}, the number of edges each node starts with when it is added to
the graph.  I chose \py{k=22} because that is the average number
of edges per node in the dataset.

\begin{figure}
\centerline{\includegraphics[width=5.5in]{figs/chap04-3.pdf}}
\caption{PMF of degree in the Facebook dataset and in the BA model,
on a log-log scale.}
\label{chap04-3}
\end{figure}

The resulting graph has 4039 nodes and 21.9 edges per node.
Since every edge is connected to two nodes, the average degree
is 43.8, very close to the average degree in the dataset,
43.7.

And the standard deviation of degree is 40.9, which is a bit
less than in the dataset, 52.4, but it is much better
than what we got from the WS graph, 1.5.

Figure~\ref{chap04-3} shows the degree distributions for the
Facebook dataset and the BA model on a log-log scale.  The model
is not perfect; in particular, it deviates from the data when
\py{k} is less than 10.  But the tail looks like a straight line,
which suggests that this process generates degree distributions
that follow a power law.

\index{degree distribution}

So the BA model is better than the WS model at reproducing the degree
distribution.  But does it have the small world property?

\index{small world property}

In this example, the average path length, $L$, is $2.5$, which
is even more ``small world'' than the actual network, which has
$L=3.69$.  So that's good, although maybe too good.

\index{clustering coefficient}

On the other hand, the clustering coefficient, $C$, is $0.037$,
not even close to the value in the dataset, $0.61$.
So that's a problem.

Table~\ref{table04-1} summarizes these results.
The WS model captures
the small world characteristics, but not the degree distribution.  The
BA model captures the degree distribution, at least approximately,
and the average path length, but not the clustering coefficient.

In the exercises at the end of this chapter, you can explore other
models intended to capture all of these characteristics.

\begin{table}[]
\centering
\begin{tabular}{lrrr}
\hline
            & \textbf{Facebook} & \textbf{WS model} & \textbf{BA model} \\
\hline
C           & 0.61  & 0.63 & 0.037 \\
L           & 3.69  & 3.23 & 2.51  \\
Mean degree & 43.7  & 44   & 43.7  \\
Std degree  & 52.4  & 1.5  & 40.1  \\
Power law?  & maybe & no   & yes   \\
\hline
\end{tabular}
\caption{Characteristics of the Facebook dataset compared to two models.}
\label{table04-1}
\end{table}


\section{Generating BA graphs}

In the previous sections we used a NetworkX function to generate BA
graphs; now let's see how it works.  Here is a version
of \py{barabasi_albert_graph}, with some changes I made to
make it easier to read:

\begin{code}
def barabasi_albert_graph(n, k):

    G = nx.empty_graph(k)
    targets = list(range(k))
    repeated_nodes = []

    for source in range(k, n):
        G.add_edges_from(zip([source]*k, targets))

        repeated_nodes.extend(targets)
        repeated_nodes.extend([source] * k)

        targets = _random_subset(repeated_nodes, k)

    return G
\end{code}

The parameters are \py{n}, the number of nodes we want, and \py{k}, the
number of edges each new node gets (which will turn out to be
the average number of edges per node).

\index{NetworkX}
\index{zip}

We start with a graph that has \py{k} nodes and no edges.  Then we
initialize two variables:

\begin{description}

\item[\py{targets}:] The list of \py{k} nodes that will be connected
to the next node.
Initially \py{targets} contains the original
\py{k} nodes; later it will contain a random subset of nodes.

\item[\py{repeated_nodes}:] A list of existing nodes where each
node appears once for every edge it is connected to.  When we
select from \py{repeated_nodes}, the probability of selecting any
node is proportional to the number of edges it has.

\end{description}

Each time through the loop, we add edges from the source to
each node in \py{targets}.  Then we update \py{repeated_nodes} by
adding each target once and the new node \py{k} times.

Finally, we choose a subset of the nodes to be targets for the
next iteration.  Here's the definition of \py{_random_subset}:

\begin{code}
def _random_subset(repeated_nodes, k):
    targets = set()
    while len(targets) < k:
        x = random.choice(repeated_nodes)
        targets.add(x)
    return targets
\end{code}

Each time through the loop, \py{_random_subset} chooses from
\py{repeated_nodes} and adds the chosen node to \py{targets}.  Because
\py{targets} is a set, it automatically discards duplicates, so
the loop only exits when we have selected \py{k} different nodes.


\section{Cumulative distributions}
\label{cdf}

\begin{figure}
\centerline{\includegraphics[width=5.5in]{figs/chap04-4.pdf}}
\caption{CDF of degree in the Facebook dataset along with the WS model (left) and the BA model (right), on a log-x scale.}
\label{chap04-4}
\end{figure}

Figure~\ref{chap04-3} represents the degree distribution by plotting
the probability mass function (PMF) on a log-log scale.  That's how
\Barabasi~and Albert present their results and it is the representation
used most often in articles about power law distributions.
But it
is not the best way to look at data like this.

A better alternative is a {\bf cumulative distribution function}
(CDF), which maps from a value, $x$, to the fraction of values less
than or equal to $x$.

\index{cumulative distribution function}
\index{CDF}

Given a \py{Pmf}, the simplest way to compute a cumulative probability
is to add up the probabilities for values up to and including $x$:

\begin{code}
def cumulative_prob(pmf, x):
    ps = [pmf[value] for value in pmf if value<=x]
    return np.sum(ps)
\end{code}

For example, given the degree distribution in the dataset,
\py{pmf_fb}, we can compute the fraction of users with 25 or fewer
friends:

\begin{code}
>>> cumulative_prob(pmf_fb, 25)
0.506
\end{code}

The result is close to 0.5, which means that the median number
of friends is about 25.

CDFs are better for visualization because they are less noisy than
PMFs.  Once you get used to interpreting CDFs, they provide
a clearer picture of the shape of a
distribution than PMFs.

The \py{thinkstats2} module provides a class called \py{Cdf} that
represents a cumulative distribution function.  We can use it
to compute the CDF of degree in the dataset.

\begin{code}
from thinkstats2 import Cdf
cdf_fb = Cdf(degrees(fb), label='Facebook')
\end{code}

And \py{thinkplot} provides a function called \py{Cdf} that plots
cumulative distribution functions.

\index{thinkplot module}

\begin{code}
thinkplot.Cdf(cdf_fb)
\end{code}

Figure~\ref{chap04-4} shows the degree CDF for the Facebook dataset
along with the WS model (left) and the BA model (right).
The x-axis
is on a log scale.

\begin{figure}
\centerline{\includegraphics[width=5.5in]{figs/chap04-5.pdf}}
\caption{Complementary CDF of degree in the Facebook dataset along with the WS model (left) and the BA model (right), on a log-log scale.}
\label{chap04-5}
\end{figure}

Clearly the CDF for the WS model is very different from the CDF
from the data.  The BA model is better, but still not very good,
especially for small values.

\index{WS model}
\index{BA model}

In the tail of the distribution (values greater than 100) it looks
like the BA model matches the dataset well enough, but it is
hard to see.  We can get a clearer view with one other view of the
data: plotting the complementary CDF on a log-log scale.

The {\bf complementary CDF} (CCDF) is defined
%
\[ \CCDF(x) \equiv 1 - \CDF(x) \]
%
This definition is useful because if the PMF follows a power law, the CCDF
also follows a power law:
%
\[ \CCDF(x) \sim \left( \frac{x}{x_m} \right)^{-\alpha} \]
%
where $x_m$ is the minimum possible value and $\alpha$ is a parameter
that determines the shape of the distribution.

Taking the log of both sides yields:
%
\[ \log \CCDF(x) \sim -\alpha (\log x - \log x_m) \]
%
So if the distribution obeys a power law, we expect the CCDF on
a log-log scale to be a straight line with slope $-\alpha$.

\index{logarithm}

Figure~\ref{chap04-5} shows the CCDF of degree for the Facebook data,
along with the WS model (left) and the BA model (right), on a log-log
scale.

With this way of looking at the data, we can see that the BA model
matches the tail of the distribution (values above 20) reasonably well.
The WS model does not.


\section{Explanatory models}
\label{model1}

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/model.pdf}}
\caption{The logical structure of an explanatory model.\label{fig.model}}
\end{figure}

We started the discussion of networks with Milgram's Small World
Experiment, which shows that path lengths in social
networks are surprisingly small; hence, ``six degrees of separation''.

\index{six degrees}
\index{explanatory model}
\index{system}
\index{observable}
\index{model}
\index{behavior}

When we see something surprising, it is natural to ask ``Why?'' but
sometimes it's not clear what kind of answer we are looking for.  One
kind of answer is an {\bf explanatory model} (see
Figure~\ref{fig.model}).  The logical structure of an explanatory
model is:

\begin{enumerate}

\item In a system, S, we see something observable, O, that warrants
explanation.

\item We construct a model, M, that is analogous to the system; that
is, there is a correspondence between the elements of the model and
the elements of the system.

\item By simulation or mathematical derivation, we show that the model
exhibits a behavior, B, that is analogous to O.

\item We conclude that S exhibits O {\em because} S is similar to M, M
exhibits B, and B is similar to O.

\end{enumerate}

At its core, this is an argument by analogy, which says that if two
things are similar in some ways, they are likely to be similar in
other ways.

\index{analogy}
\index{argument by analogy}

Argument by analogy can be useful, and explanatory models can be
satisfying, but they do not constitute a proof in the mathematical
sense of the word.

\index{proof}
\index{mathematical proof}

Remember that all models leave out, or ``abstract away'',
details that we think are unimportant.
For any system there
are many possible models that include or ignore different features.
And there might be models that exhibit different behaviors that are
similar to O in different ways.
In that case, which model explains O?

\index{abstract model}

The small world phenomenon is an example: the Watts-Strogatz (WS)
model and the \Barabasi-Albert (BA) model both exhibit elements of
small world behavior, but they offer different explanations:

\begin{itemize}

\item The WS model suggests that social networks are ``small'' because
they include both strongly-connected clusters and ``weak ties'' that
connect clusters (see \url{https://thinkcomplex.com/weak}).

\index{weak tie}

\item The BA model suggests that social networks are small because
they include nodes with high degree that act as hubs, and that
hubs grow, over time, due to preferential attachment.

\index{preferential attachment}

\end{itemize}

As is often the case in young areas of science, the problem is
not that we have no explanations, but too many.


\section{Exercises}

\begin{exercise}

In Section~\ref{model1} we discussed two explanations for the
small world phenomenon, ``weak ties'' and ``hubs''.
Are these explanations compatible; that is, can they both be right?
Which do you find more satisfying as an explanation, and why?

\index{weak tie}
\index{hub}

Is there data you could collect, or experiments you could perform,
that would provide evidence in favor of one model over the other?

Choosing among competing models is the topic of Thomas Kuhn's
essay, ``Objectivity, Value Judgment, and Theory Choice'', which
you can read at \url{https://thinkcomplex.com/kuhn}.

\index{Kuhn, Thomas}
\index{theory choice}
\index{objectivity}

What criteria does Kuhn propose for choosing among competing models?
Do these criteria influence your opinion
about the WS and BA models?
Are there other criteria you think should be considered?

\end{exercise}


\begin{exercise}

NetworkX provides a function called \py{powerlaw_cluster_graph} that
implements the ``Holme and Kim algorithm for growing graphs with
powerlaw degree distribution and approximate average clustering''.
Read the documentation of this function
(\url{https://thinkcomplex.com/hk}) and see if you can use it to
generate a graph that has the same number of nodes as the Facebook
dataset, the same average degree, and the same clustering coefficient.
How does the degree distribution in the model compare to the actual
distribution?

\index{Holme-Kim graph}
\index{HK model}

\end{exercise}


\begin{exercise}

Data files from the \Barabasi~and Albert paper are available from
\url{https://thinkcomplex.com/netdata}.  Their actor collaboration
data is included in the repository for this book in a
file named \py{actor.dat.gz}.  The following function reads the file and
builds the graph.

\begin{code}
import gzip

def read_actor_network(filename, n=None):
    G = nx.Graph()
    with gzip.open(filename) as f:
        for i, line in enumerate(f):
            nodes = [int(x) for x in line.split()]
            G.add_edges_from(thinkcomplexity.all_pairs(nodes))
            if n and i >= n:
                break
    return G
\end{code}

Compute the number of actors in the graph and the average degree.
Plot the PMF of degree on a log-log scale.
Also plot the CDF of
degree on a log-x scale, to see the general shape of the distribution,
and on a log-log scale, to see whether the tail follows a power law.

Note: The actor network is not connected, so you might want to use
\py{nx.connected_component_subgraphs} to find connected subsets of the
nodes.

\end{exercise}


\chapter{Cellular Automatons}
\label{automatons}

A {\bf cellular automaton} (CA) is a model of a world with very simple
physics.  ``Cellular'' means that the world is divided into discrete
chunks, called cells.  An ``automaton'' is a machine that performs
computations --- it could be a real machine, but more often the
``machine'' is a mathematical abstraction or a computer simulation.

\index{cellular automaton}

This chapter presents experiments Stephen Wolfram performed
in the 1980s, showing that some cellular automatons display
surprisingly complicated behavior, including the ability to
perform arbitrary computations.

\index{Wolfram, Stephen}

I discuss implications of these results, and at the end of the
chapter I suggest methods for implementing CAs efficiently in Python.

The code for this chapter is in {\tt chap05.ipynb} in the repository
for this book.  More information about working with the code is
in Section~\ref{code}.


\section{A simple CA}

Cellular automatons\footnote{You might also see the plural
``automata''.} are governed by rules that determine how the state of
the cells changes over time.

\index{time step}

As a trivial example, consider a cellular automaton (CA) with
a single cell.  The state of the cell during time step $i$ is an
integer, $x_i$.  As an initial condition, suppose $x_0 = 0$.

\index{state}

Now all we need is a rule.  Arbitrarily, I'll pick $x_{i+1} = x_{i} + 1$,
which says that during each time step, the state of the CA gets
incremented by 1.
So this CA performs a simple calculation: it counts.

\index{rule}

But this CA is atypical; normally the number of possible states is
finite.  As an example, suppose a cell can only have one of two
states, 0 or 1.  For a 2-state CA, we could write a rule like
$x_{i+1} = (x_{i} + 1) \% 2$, where $\%$ is the remainder (or modulus)
operator.

The behavior of this CA is simple: it blinks.  That is, the state of
the cell switches between 0 and 1 during each time step.

Most CAs are {\bf deterministic}, which means that rules do not have
any random elements; given the same initial state, they always produce
the same result.  But some CAs are nondeterministic; we will see
examples later.

\index{deterministic}
\index{nondeterministic}

The CA in this section has only one cell, so we can think of it as
zero-dimensional.  In the rest of this chapter, we explore
one-dimensional (1-D) CAs; in the next chapter we explore
two-dimensional CAs.

\index{1-D cellular automaton}


\section{Wolfram's experiment}
\label{onedim}

In the early 1980s Stephen Wolfram published a series of papers
presenting a systematic study of 1-D CAs.  He identified four
categories of behavior, each more interesting than the last.  You can
read one of these papers, ``Statistical mechanics of cellular
automata'', at \url{https://thinkcomplex.com/ca}.

In Wolfram's experiments, the cells are arranged in a lattice (which
you might remember from Section~\ref{watts}) where each cell is
connected to two neighbors.
The lattice can be finite, infinite, or arranged in a ring.

The rules that determine how the system evolves in time are based on
the notion of a ``neighborhood'', which is the set of cells that
determines the next state of a given cell.  Wolfram's experiments use
a 3-cell neighborhood: the cell itself and its two neighbors.

\index{neighborhood}

In these experiments, the cells have two states, denoted 0 and 1 or
``off'' and ``on''.  A rule can be summarized by a table that maps
from the state of the neighborhood (a tuple of three states) to the
next state of the center cell.  The following table shows an example:

\index{rule table}
\index{state}

\centerline{
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline
prev & 111 & 110 & 101 & 100 & 011 & 010 & 001 & 000 \\
\hline
next &  0  &  0  &  1  &  1  &  0  &  0  &  1  &  0  \\
\hline
\end{tabular}}

The first row shows the eight states a neighborhood can be in.  The
second row shows the state of the center cell during the next time
step.  As a concise encoding of this table, Wolfram suggested reading
the bottom row as a binary number; because 00110010 in binary is 50 in
decimal, Wolfram calls this CA ``Rule 50''.

\index{Rule 50}

\begin{figure}
\centerline{\includegraphics[height=1.5in]{figs/chap05-1.pdf}}
\caption{Rule 50 after 10 time steps.}
\label{chap05-1}
\end{figure}

Figure~\ref{chap05-1} shows the effect of Rule 50 over 10 time steps.
The first row shows the state of the system during the first time
step; it starts with one cell ``on'' and the rest ``off''.  The second
row shows the state of the system during the next time step, and so
on.

The triangular shape in the figure is typical of these CAs; it is a
consequence of the shape of the neighborhood.  In one time step, each
cell influences the state of one neighbor in either direction.
During the next time step, that influence can propagate one more cell
in each direction.  So each cell in the past has a ``triangle of
influence'' that includes all of the cells that can be affected by it.

\index{triangle of influence}


\section{Classifying CAs}

\begin{figure}
\centerline{\includegraphics[height=1.5in]{figs/chap05-3.pdf}}
\caption{Rule 18 after 64 steps.}
\label{chap05-3}
\end{figure}

How many of these CAs are there?

Since each cell is either on or off, we can specify the state of a
cell with a single bit.  In a neighborhood with three cells, there are
8 possible configurations, so there are 8 entries in the rule tables.
And since each entry contains a single bit, we can specify a table
using 8 bits.  With 8 bits, we can specify 256 different rules.

One of Wolfram's first experiments with CAs was to test all 256
possibilities and classify them.

Examining the results visually, he proposed that the behavior of CAs
can be grouped into four classes.  Class 1 contains the simplest (and
least interesting) CAs, the ones that evolve from almost any starting
condition to the same uniform pattern.  As a trivial example, Rule 0
always generates an empty pattern after one time step.

\index{classifying cellular automatons}

Rule 50 is an example of Class 2.  It generates a simple pattern with
nested structure, that is, a pattern that contains many smaller
versions of itself.
Rule 18 makes the nested structure even clearer;
Figure~\ref{chap05-3} shows what it looks like after 64 steps.

\index{Rule 18}

\newcommand{\Sierpinski}{Sierpi\'{n}ski}

This pattern resembles the \Sierpinski~triangle, which you can read
about at \url{https://thinkcomplex.com/sier}.

\index{Sierpi\'{n}ski triangle}

Some Class 2 CAs generate patterns that are intricate and pretty, but
compared to Classes 3 and 4, they are relatively simple.


\section{Randomness}

\begin{figure}
\centerline{\includegraphics[height=2.5in]{figs/chap05-4.pdf}}
\caption{Rule 30 after 100 time steps.}
\label{chap05-4}
\end{figure}

Class 3 contains CAs that generate randomness.  Rule 30 is an example;
Figure~\ref{chap05-4} shows what it looks like after 100 time steps.

\index{randomness}
\index{Class 3 behavior}
\index{Rule 30}

Along the left side there is an apparent pattern, and on the right
side there are triangles in various sizes, but the center seems quite
random.  In fact, if you take the center column and treat it as a
sequence of bits, it is hard to distinguish from a truly random
sequence.  It passes many of the statistical tests people use to test
whether a sequence of bits is random.

Programs that produce random-seeming numbers are called
{\bf pseudo-random number generators} (PRNGs).  They are not
considered truly random because:

\index{pseudo-random number generator}
\index{PRNG}

\begin{itemize}

\item Many of them produce sequences with regularities that can be
detected statistically.  For example, the original implementation of
\py{rand} in the C library used a linear congruential generator that
yielded sequences with easily detectable serial correlations.

\index{linear congruential generator}

\item Any PRNG that uses a finite amount of state (that is, storage)
will eventually repeat itself.
One of the characteristics of a generator is the {\bf period} of this
repetition.

\index{period}

\item The underlying process is fundamentally deterministic, unlike
some physical processes, like radioactive decay and thermal noise,
that are considered to be fundamentally random.

\index{deterministic}

\end{itemize}

Modern PRNGs produce sequences that are statistically
indistinguishable from random, and they can be implemented with
periods so long that the universe will collapse before they repeat.
The existence of these generators raises the question of whether there
is any real difference between a good quality pseudo-random sequence
and a sequence generated by a ``truly'' random process.  In {\em A New
Kind of Science}, Wolfram argues that there is not (pages 315--326).

\index{New Kind of Science@{\it A New Kind of Science}}


\section{Determinism}
\label{determinism}

The existence of Class 3 CAs is surprising.  To explain how
surprising, let me start with philosophical {\bf determinism} (see
\url{https://thinkcomplex.com/deter}).  Many philosophical stances are
hard to define precisely because they come in a variety of flavors.
I often find it useful to define them with a list of statements
ordered from weak to strong:

\index{determinism}

\begin{description}

\item[D1:] Deterministic models can make accurate predictions for
some physical systems.

\item[D2:] Many physical systems can be modeled by deterministic
processes, but some are intrinsically random.

\item[D3:] All events are caused by prior events, but many physical
systems are nevertheless fundamentally unpredictable.

\item[D4:] All events are caused by prior events, and can (at least in
principle) be predicted.

\index{causation}

\end{description}

My goal in constructing this range is to make D1 so weak that
virtually everyone would accept it, D4 so strong that almost no one
would accept it, with intermediate statements that some people accept.

The center of mass of world opinion swings along this range in
response to historical developments and scientific discoveries.  Prior
to the scientific revolution, many people regarded the working of the
universe as fundamentally unpredictable or controlled by supernatural
forces.  After the triumphs of Newtonian mechanics, some optimists
came to believe something like D4; for example, in 1814 Pierre-Simon
Laplace wrote:

\index{Newtonian mechanics}
\index{Laplace, Pierre-Simon}

\begin{quote}
We may regard the present state of the universe as the effect of its
past and the cause of its future.
An intellect which at a certain moment would know all forces that set
nature in motion, and all positions of all items of which nature is
composed, if this intellect were also vast enough to submit these data
to analysis, it would embrace in a single formula the movements of the
greatest bodies of the universe and those of the tiniest atom; for
such an intellect nothing would be uncertain and the future just like
the past would be present before its eyes.
\end{quote}

This ``intellect'' is now called ``Laplace's Demon''.  See
\url{https://thinkcomplex.com/demon}.  The word ``demon'' in this
context has the sense of ``spirit'', with no implication of evil.

\index{Laplace's Demon}

Discoveries in the 19th and 20th centuries gradually dismantled
Laplace's hope.  Thermodynamics, radioactivity, and quantum mechanics
posed successive challenges to strong forms of determinism.

\index{entropy}
\index{radioactive decay}
\index{quantum mechanics}

In the 1960s chaos theory showed that in some deterministic systems
prediction is only possible over short time scales, limited by
precision in the measurement of initial conditions.

\index{chaos}

Most of these systems are continuous in space (if not time) and
nonlinear, so the complexity of their behavior is not entirely
surprising.  Wolfram's demonstration of complex behavior in simple
cellular automatons is more surprising --- and disturbing, at least to
a deterministic world view.

\index{complex behavior}
\index{simple rules}

So far I have focused on scientific challenges to determinism, but the
longest-standing objection is the apparent conflict between
determinism and human free will.
Complexity science provides a possible resolution of this conflict;
I'll come back to this topic in Section~\ref{freewill}.

\index{free will}


\section{Spaceships}
\label{spaceships}

\begin{figure}
\centerline{\includegraphics[height=2.5in]{figs/chap05-5.pdf}}
\caption{Rule 110 after 100 time steps.}
\label{chap05-5}
\end{figure}

The behavior of Class 4 CAs is even more surprising.  Several 1-D CAs,
most notably Rule 110, are {\bf Turing complete}, which means that
they can compute any computable function.  This property, also called
{\bf universality}, was proved by Matthew Cook in 1998.  See
\url{https://thinkcomplex.com/r110}.

\index{Turing complete}
\index{universality}
\index{Cook, Matthew}

Figure~\ref{chap05-5} shows what Rule 110 looks like with an initial
condition of a single cell and 100 time steps.  At this time scale it
is not apparent that anything special is going on.  There are some
regular patterns but also some features that are hard to characterize.

\index{Rule 110}

Figure~\ref{chap05-6} shows a bigger picture, starting with a random
initial condition and 600 time steps:

\begin{figure}
\centerline{\includegraphics[width=5.5in,height=5.5in]{figs/chap05-6.pdf}}
\caption{Rule 110 with random initial conditions and 600 time steps.}
\label{chap05-6}
\end{figure}

After about 100 steps the background settles into a simple repeating
pattern, but there are a number of persistent structures that appear
as disturbances in the background.  Some of these structures are
stable, so they appear as vertical lines.  Others translate in space,
appearing as diagonals with different slopes, depending on how many
time steps they take to shift by one column.
These structures are called {\bf spaceships}.

\index{spaceship}

Collisions between spaceships yield different results depending on the
types of the spaceships and the phase they are in when they collide.
Some collisions annihilate both ships; others leave one ship
unchanged; still others yield one or more ships of different types.

These collisions are the basis of computation in a Rule 110 CA.  If
you think of spaceships as signals that propagate through space, and
collisions as gates that compute logical operations like AND and OR,
you can see what it means for a CA to perform a computation.


\section{Universality}

To understand universality, we have to understand computability
theory, which is about models of computation and what they compute.

\index{universality}

One of the most general models of computation is the Turing machine,
which is an abstract computer proposed by Alan Turing in 1936.  A
Turing machine is a 1-D CA, infinite in both directions, augmented
with a read-write head.  At any time, the head is positioned over a
single cell.  It can read the state of that cell (usually there are
only two states) and it can write a new value into the cell.

\index{Turing machine}
\index{Turing, Alan}

In addition, the machine has a register, which records the state of
the machine (one of a finite number of states), and a table of rules.
For each machine state and cell state, the table specifies an action.
Actions include modifying the cell the head is over and moving one
cell to the left or right.

\index{register}
\index{tape}
\index{read-write head}
\index{cell}

A Turing machine is not a practical design for a computer, but it
models common computer architectures.
For a given program running on a real computer, it is possible (at
least in principle) to construct a Turing machine that performs an
equivalent computation.

The Turing machine is useful because it is possible to characterize
the set of functions that can be computed by a Turing machine, which
is what Turing did.  Functions in this set are called ``Turing
computable''.

\index{computable function}

To say that a Turing machine can compute any Turing-computable
function is a tautology: it is true by definition.  But
Turing-computability is more interesting than that.

\index{tautology}

It turns out that just about every reasonable model of computation
anyone has come up with is ``Turing complete''; that is, it can
compute exactly the same set of functions as the Turing machine.  Some
of these models, like lambda calculus, are very different from a
Turing machine, so their equivalence is surprising.

\index{lambda calculus}
\index{Church-Turing thesis}

This observation led to the Church-Turing Thesis, which is the claim
that these definitions of computability capture something essential
that is independent of any particular model of computation.

The Rule 110 CA is yet another model of computation, and remarkable
for its simplicity.
That it, too, turns out to be Turing complete lends support to the
Church-Turing Thesis.

In {\em A New Kind of Science}, Wolfram states a variation of this
thesis, which he calls the ``principle of computational equivalence''
(see \url{https://thinkcomplex.com/equiv}):

\index{principle of computational equivalence}
\index{New Kind of Science@{\it A New Kind of Science}}

\begin{quote}
Almost all processes that are not obviously simple can be viewed as
computations of equivalent sophistication.

More specifically, the principle of computational equivalence says
that systems found in the natural world can perform computations up to
a maximal (``universal'') level of computational power, and that most
systems do in fact attain this maximal level of computational power.
Consequently, most systems are computationally equivalent.
\end{quote}

Applying these definitions to CAs, Classes 1 and 2 are ``obviously
simple''.  It may be less obvious that Class 3 is simple, but in a way
perfect randomness is as simple as perfect order; complexity happens
in between.  So Wolfram's claim is that Class 4 behavior is common in
the natural world, and that almost all systems that manifest it are
computationally equivalent.

\index{Class 4 cellular automaton}


\section{Falsifiability}

Wolfram holds that his principle is a stronger claim than the
Church-Turing thesis because it is about the natural world rather than
abstract models of computation.  But saying that natural processes
``can be viewed as computations'' strikes me as a statement about
theory choice more than a hypothesis about the natural world.

\index{falsifiability}

Also, with qualifications like ``almost'' and undefined terms like
``obviously simple'', his hypothesis may be {\bf unfalsifiable}.
Falsifiability is an idea from the philosophy of science, proposed by
Karl Popper as a demarcation between scientific hypotheses and
pseudoscience.  A hypothesis is falsifiable if there is an experiment,
at least in the realm of practicality, that would contradict the
hypothesis if it were false.

\index{Popper, Karl}

For example, the claim that all life on earth is descended from a
common ancestor is falsifiable because it makes specific predictions
about similarities in the genetics of modern species (among other
things).  If we discovered a new species whose DNA was almost entirely
different from ours, that would contradict (or at least bring into
question) the theory of universal common descent.

\index{universal common descent}

On the other hand, ``special creation'', the claim that all species
were created in their current form by a supernatural agent, is
unfalsifiable because there is nothing that we could observe about the
natural world that would contradict it.  Any outcome of any experiment
could be attributed to the will of the creator.

\index{special creation}

Unfalsifiable hypotheses can be appealing because they are impossible
to refute.  If your goal is never to be proved wrong, you should
choose hypotheses that are as unfalsifiable as possible.

But if your goal is to make reliable predictions about the world ---
and this is at least one of the goals of science --- unfalsifiable
hypotheses are useless.  The problem is that they have no consequences
(if they had consequences, they would be falsifiable).

\index{prediction}

For example, if the theory of special creation were true, what good
would it do me to know it?  It wouldn't tell me anything about the
creator except that he has an ``inordinate fondness for beetles''
(attributed to J.~B.~S.~Haldane).
And unlike the theory of common descent, which informs many areas of
science and bioengineering, it would be of no use for understanding
the world or acting in it.

\index{Haldane, J.~B.~S.}
\index{beetles}


\section{What is this a model of?}
\label{model3}

\begin{figure}
\centerline{\includegraphics[height=2.5in]{figs/model3.pdf}}
\caption{The logical structure of a simple physical model.}
\label{fig.model3}
\end{figure}

Some cellular automatons are primarily mathematical artifacts.  They
are interesting because they are surprising, or useful, or pretty, or
because they provide tools for creating new mathematics (like the
Church-Turing thesis).

\index{mathematics}

But it is not clear that they are models of physical systems.  And if
they are, they are highly abstracted, which is to say that they are
not very detailed or realistic.

\index{physical model}

For example, some species of cone snail produce a pattern on their
shells that resembles the patterns generated by cellular automatons
(see \url{https://thinkcomplex.com/cone}).  So it is natural to
suppose that a CA is a model of the mechanism that produces patterns
on shells as they grow.  But, at least initially, it is not clear how
the elements of the model (so-called cells, communication between
neighbors, rules) correspond to the elements of a growing snail (real
cells, chemical signals, protein interaction networks).

\index{cone snail}
\index{abstract model}

For conventional physical models, being realistic is a virtue.  If the
elements of a model correspond to the elements of a physical system,
there is an obvious analogy between the model and the system.  In
general, we expect a model that is more realistic to make better
predictions and to provide more believable explanations.

\index{realistic model}

Of course, this is only true up to a point.
Models that are more detailed are harder to work with, and usually
less amenable to analysis.  At some point, a model becomes so complex
that it is easier to experiment with the system.

At the other extreme, simple models can be compelling exactly because
they are simple.

Simple models offer a different kind of explanation than detailed
models.  With a detailed model, the argument goes something like this:
``We are interested in physical system S, so we construct a detailed
model, M, and show by analysis and simulation that M exhibits a
behavior, B, that is similar (qualitatively or quantitatively) to an
observation of the real system, O.  So why does O happen?  Because S
is similar to M, and B is similar to O, and we can prove that M leads
to B.''

\index{argument by analogy}

With simple models we can't claim that S is similar to M, because it
isn't.  Instead, the argument goes like this: ``There is a set of
models that share a common set of features.  Any model that has these
features exhibits behavior B.  If we make an observation, O, that
resembles B, one way to explain it is to show that the system, S, has
the set of features sufficient to produce B.''

For this kind of argument, adding more features doesn't help.  Making
the model more realistic doesn't make the model more reliable; it only
obscures the difference between the essential features that cause B
and the incidental features that are particular to S.

Figure~\ref{fig.model3} shows the logical structure of this kind of
model.  The features $x$ and $y$ are sufficient to produce the
behavior.
Adding more detail, like features $w$ and $z$, might make the model
more realistic, but that realism adds no explanatory power.


\section{Implementing CAs}

To generate the figures in this chapter, I wrote a Python class called
\py{Cell1D} that represents a 1-D cellular automaton, and a class
called \py{Cell1DViewer} that plots the results.  Both are defined in
\py{Cell1D.py} in the repository for this book.

\index{Cell1D}
\index{Cell1DViewer}

To store the state of the CA, I use a NumPy array with one column for
each cell and one row for each time step.

To explain how my implementation works, I'll start with a CA that
computes the parity of the cells in each neighborhood.  The ``parity''
of a number is 0 if the number is even and 1 if it is odd.

\index{parity}

I use the NumPy function \py{zeros} to create an array of zeros, then
put a 1 in the middle of the first row.

\begin{code}
rows = 5
cols = 11
array = np.zeros((rows, cols), dtype=np.uint8)
array[0, 5] = 1
print(array)

[[0 0 0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0]]
\end{code}

The data type \py{uint8} indicates that the elements of \py{array} are
unsigned 8-bit integers.

\index{NumPy}
\index{zeros}
\index{dtype}
\index{uint8}

\py{plot_ca} displays the elements of \py{array} graphically:

\begin{code}
import matplotlib.pyplot as plt

def plot_ca(array, rows, cols):
    cmap = plt.get_cmap('Blues')
    plt.imshow(array, cmap=cmap, interpolation='none')
\end{code}

I import \py{pyplot} with the abbreviated name \py{plt}, which is
conventional.  The function \py{get_cmap} returns a colormap, which
maps from the values in the array to colors.
The colormap \py{'Blues'} draws the ``on'' cells in dark blue and the
``off'' cells in light blue.

\index{imshow}
\index{colormap}
\index{pyplot module}

\py{imshow} displays the array as an ``image''; that is, it draws a
colored square for each element of the array.  Setting
\py{interpolation} to \py{none} indicates that \py{imshow} should not
interpolate between on and off cells.

To compute the state of the CA during time step \py{i}, we have to add
up consecutive elements of \py{array} and compute the parity of the
sum.  We can do that using a slice operator to select the elements and
the modulus operator to compute parity:

\index{slice operator}
\index{modulus operator}

\begin{code}
def step(array, i):
    rows, cols = array.shape
    row = array[i-1]
    for j in range(1, cols-1):
        elts = row[j-1:j+2]
        array[i, j] = sum(elts) % 2
\end{code}

\py{rows} and \py{cols} are the dimensions of the array.  \py{row} is
the previous row of the array.

Each time through the loop, we select three elements from \py{row},
add them up, compute the parity, and store the result in row \py{i}.

In this example, the lattice is finite, so the first and last cells
have only one neighbor.
To handle this special case, I don't update the first and last column;
they are always 0.


\section{Cross-correlation}
\label{cross-correlation}

The operation in the previous section --- selecting elements from an
array and adding them up --- is an example of an operation that is so
useful, in so many domains, that it has a name:
{\bf cross-correlation}.  And NumPy provides a function, called
\py{correlate}, that computes it.  In this section I'll show how we
can use NumPy to write a simpler, faster version of \py{step}.

\index{NumPy}
\index{correlate}
\index{cross-correlation}
\index{window}

The NumPy \py{correlate} function takes an array, $a$, and a
``window'', $w$, with length $N$ and computes a new array, $c$, where
element \py{k} is the following summation:
%
\[ c_k = \sum_{n=0}^{N-1} a_{n+k} \cdot w_n \]
%
We can write this operation in Python like this:

\begin{code}
def c_k(a, w, k):
    N = len(w)
    return sum(a[k:k+N] * w)
\end{code}

This function computes element \py{k} of the correlation between
\py{a} and \py{w}.
To show how it works, I'll create an array of integers:

\begin{code}
N = 10
row = np.arange(N, dtype=np.uint8)
print(row)

[0 1 2 3 4 5 6 7 8 9]
\end{code}

And a window:

\begin{code}
window = [1, 1, 1]
print(window)
\end{code}

With this window, each element, \py{c_k}, is the sum of consecutive
elements from \py{a}:

\begin{code}
c_k(row, window, 0)
3

c_k(row, window, 1)
6
\end{code}

We can use \py{c_k} to write \py{correlate}, which computes the
elements of \py{c} for all values of \py{k} where the window and the
array overlap.

\begin{code}
def correlate(row, window):
    cols = len(row)
    N = len(window)
    c = [c_k(row, window, k) for k in range(cols-N+1)]
    return np.array(c)
\end{code}

Here's the result:

\begin{code}
c = correlate(row, window)
print(c)

[ 3  6  9 12 15 18 21 24]
\end{code}

The NumPy function \py{correlate} does the same thing:

\begin{code}
c = np.correlate(row, window, mode='valid')
print(c)

[ 3  6  9 12 15 18 21 24]
\end{code}

The argument \py{mode='valid'} means that the result contains only the
elements where the window and array overlap, which are considered
valid.

The drawback of this mode is that the result is not the same size as
\py{row}.  We can fix that with \py{mode='same'}, which adds zeros to
the beginning and end of \py{row}:

\begin{code}
c = np.correlate(row, window, mode='same')
print(c)

[ 1  3  6  9 12 15 18 21 24 17]
\end{code}

Now the result is the same size as \py{row}.
As an exercise at the end of this chapter, you'll have a chance to
write a version of \py{correlate} that does the same thing.

We can use NumPy's implementation of \py{correlate} to write a
simpler, faster version of \py{step}:

\begin{code}
def step2(array, i, window=[1,1,1]):
    row = array[i-1]
    c = np.correlate(row, window, mode='same')
    array[i] = c % 2
\end{code}

In the notebook for this chapter, you'll see that \py{step2} yields
the same results as \py{step}.


\section{CA tables}
\label{tables}

The function we have so far works if the CA is ``totalistic'', which
means that the rules depend only on the sum of the neighbors.  But
most rules also depend on which neighbors are on and off.  For
example, \py{100} and \py{001} have the same sum, but for many CAs,
they would yield different results.

We can make \py{step2} more general using a window with elements
\py{[4, 2, 1]}, which interprets the neighborhood as a binary number.
For example, the neighborhood \py{100} yields 4; \py{010} yields 2,
and \py{001} yields 1.  Then we can take these results and look them
up in the rule table.

Here's the more general version of \py{step2}:

\begin{code}
def step3(array, i, window=[4,2,1]):
    row = array[i-1]
    c = np.correlate(row, window, mode='same')
    array[i] = table[c]
\end{code}

The first two lines are the same.  Then the last line looks up each
element from \py{c} in \py{table} and assigns the result to
\py{array[i]}.

Here's the function that computes the table:

\begin{code}
def make_table(rule):
    rule = np.array([rule], dtype=np.uint8)
    table = np.unpackbits(rule)[::-1]
    return table
\end{code}

The parameter, \py{rule}, is an integer between 0 and 255.  The first
line puts \py{rule} into an array with a single element so we can use
\py{unpackbits}, which converts the rule number to its binary
representation.  The \py{[::-1]} reverses the result, so the element
at index \py{k} is the next state for the neighborhood whose binary
value is \py{k}.
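The reversal matters because \py{unpackbits} yields the most
significant bit first.  As a quick check (this snippet is
illustrative, not part of the \py{Cell1D} code), we can build the
table for Rule 110 and look up a neighborhood:

\begin{code}
>>> table = make_table(110)
>>> print(table)
[0 1 1 1 0 1 1 0]
>>> table[0b110]
1
\end{code}

The neighborhood \py{110} has binary value 6, and element 6 of the
table is 1, which matches the rule; without the reversal, we would
read the table backwards.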
For example, here's the table for Rule 150:

\index{NumPy}
\index{unpackbits}

\begin{code}
>>> table = make_table(150)
>>> print(table)
[0 1 1 0 1 0 0 1]
\end{code}

The code in this section is encapsulated in the \py{Cell1D} class, defined in \py{Cell1D.py} in the repository for this book.


\section{Exercises}

The code for this chapter is in the Jupyter notebook {\tt chap05.ipynb}
in the repository for this book. Open this notebook, read the code,
and run the cells. You can use this notebook to work on the
exercises in this chapter. My solutions are in {\tt chap05soln.ipynb}.


\begin{exercise}
Write a version of \py{correlate} that returns the same result as \py{np.correlate} with \py{mode='same'}. Hint: use the NumPy function \py{pad}.
\end{exercise}

\index{NumPy}
\index{pad}
\index{correlate}

\begin{exercise}

This exercise asks you to experiment with Rule 110 and
some of its spaceships.

\index{Rule 110}
\index{spaceship}
\index{background pattern}

\begin{enumerate}

\item Read the Wikipedia page about Rule 110, which describes its
background pattern and spaceships:
\url{https://thinkcomplex.com/r110}.

\item Create a Rule 110 CA with an initial condition that yields the
stable background pattern.

Note that the \py{Cell1D} class provides
\py{start_string}, which allows you to initialize the state of
the array using a string of \py{1}s and \py{0}s.

\item Modify the initial condition by adding different patterns in the
center of the row and see which ones yield spaceships. You might
want to enumerate all possible patterns of $n$ bits, for some
reasonable value of $n$. For each spaceship, can you find the
period and rate of translation?
What is the biggest spaceship you can find?

\item What happens when spaceships collide?

\end{enumerate}

\end{exercise}


\begin{exercise}

The goal of this exercise is to implement a Turing machine.

\begin{enumerate}

\item Read about Turing machines at \url{https://thinkcomplex.com/tm}.

\item Write a class called \py{Turing} that implements a Turing machine.
For the action table, use the rules for a 3-state busy beaver.

\index{Turing machine}
\index{busy beaver}

\item Write a class named \py{TuringViewer} that generates an
image that represents the state of the tape and the position and
state of the head. For one example of what that might look like,
see \url{https://thinkcomplex.com/turing}.

\end{enumerate}

\end{exercise}


\begin{exercise}

This exercise asks you to implement and test several PRNGs.
For testing, you will need to install
\py{DieHarder}, which you can download from
\url{https://thinkcomplex.com/dh}, or it
might be available as a package for your operating system.

\index{DieHarder}

\begin{enumerate}

\item Write a program that implements one of the linear congruential
generators described at
\url{https://thinkcomplex.com/lcg}.
Test it using \py{DieHarder}.

\item Read the documentation of Python's \py{random} module.
What PRNG does it use?
Test it.
\index{random module@\py{random} module}

\item Implement a Rule 30 CA with a few hundred cells,
run it for as many time steps as you can in a reasonable amount
of time, and output the center column as a sequence of bits.
Test it.
\index{Rule 30}

%TODO: Do this exercise (project idea).

\end{enumerate}

\end{exercise}


\begin{exercise}

Falsifiability is an appealing and useful idea, but among philosophers of science it is not generally accepted as a solution to the demarcation problem, as Popper claimed.

Read \url{https://thinkcomplex.com/false} and answer the
following questions.

\begin{enumerate}

\item What is the demarcation problem?

\index{demarcation problem}
\index{Popper, Karl}
\index{falsifiability}

\item How, according to Popper, does falsifiability solve the
demarcation problem?

\item Give an example of two theories, one considered scientific
and one considered unscientific, that are successfully distinguished
by the criterion of falsifiability.

\item Can you summarize one or more of the objections that
philosophers and historians of science have raised to Popper's
claim?

\item Do you get the sense that practicing philosophers think
highly of Popper's work?

\end{enumerate}

\end{exercise}


\chapter{Game of Life}
\label{lifechap}

In this chapter we consider two-dimensional cellular automatons,
especially John Conway's Game of Life (GoL). Like some of
the 1-D CAs in the previous chapter, GoL follows simple rules and
produces surprisingly complicated behavior.
And like Wolfram's
Rule 110, GoL turns out to be universal; that is, it can compute
any computable function, at least in theory.

\index{computable function}
\index{universal}
\index{Turing complete}

Complex behavior in GoL raises issues in the philosophy of
science, particularly related to scientific realism and instrumentalism.
I discuss these issues and suggest additional reading.

At the end of the chapter, I demonstrate ways to implement
GoL efficiently in Python.

The code for this chapter is in {\tt chap06.ipynb} in the repository
for this book. More information about working with the code is
in Section~\ref{code}.


\section{Conway's GoL}
\label{life}

One of the first cellular automatons to be studied, and probably the
most popular of all time, is a 2-D CA called ``The Game of Life'', or GoL
for short. It was developed by John H. Conway and popularized in 1970
in Martin Gardner's column in {\em Scientific American}.
See \url{https://thinkcomplex.com/gol}.

\index{Game of Life}
\index{Conway, John H.}
\index{Gardner, Martin}
\index{grid}

The cells in GoL are arranged in a 2-D {\bf grid}, that is, an array of rows and columns. Usually the grid is considered to be infinite, but in practice it is often ``wrapped''; that is, the right edge is connected to the left, and the top edge to the bottom.

Each cell in the grid has two states --- live and dead --- and 8 neighbors --- north, south, east, west, and the four diagonals. This set of neighbors
is sometimes called a ``Moore neighborhood''.

\index{Moore neighborhood}
\index{neighborhood}

Like the 1-D CAs in the previous chapters, GoL evolves over time according to rules, which are like simple laws of physics.

In GoL, the next state of each cell depends on its current state and
its number of live neighbors.
If a cell is alive, it stays alive if it
has 2 or 3 neighbors, and dies otherwise. If a cell is
dead, it stays dead unless it has exactly 3 neighbors.

This behavior is loosely analogous to real cell growth: cells
that are isolated or overcrowded die; at moderate densities they
flourish.

GoL is popular because:

\begin{itemize}

\item There are simple initial conditions that yield
surprisingly complex behavior.

\index{complex behavior}

\item There are many interesting stable patterns: some
oscillate (with various periods) and some move like the
spaceships in Wolfram's Rule 110 CA.

\item And like Rule 110, GoL is Turing complete.

\index{Turing complete}
\index{universal}

\item Another factor that generated interest was Conway's conjecture --- that there is no initial condition that yields unbounded growth in the number
of live cells --- and the \$50 bounty he offered to anyone who could prove
or disprove it.

\index{unbounded}

\item Finally, the increasing availability of computers made it
possible to automate the computation and display the results
graphically.

\end{itemize}


\section{Life patterns}
\label{lifepatterns}

\begin{figure}
\centerline{\includegraphics[height=1.75in]{figs/chap06-1.pdf}}
\caption{A stable pattern called a beehive.}
\label{chap06-1}
\end{figure}

\begin{figure}
\centerline{\includegraphics[height=1.75in]{figs/chap06-2.pdf}}
\caption{An oscillator called a toad.}
\label{chap06-2}
\end{figure}

\begin{figure}
\centerline{\includegraphics[height=1.75in]{figs/chap06-3.pdf}}
\caption{A spaceship called a glider.}
\label{chap06-3}
\end{figure}

If you run GoL from a random starting state, a number of stable
patterns are likely to appear.
Over time, people have identified
these patterns and given them names.

\index{Game of Life patterns}
\index{beehive}

For example, Figure~\ref{chap06-1} shows a stable pattern called a
``beehive''. Every cell in the beehive
has 2 or 3 neighbors, so they all survive, and none of the dead
cells adjacent to the beehive has 3 neighbors, so no new cells
are born.

Other patterns ``oscillate''; that is, they change over time but
eventually return to their starting configuration (provided
they don't collide with another pattern). For example,
Figure~\ref{chap06-2} shows a pattern called a ``toad'', which
is an oscillator that alternates between two states. The
``period'' of this oscillator is 2.

\index{oscillator}
\index{toad}

Finally, some patterns oscillate and return to the starting
configuration, but shifted in space. Because these patterns
seem to move, they are called ``spaceships''.

Figure~\ref{chap06-3} shows a spaceship called a
``glider''. After a period of 4 steps, the glider is back in the
starting configuration, shifted one unit down and to the right.

\index{glider}
\index{spaceship}

Depending on the starting orientation, gliders can move along any
of the four diagonals. There are other spaceships that
move horizontally and vertically.

People have spent embarrassing
amounts of time finding and naming these patterns.
If you search
the web, you will find many collections.

\section{Conway's conjecture}

From most initial conditions, GoL quickly reaches a stable
state where the number of live cells is nearly constant
(possibly with some oscillation).

\begin{figure}
\centerline{\includegraphics[height=1.75in]{figs/chap06-4.pdf}}
\caption{Starting and final configurations of the r-pentomino.}
\label{chap06-4}
\end{figure}

But there are some simple starting conditions that
yield a surprising number of live cells, and take a
long time to settle down. Because these patterns are so long-lived, they
are called ``Methuselahs''.

\index{Methuselah}
\index{Conway's conjecture}

One of the simplest Methuselahs is the
r-pentomino, which has only five cells, roughly in the shape of the
letter ``r''. Figure~\ref{chap06-4} shows the initial configuration
of the r-pentomino and the final configuration after 1103 steps.

This configuration is ``final'' in the sense that all remaining
patterns are either stable, oscillators, or gliders that will never
collide with another pattern. In total, the r-pentomino yields 6
gliders, 8 blocks, 4 blinkers, 4 beehives, 1 boat, 1 ship, and 1 loaf.

\index{r-pentomino}


\begin{figure}
\centerline{\includegraphics[height=1.75in]{figs/chap06-5.pdf}}
\caption{Gosper's glider gun, which produces a stream of gliders.}
\label{chap06-5}
\end{figure}

The existence of long-lived patterns prompted Conway to wonder if
there are initial patterns that never stabilize. He
conjectured that there were not, but he described two kinds of pattern
that would prove him wrong, a ``gun'' and a ``puffer train''. A gun
is a stable pattern that periodically produces a spaceship --- as the
stream of spaceships moves out from the source, the number of live
cells grows indefinitely.
A puffer train is a translating pattern
that leaves live cells in its wake.

\index{glider gun}
\index{puffer train}

It turns out that both of these patterns exist. A team led
by Bill Gosper discovered the first, a glider gun now called
Gosper's Gun, which is shown in Figure~\ref{chap06-5}.
Gosper also discovered the first puffer train.

\index{Gosper, Bill}

There are many patterns of both types, but they are not easy to
design or find. That is not a coincidence. Conway chose the
rules of GoL so that his conjecture would not be obviously
true or false. Of all possible rules for a 2-D CA, most
yield simple behavior: most initial conditions stabilize quickly
or grow unboundedly. By avoiding uninteresting CAs, Conway
was also avoiding Wolfram's Class 1 and Class 2 behavior, and
probably Class 3 as well.

If we believe Wolfram's Principle of Computational Equivalence, we
expect GoL to be in Class 4, and it is. The Game of Life was proved
Turing complete in 1982 (and again, independently, in 1983).
Since then, several people have constructed GoL patterns that implement
a Turing machine or another machine known to be Turing complete.

\index{Class 4 behavior}
\index{Turing complete}
\index{universality}


\section{Realism}

Stable patterns in GoL are hard not to notice, especially the ones
that move. It is natural to think of them as persistent entities, but
remember that a CA is made of cells; there is no such thing as a toad
or a loaf. Gliders and other spaceships are even less real because
they are not even made up of the same cells over time. So these
patterns are like constellations of stars. We perceive them because
we are good at seeing patterns, or because we have active
imaginations, but they are not real.

\index{realism}

Right?

Well, not so fast.
Many entities that we consider ``real'' are also
persistent patterns of entities at a smaller scale. Hurricanes are
just patterns of air flow, but we give them personal names. And
people, like gliders, are not made up of the same cells over time.

\index{hurricane}

This is not a new observation --- about 2500 years ago Heraclitus
pointed out that you can't step in the same river twice --- but the
entities that appear in the Game of Life are a useful test case for
thinking about scientific realism.

\index{Heraclitus}
\index{scientific realism}

{\bf Scientific realism} pertains to scientific theories and the
entities they postulate.
A theory postulates an entity if it is
expressed in terms of the properties and behavior of the entity.
For example, theories about electromagnetism are expressed in
terms of electrical and magnetic fields. Some theories about economics
are expressed in terms of supply, demand, and market forces.
And theories about biology are expressed in terms of genes.

But are these entities real? That is, do they exist in the world
independent of us and our theories?

\index{gene}
\index{postulated entity}
\index{theory}

Again, I find it useful to state philosophical positions in a range of
strengths; here are four statements of scientific realism with increasing
strength:

\begin{description}

\item[SR1:] Scientific theories are true or false to the degree that
they approximate reality, but no theory is exactly true. Some
postulated entities may be real, but there is no principled way to
say which ones.

\item[SR2:] As science advances, our theories become better
approximations of reality. At least some postulated entities are
known to be real.

\item[SR3:] Some theories are exactly true; others are approximately
true.
Entities postulated by true theories, and some entities
in approximate theories, are real.

\item[SR4:] A theory is true if it describes reality correctly, and
false otherwise. The entities postulated by true theories are real;
others are not.

\end{description}

SR4 is so strong that it is probably untenable; by such a strict
criterion, almost all current theories are known to be false.
Most realists would accept something in the range
between SR1 and SR3.


\section{Instrumentalism}

But SR1 is so weak that it verges on {\bf instrumentalism}, which is
the view that theories are instruments that we use for our purposes: a theory is useful, or not, to the degree that it is fit for its purpose, but we can't say whether it is true or false.

\index{instrumentalism}

To see whether you are comfortable with instrumentalism, I made up the following test. Read the following statements and give yourself a point for each one you agree with. If you score 4 or more, you might be an instrumentalist!

\begin{quote}
``Entities in the Game of Life aren't real; they are just patterns of
cells that people have given cute names.''
\end{quote}

\begin{quote}
``A hurricane is just a pattern of air flow, but it is a useful
description because it allows us to make predictions and communicate
about the weather.''
\end{quote}

\index{hurricane}

\begin{quote}
``Freudian entities like the Id and the Superego aren't real, but they
are useful tools for thinking and communicating about psychology (or
at least some people think so).''
\end{quote}

\index{Id}
\index{Freud, Sigmund}
\index{Superego}

\begin{quote}
``Electric and magnetic fields are postulated entities in our best
theory of electromagnetism, but they aren't real.
We could
construct other theories, without postulating fields, that would be
just as useful.''
\end{quote}

\index{electron}

\begin{quote}
``Many of the things in the world that we identify as objects are
arbitrary collections like constellations. For example, a mushroom
is just the fruiting body of a fungus, most of which grows
underground as a barely-contiguous network of cells. We focus
on mushrooms for practical reasons like visibility and edibility.''
\end{quote}

\index{mushroom}

\begin{quote}
``Some objects have sharp boundaries, but many are fuzzy. For
example, which molecules are part of your body: Air in your lungs?
Food in your stomach? Nutrients in your blood? Nutrients in a
cell? Water in a cell? Structural parts of a cell? Hair? Dead
skin? Dirt? Bacteria on your skin? Bacteria in your gut?
Mitochondria? How many of those molecules do you include when you
weigh yourself? Conceiving the world in terms of discrete objects
is useful, but the entities we identify are not real.''
\end{quote}

If you are more comfortable with some of these statements than
others, ask yourself why. What are the differences in these
scenarios that influence your reaction? Can you make
a principled distinction between them?

For more on instrumentalism, see
\url{https://thinkcomplex.com/instr}.


\section{Implementing Life}
\label{implife}

The exercises at the end of this chapter ask you to experiment
with and modify the Game of Life, and implement other 2-D cellular
automatons. This section explains my implementation of GoL, which
you can use as a starting place for your experiments.

\index{cellular automaton}
\index{NumPy}

To represent the state of the cells, I use a NumPy array of 8-bit unsigned integers.
As an example, the
following line creates a 10 by 10 array initialized with random
values of 0 and 1.

\index{dtype}
\index{uint8}

\begin{code}
a = np.random.randint(2, size=(10, 10), dtype=np.uint8)
\end{code}

\index{NumPy}
\index{random}
\index{randint}

There are a few ways we can compute the GoL rules. The simplest
is to use \py{for} loops to iterate through the rows and columns of
the array:

\begin{code}
b = np.zeros_like(a)
rows, cols = a.shape
for i in range(1, rows-1):
    for j in range(1, cols-1):
        state = a[i, j]
        neighbors = a[i-1:i+2, j-1:j+2]
        k = np.sum(neighbors) - state
        if state:
            if k==2 or k==3:
                b[i, j] = 1
        else:
            if k == 3:
                b[i, j] = 1
\end{code}

Initially, \py{b} is an array of zeros with the same size as \py{a}.
Each time through the loop, \py{state} is the condition of the center
cell and \py{neighbors} is the 3x3 neighborhood. \py{k} is the number
of live neighbors (not including the center cell). The nested \py{if}
statements evaluate the GoL rules and turn on cells in \py{b}
accordingly.

\index{neighborhood}
\index{cross-correlation}
\index{SciPy}
\index{correlate2d}

This implementation is a straightforward translation of the rules, but
it is verbose and slow. We can do better using cross-correlation, as
we saw in Section~\ref{cross-correlation}. There, we used
\py{np.correlate} to compute a 1-D correlation.
Now, to perform 2-D
correlation, we'll use \py{correlate2d} from \py{scipy.signal}, a
SciPy module that provides functions related to signal processing:

\begin{code}
from scipy.signal import correlate2d

kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

c = correlate2d(a, kernel, mode='same')
\end{code}

What we called a ``window'' in the context of 1-D correlation is
called a ``kernel'' in the context of 2-D correlation, but the idea
is the same: \py{correlate2d} multiplies the kernel and the array to
select a neighborhood, then adds up the result. This kernel selects
the 8 neighbors that surround the center cell.

\index{window}
\index{kernel}

\py{correlate2d} applies the kernel to each location in the array. With
\py{mode='same'}, the result has the same size as \py{a}.

Now we can use logical operators to compute the rules:

\begin{code}
b = (c==3) | (c==2) & a
b = b.astype(np.uint8)
\end{code}

The first line computes a boolean array with \py{True} where there
should be a live cell and \py{False} elsewhere. Then \py{astype}
converts the boolean array to an array of integers.

\index{boolean array}

This version is faster, and probably good enough,
but we can simplify it slightly by modifying the
kernel:

\begin{code}
kernel = np.array([[1, 1, 1],
                   [1,10, 1],
                   [1, 1, 1]])

c = correlate2d(a, kernel, mode='same')
b = (c==3) | (c==12) | (c==13)
b = b.astype(np.uint8)
\end{code}

This version of the kernel includes the center cell and gives it a
weight of 10.
If the center cell is 0, the result is between 0 and 8;
if the center cell is 1, the result is between 10 and 18.
Using this kernel, we can simplify the logical operations, selecting
only cells with the values 3, 12, and 13.

That might not seem like a big improvement, but it allows one more
simplification: with this kernel, we can use a table to look up
cell values, as we did in Section~\ref{tables}.

\begin{code}
table = np.zeros(20, dtype=np.uint8)
table[[3, 12, 13]] = 1
c = correlate2d(a, kernel, mode='same')
b = table[c]
\end{code}

\py{table} has zeros everywhere except locations 3, 12, and 13. When
we use \py{c} as an index into \py{table}, NumPy performs element-wise
lookup; that is, it takes each value from \py{c}, looks it up in
\py{table}, and puts the result into \py{b}.

This version is faster and more concise than the others; the only
drawback is that it takes more explaining.

\py{Life.py}, which is included in the repository for this book,
provides a \py{Life} class that encapsulates this implementation of
the rules. If you run \py{Life.py}, you should see an animation
of a ``puffer train'', which is a spaceship that leaves a trail of
detritus in its wake.

\index{puffer train}


\section{Exercises}

The code for this chapter is in the Jupyter notebook {\tt chap06.ipynb}
in the repository for this book. Open this notebook, read the code,
and run the cells. You can use this notebook to work on the
following exercises.
My solutions are in {\tt chap06soln.ipynb}.


\begin{exercise}

Start GoL in a random state and run it until it stabilizes.
What stable patterns can you identify?

\end{exercise}


\begin{exercise}

Many named patterns are available in portable file formats.
Modify \py{Life.py} to parse one of these formats and initialize
the grid.

\end{exercise}


\begin{exercise}

One of the longest-lived small patterns is ``rabbits'', which starts with
9 live cells and takes 17,331 steps to stabilize. You can get the
initial configuration in various formats from
\url{https://thinkcomplex.com/rabbits}. Load this configuration
and run it.

\end{exercise}


\begin{exercise}

In my implementation, the \py{Life} class is based on a parent class
called \py{Cell2D}, and the \py{LifeViewer} class is based on \py{Cell2DViewer}. You can use these base classes to implement other 2-D cellular automatons.

\index{Cell2D class}
\index{Cell2DViewer class}
\index{LifeViewer class}
\index{Life class}

For example, one variation of GoL, called ``Highlife'', has the
same rules as GoL, plus one additional rule: a dead cell with 6
neighbors comes to life.

\index{Highlife}

Write a class named \py{Highlife} that inherits from \py{Cell2D} and implements
this version of the rules. Also write a class named \py{HighlifeViewer}
that inherits from \py{Cell2DViewer} and try different ways
to visualize the results.
As a simple example, use a different
colormap.

One of the more interesting patterns in Highlife is the replicator (see \url{https://thinkcomplex.com/repl}).
Use \py{add_cells} to initialize Highlife with a replicator and see what it
does.

\end{exercise}


\begin{exercise}

If you generalize the Turing machine to two dimensions, or
add a read-write head to a 2-D CA, the result is a
cellular automaton called a Turmite. It is named after a
termite because of the way the read-write head moves, but
spelled wrong as an homage to Alan Turing.

\index{turmite}
\index{Turing, Alan}
\index{Turing machine}

The most famous Turmite is Langton's Ant, discovered by Chris Langton
in 1986. See \url{https://thinkcomplex.com/langton}.

\index{Langton's Ant}
\index{Langton, Chris}

The ant is a read-write head with
four states, which you can think of as facing north, south,
east or west. The cells have two states, black and white.

\index{read-write head}

The rules are simple. During each time step, the ant checks the color
of the cell it is on. If black, the ant turns to the right,
changes the cell to white, and moves forward one space. If the cell
is white, the ant turns left, changes the cell to black, and moves
forward.

\index{simple rules}

Given a simple world, a simple set of rules, and only one moving part,
you might expect to see simple behavior --- but you should know
better by now. Starting with all white cells, Langton's ant
moves in a seemingly random pattern for more than 10,000 steps
before it enters a cycle with a period of 104 steps.
After
each cycle, the ant is translated diagonally, so it leaves
a trail called the ``highway''.

\index{complex behavior}
\index{period}

Write an implementation of Langton's Ant.

\end{exercise}


\chapter{Physical modeling}
\label{modeling}

The cellular automatons we have seen so far are not physical models;
that is, they are not intended to describe systems in the real world.
But some CAs are intended as physical models.

In this chapter we consider a CA that models chemicals that diffuse (spread
out) and react with each other, which is a process Alan Turing proposed
to explain how some animal patterns develop.

And we'll experiment with a CA that models percolation of liquid
through porous material, like water through coffee grounds. This
model is the first of several models that exhibit {\bf phase change}
behavior and {\bf fractal geometry}, and I'll explain what both of
those mean.

\index{percolation}

The code for this chapter is in {\tt chap07.ipynb} in the repository
for this book. More information about working with the code is
in Section~\ref{code}.


\section{Diffusion}

In 1952 Alan Turing published a paper called ``The chemical basis
of morphogenesis'', which describes the behavior of systems involving
two chemicals that diffuse in space and react with each other.
He
showed that these systems produce a wide range of patterns, depending
on the diffusion and reaction rates, and conjectured that systems
like this might be an important mechanism in biological growth processes,
particularly the development of animal coloration patterns.

\index{Turing, Alan}
\index{morphogenesis}

Turing's model is based on differential equations, but it can
be implemented using a cellular automaton.

Before we get to Turing's model, we'll start with something simpler:
a diffusion system with just one chemical. We'll use a 2-D CA where the
state of each cell is a continuous quantity (usually between 0 and 1)
that represents the concentration of the chemical.

\index{diffusion}

We'll model the diffusion process by comparing each cell with the
average of its neighbors. If the concentration of the center cell
exceeds the neighborhood average, the chemical flows from the center
to the neighbors. If the concentration of the center cell is lower,
the chemical flows the other way.

\index{correlate2d}
\index{kernel}

The following kernel computes the difference between each cell
and the average of its neighbors:

\begin{code}
kernel = np.array([[0, 1, 0],
                   [1,-4, 1],
                   [0, 1, 0]])
\end{code}

Using SciPy's \py{correlate2d}, we can apply this kernel to each cell
in an array:

\begin{code}
c = correlate2d(array, kernel, mode='same')
\end{code}

We'll use a diffusion constant, \py{r}, that relates the difference
in concentration to the rate of flow:

\begin{code}
array += r * c
\end{code}

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-1.pdf}}
\caption{A simple diffusion model after 0, 5, and 10 steps.}
\label{chap07-1}
\end{figure}

Figure~\ref{chap07-1} shows results for a CA with size \py{n=9}, diffusion constant \py{r=0.1}, and initial concentration 0 everywhere except
for an ``island'' in the
middle. The figure shows the starting configuration and the state of
the CA after 5 and 10 steps. The chemical spreads from the center
outward, continuing until the concentration is the same everywhere.


\section{Reaction-diffusion}

Now let's add a second chemical. I'll define a new object,
\py{ReactionDiffusion}, that contains two arrays, one for each
chemical:

\index{reaction-diffusion}

\begin{code}
class ReactionDiffusion(Cell2D):

    def __init__(self, n, m, params, noise=0.1):
        self.params = params
        self.array = np.ones((n, m), dtype=float)
        self.array2 = noise * np.random.random((n, m))
        add_island(self.array2)
\end{code}

\py{n} and \py{m} are the number of rows and columns in the array.
\py{params} is a tuple of parameters, which I explain below.

\index{NumPy}
\index{ones}
\index{random}

\py{array} represents the concentration of the first chemical, \py{A};
the NumPy function \py{ones} initializes it to 1 everywhere. The data type \py{float} indicates that the elements of \py{A} are floating-point values.

\index{dtype}
\index{float}

\py{array2} represents the concentration of \py{B}, which is initialized with random values between 0 and \py{noise}, which is 0.1 by default. Then \py{add_island} adds an island of higher concentration in the middle:

\begin{code}
def add_island(a, height=0.1):
    n, m = a.shape
    radius = min(n, m) // 20
    i = n//2
    j = m//2
    a[i-radius:i+radius, j-radius:j+radius] += height
\end{code}

\index{NumPy}
\index{slice}

The radius of the island is one twentieth of \py{n} or \py{m},
whichever is smaller.
The height of the island is \py{height}, with the default value 0.1.

\index{island}

Here is the \py{step} function that updates the arrays:

\begin{code}
def step(self):
    A = self.array
    B = self.array2
    ra, rb, f, k = self.params

    cA = correlate2d(A, self.kernel, **self.options)
    cB = correlate2d(B, self.kernel, **self.options)

    reaction = A * B**2
    self.array += ra * cA - reaction + f * (1-A)
    self.array2 += rb * cB + reaction - (f+k) * B
\end{code}

The parameters are

\begin{description}

\item[\py{ra}:] The diffusion rate of \py{A} (analogous to \py{r} in the previous section).

\item[\py{rb}:] The diffusion rate of \py{B}. In most versions of this model, \py{rb} is about half of \py{ra}.

\item[\py{f}:] The ``feed'' rate, which controls how quickly \py{A} is added to the system.

\item[\py{k}:] The ``kill'' rate, which controls how quickly \py{B} is removed from the system.

\end{description}

Now let's look more closely at the update statements:

\begin{code}
reaction = A * B**2
self.array += ra * cA - reaction + f * (1-A)
self.array2 += rb * cB + reaction - (f+k) * B
\end{code}

The arrays \py{cA} and \py{cB} are the result of applying a diffusion kernel to \py{A} and \py{B}. Multiplying by \py{ra} and \py{rb} yields the rate of diffusion into or out of each cell.

The term \py{A * B**2} represents the rate that \py{A} and \py{B} react with each other. Assuming that the reaction consumes \py{A} and produces \py{B}, we subtract this term in the first equation and add it in the second.

The term \py{f * (1-A)} determines the rate that \py{A} is added to the system. Where \py{A} is near 0, the maximum feed rate is \py{f}. Where \py{A} approaches 1, the feed rate drops off to zero.

Finally, the term \py{(f+k) * B} determines the rate that \py{B} is removed from the system.
As \py{B} approaches 0, this rate goes to zero.

As long as the rate parameters are not too high, the values of \py{A} and \py{B} usually stay between 0 and 1.

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-2.pdf}}
\caption{Reaction-diffusion model with parameters \py{f=0.035} and \py{k=0.057} after 1000, 2000, and 4000 steps.}
\label{chap07-2}
\end{figure}

With different parameters, this model can produce patterns similar to the stripes and spots on a variety of animals. In some cases, the similarity is striking, especially when the feed and kill parameters vary in space.

\index{animal pattern}

For all simulations in this section, \py{ra=0.5} and \py{rb=0.25}.

Figure~\ref{chap07-2} shows results with \py{f=0.035} and \py{k=0.057}, with the concentration of \py{B} shown in darker colors. With these parameters, the system evolves toward a stable configuration with light spots of \py{A} on a dark background of \py{B}.

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-3.pdf}}
\caption{Reaction-diffusion model with parameters \py{f=0.055} and \py{k=0.062} after 1000, 2000, and 4000 steps.}
\label{chap07-3}
\end{figure}

Figure~\ref{chap07-3} shows results with \py{f=0.055} and \py{k=0.062}, which yields a coral-like pattern of \py{B} on a background of \py{A}.

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-4.pdf}}
\caption{A reaction-diffusion model with parameters \py{f=0.039} and \py{k=0.065} after 1000, 2000, and 4000 steps.}
\label{chap07-4}
\end{figure}

Figure~\ref{chap07-4} shows results with \py{f=0.039} and \py{k=0.065}.
These parameters produce spots of \py{B} that grow and divide in a process that resembles mitosis, ending with a stable pattern of equally-spaced spots.

Since 1952, observations and experiments have provided some support for Turing's conjecture. At this point it seems likely, but not yet proven, that many animal patterns are actually formed by reaction-diffusion processes of some kind.


\section{Percolation}

Percolation is a process in which a fluid flows through a semi-porous material. Examples include oil in rock formations, water in paper, and hydrogen gas in micropores. Percolation models are also used to study systems that are not literally percolation, including epidemics and networks of electrical resistors. See \url{https://thinkcomplex.com/perc}.

\index{percolation}

Percolation models are often represented using random graphs like the ones we saw in Chapter~\ref{graphs}, but they can also be represented using cellular automatons. In the next few sections we'll explore a 2-D CA that simulates percolation.

In this model:

\begin{itemize}

\item Initially, each cell is either ``porous'' with probability \py{q} or ``non-porous'' with probability \py{1-q}.

\item When the simulation begins, all cells are considered ``dry'' except the top row, which is ``wet''.

\item During each time step, if a porous cell has at least one wet neighbor, it becomes wet.
Non-porous cells stay dry.

\item The simulation runs until it reaches a ``fixed point'' where no more cells change state.

\end{itemize}

If there is a path of wet cells from the top to the bottom row, we say that the CA has a ``percolating cluster''.

\index{cluster}
\index{percolating cluster}

Two questions of interest regarding percolation are (1) the probability that a random array contains a percolating cluster, and (2) how that probability depends on \py{q}. These questions might remind you of Section~\ref{randomgraphs}, where we considered the probability that a random \Erdos-\Renyi~graph is connected. We will see several connections between that model and this one.

I define a new class to represent a percolation model:

\begin{code}
class Percolation(Cell2D):

    def __init__(self, n, q):
        self.q = q
        self.array = np.random.choice([1, 0], (n, n), p=[q, 1-q])
        self.array[0] = 5
\end{code}

\index{NumPy}
\index{random}
\index{choice}

\py{n} is the number of rows and columns in the CA.

The state of the CA is stored in \py{array}, which is initialized using \py{np.random.choice} to choose 1 (porous) with probability \py{q}, and 0 (non-porous) with probability \py{1-q}.

The state of the top row is set to 5, which represents a wet cell. Using 5, rather than the more obvious 2, makes it possible to use \py{correlate2d} to check whether any porous cell has a wet neighbor. Here is the kernel:

\begin{code}
kernel = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
\end{code}

This kernel defines a 4-cell ``von Neumann'' neighborhood; unlike the Moore neighborhood we saw in Section~\ref{life}, it does not include the diagonals.

\index{neighborhood}
\index{kernel}

This kernel adds up the states of the neighbors. If any of them are wet, the result will be 5 or more.
Otherwise the maximum result is 4 (if all neighbors happen to be porous).

We can use this logic to write a simple, fast \py{step} function:

\begin{code}
def step(self):
    a = self.array
    c = correlate2d(a, self.kernel, mode='same')
    self.array[(a==1) & (c>=5)] = 5
\end{code}

This function identifies porous cells, where \py{a==1}, that have at least one wet neighbor, where \py{c>=5}, and sets their state to 5, which indicates that they are wet.

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-5.pdf}}
\caption{The first three steps of a percolation model with \py{n=10} and \py{q=0.7}.}
\label{chap07-5}
\end{figure}

Figure~\ref{chap07-5} shows the first few steps of a percolation model with \py{n=10} and \py{q=0.7}. Non-porous cells are white, porous cells are lightly shaded, and wet cells are dark.


\section{Phase change}

Now let's test whether a random array contains a percolating cluster:

\begin{code}
def test_perc(perc):
    num_wet = perc.num_wet()

    while True:
        perc.step()

        if perc.bottom_row_wet():
            return True

        new_num_wet = perc.num_wet()
        if new_num_wet == num_wet:
            return False

        num_wet = new_num_wet
\end{code}

\py{test_perc} takes a \py{Percolation} object as a parameter. Each time through the loop, it advances the CA one time step. It checks the bottom row to see if any cells are wet; if so, it returns \py{True}, to indicate that there is a percolating cluster.

During each time step, it also computes the number of wet cells and checks whether the number increased since the last step.
If not, we have reached a fixed point without finding a percolating cluster, so \py{test_perc} returns \py{False}.

To estimate the probability of a percolating cluster, we generate many random arrays and test them:

\begin{code}
def estimate_prob_percolating(n=100, q=0.5, iters=100):
    t = [test_perc(Percolation(n, q)) for i in range(iters)]
    return np.mean(t)
\end{code}

\py{estimate_prob_percolating} makes 100 \py{Percolation} objects with the given values of \py{n} and \py{q} and calls \py{test_perc} to see how many of them have a percolating cluster. The return value is the fraction that do.

\index{NumPy}
\index{mean}
\index{list comprehension}

When \py{q=0.55}, the probability of a percolating cluster is near 0. At \py{q=0.60}, it is about 70\%, and at \py{q=0.65} it is near 1. This rapid transition suggests that there is a critical value of \py{q} near 0.6.

\index{random walk}
\index{critical value}

We can estimate the critical value more precisely using a {\bf random walk}. Starting from an initial value of \py{q}, we construct a \py{Percolation} object and check whether it has a percolating cluster. If so, \py{q} is probably too high, so we decrease it. If not, \py{q} is probably too low, so we increase it.

Here's the code:

\begin{code}
def find_critical(n=100, q=0.6, iters=100):
    qs = [q]
    for i in range(iters):
        perc = Percolation(n, q)
        if test_perc(perc):
            q -= 0.005
        else:
            q += 0.005
        qs.append(q)
    return qs
\end{code}

The result is a list of values for \py{q}. We can estimate the critical value, \py{q_crit}, by computing the mean of this list.
With \py{n=100} the mean of \py{qs} is about 0.59; this value does not seem to depend on \py{n}.

\index{phase change}

The rapid change in behavior near the critical value is called a {\bf phase change} by analogy with phase changes in physical systems, like the way water changes from liquid to solid at its freezing point.

\index{critical phenomena}
\index{criticality}

A wide variety of systems display a common set of behaviors and characteristics when they are at or near a critical point. These behaviors are known collectively as {\bf critical phenomena}. In the next section, we explore one of them: fractal geometry.


\section{Fractals}
\label{fractals}

To understand fractals, we have to start with dimensions.

\index{fractal}
\index{dimension}

For simple geometric objects, dimension is defined in terms of scaling behavior. For example, if the side of a square has length $l$, its area is $l^2$. The exponent, 2, indicates that a square is two-dimensional. Similarly, if the side of a cube has length $l$, its volume is $l^3$, which indicates that a cube is three-dimensional.

%TODO: These exponents don't look good in the O'Reilly edition.

More generally, we can estimate the dimension of an object by measuring some kind of size (like area or volume) as a function of some kind of linear measure (like the length of a side).

As an example, I'll estimate the dimension of a 1-D cellular automaton by measuring its area (total number of ``on'' cells) as a function of the number of rows.


\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-7.pdf}}
\caption{One-dimensional CAs with rules 20, 50, and 18, after 32 time steps.}
\label{chap07-7}
\end{figure}

Figure~\ref{chap07-7} shows three 1-D CAs like the ones we saw in Section~\ref{onedim}.
Rule 20 (left) generates a set of cells that seems like a line, so we expect it to be one-dimensional. Rule 50 (center) produces something like a triangle, so we expect it to be 2-D. Rule 18 (right) also produces something like a triangle, but the density is not uniform, so its scaling behavior is not obvious.

\index{scaling}

I'll estimate the dimension of these CAs with the following function, which counts the number of on cells after each time step. It returns a list of tuples, where each tuple contains $i$, $i^2$, and the total number of cells.

\begin{code}
def count_cells(rule, n=500):
    ca = Cell1D(rule, n)
    ca.start_single()

    res = []
    for i in range(1, n):
        cells = np.sum(ca.array)
        res.append((i, i**2, cells))
        ca.step()

    return res
\end{code}

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-8.pdf}}
\caption{Number of ``on'' cells versus number of time steps for rules 20, 50, and 18.}
\label{chap07-8}
\end{figure}

Figure~\ref{chap07-8} shows the results plotted on a log-log scale.

In each figure, the top dashed line shows $y = i^2$. Taking the log of both sides, we have $\log y = 2 \log i$. Since the figure is on a log-log scale, the slope of this line is 2.

Similarly, the bottom dashed line shows $y = i$. On a log-log scale, the slope of this line is 1.

%TODO: The math in the following paragraph looks bad.

Rule 20 (left) produces 3 cells every 2 time steps, so the total number of cells after $i$ steps is $y = 1.5 i$. Taking the log of both sides, we have $\log y = \log 1.5 + \log i$, so on a log-log scale, we expect a line with slope 1. In fact, the estimated slope of the line is 1.01.

Rule 50 (center) produces $i+1$ new cells during the $i$th time step, so the total number of cells after $i$ steps is $y = i^2 + i$.
If we ignore the second term and take the log of both sides, we have $\log y \sim 2 \log i$, so as $i$ gets large, we expect to see a line with slope 2. In fact, the estimated slope is 1.97.

\index{Rule 20}
\index{Rule 50}
\index{Rule 18}
\index{fractal dimension}

Finally, for Rule 18 (right), the estimated slope is about 1.57, which is clearly not 1, 2, or any other integer. This suggests that the pattern generated by Rule 18 has a ``fractional dimension''; that is, it is a fractal.

This way of estimating a fractal dimension is called {\bf box-counting}. For more about it, see \url{https://thinkcomplex.com/box}.


\section{Fractals and Percolation Models}
\label{fracperc}

%TODO: The typesetting in this caption is broken.

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-6.pdf}}
\caption{Percolation models with \py{q=0.6} and \py{n=100}, \py{200}, and \py{300}.}
\label{chap07-6}
\end{figure}

Now let's get back to percolation models. Figure~\ref{chap07-6} shows clusters of wet cells in percolation simulations with \py{q=0.6} and \py{n=100}, \py{200}, and \py{300}.
Informally, they resemble fractal patterns seen in nature and in mathematical models.

\index{percolation}

To estimate their fractal dimension, we can run CAs with a range of sizes, count the number of wet cells in each percolating cluster, and then see how the cell counts scale as we increase the size of the array.

The following loop runs the simulations:

\begin{code}
res = []
for size in sizes:
    perc = Percolation(size, q)
    if test_perc(perc):
        num_filled = perc.num_wet() - size
        res.append((size, size**2, num_filled))
\end{code}

The result is a list of tuples where each tuple contains \py{size}, \py{size**2}, and the number of cells in the percolating cluster (not including the initial wet cells in the top row).

\begin{figure}
\centerline{\includegraphics[height=2in]{figs/chap07-9.pdf}}
\caption{Number of cells in the percolating cluster versus CA size.}
\label{chap07-9}
\end{figure}

Figure~\ref{chap07-9} shows the results for a range of sizes from 10 to 100. The dots show the number of cells in each percolating cluster.
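To fit a line to these counts, one option is to regress the log of the cluster size on the log of the array size. Here is a sketch using \py{scipy.stats.linregress}, with synthetic counts standing in for \py{res} so the snippet runs without the simulations above (the exponent 1.85 comes from the text, not from a new computation):

```python
import numpy as np
from scipy.stats import linregress

# synthetic cluster sizes that scale like size**1.85, standing in
# for the (size, size**2, num_filled) tuples collected above
sizes = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
num_filled = 0.7 * sizes**1.85

# fit a line in log-log space; its slope estimates the dimension
slope, intercept, *_ = linregress(np.log(sizes), np.log(num_filled))
```

On real simulation output the points are noisier, but the same fit applies; the slope is the estimated fractal dimension.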
The slope of a line fitted to these dots is often near 1.85, which suggests that the percolating cluster is, in fact, fractal when \py{q} is near the critical value.

\index{critical value}

When \py{q} is larger than the critical value, nearly every porous cell gets filled, so the number of wet cells is close to \py{q * size^2}, which has dimension 2.

When \py{q} is substantially smaller than the critical value, the number of wet cells is proportional to the linear size of the array, so it has dimension 1.

% TODO: Check this chapter for leftover p's that should be q's

\section{Exercises}


\begin{exercise}

In Section~\ref{fracperc} we showed that the Rule 18 CA produces a fractal. Can you find other 1-D CAs that produce fractals?

\index{Cell1D}

Note: the \py{Cell1D} object does not wrap around from the left edge to the right, which creates artifacts at the boundaries for some rules. You might want to use \py{Wrap1D}, which is a child class of \py{Cell1D} that wraps around. It is defined in \py{Cell1D.py} in the repository for this book.

\end{exercise}


\begin{exercise}

In 1990 Bak, Chen and Tang proposed a cellular automaton that is an abstract model of a forest fire. Each cell is in one of three states: empty, occupied by a tree, or on fire.

\index{Bak, Per}
\index{forest fire model}

The rules of the CA are:

\begin{enumerate}

\item An empty cell becomes occupied with probability $p$.

\item A cell with a tree burns if any of its neighbors is on fire.

\item A cell with a tree spontaneously burns, with probability $f$, even if none of its neighbors is on fire.

\item A cell with a burning tree becomes an empty cell in the next time step.

\end{enumerate}

Write a program that implements this model. You might want to inherit from \py{Cell2D}.
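If you want a starting point, here is one possible sketch; it does not use \py{Cell2D}, and the state encoding (0 for empty, 1 for tree, 2 for fire) is my choice, not the book's:

```python
import numpy as np
from scipy.signal import correlate2d

EMPTY, TREE, FIRE = 0, 1, 2

class ForestFire:
    """Stand-alone sketch of the forest fire CA."""
    kernel = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]])

    def __init__(self, n, p=0.01, f=0.001):
        self.p = p
        self.f = f
        self.array = np.random.choice([EMPTY, TREE], (n, n))

    def step(self):
        a = self.array
        # count burning neighbors (cast to int for correlate2d)
        c = correlate2d((a == FIRE).astype(int), self.kernel, mode='same')
        grow = (a == EMPTY) & (np.random.random(a.shape) < self.p)
        spark = (a == TREE) & (np.random.random(a.shape) < self.f)
        spread = (a == TREE) & (c >= 1)
        a[a == FIRE] = EMPTY         # rule 4: burning trees burn down
        a[grow] = TREE               # rule 1: empty cells sprout trees
        a[spread | spark] = FIRE     # rules 2 and 3: trees catch fire

fire = ForestFire(n=50)
for _ in range(100):
    fire.step()
```

Note that all four masks are computed from the state at the beginning of the step, so the rules are applied simultaneously rather than sequentially.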
Typical values for the parameters are $p=0.01$ and $f=0.001$, but you might want to experiment with other values.

Starting from a random initial condition, run the model until it reaches a steady state where the number of trees no longer increases or decreases consistently.

In steady state, is the geometry of the forest fractal? What is its fractal dimension?

\end{exercise}



\chapter{Self-organized criticality}
\label{soc}

In the previous chapter we saw an example of a system with a critical point, and we explored one of the common properties of critical systems, fractal geometry.

\index{critical system}
\index{heavy-tailed distribution}
\index{pink noise}

%TODO: Check whether we really want to use Section titles, rather than
% numbers, in cross references.

In this chapter, we explore two other properties of critical systems: heavy-tailed distributions, which we saw in Chapter~\ref{heavytail}, and pink noise, which I'll explain in this chapter.

These properties are interesting in part because they appear frequently in nature; that is, many natural systems produce fractal-like geometry, heavy-tailed distributions, and pink noise.

This observation raises a natural question: why do so many natural systems have properties of critical systems? A possible answer is {\bf self-organized criticality} (SOC), which is the tendency of some systems to evolve toward, and stay in, a critical state.

\index{self-organized criticality}
\index{SOC}

In this chapter I'll present a {\bf sand pile model} that was the first system shown to exhibit SOC.

The code for this chapter is in {\tt chap08.ipynb} in the repository for this book.
More information about working with the code is in Section~\ref{code}.


\section{Critical Systems}

Many critical systems demonstrate common behaviors:

\begin{itemize}

\item Fractal geometry: For example, freezing water tends to form fractal patterns, including snowflakes and other crystal structures. Fractals are characterized by self-similarity; that is, parts of the pattern are similar to scaled copies of the whole.

\index{fractal geometry}
\index{self-similarity}

\item Heavy-tailed distributions of some physical quantities: For example, in freezing water the distribution of crystal sizes is characterized by a power law.

\index{long tail}
\index{power law}

\item Variations in time that exhibit {\bf pink noise}: Complex signals can be decomposed into their frequency components. In pink noise, low-frequency components have more power than high-frequency components. Specifically, the power at frequency $f$ is proportional to $1/f$.

\index{pink noise}
\index{1/f noise@$1/f$ noise}

\end{itemize}

Critical systems are usually unstable. For example, to keep water in a partially frozen state requires active control of the temperature. If the system is near the critical temperature, a small deviation tends to move the system into one phase or the other.

\index{unstable}

Many natural systems exhibit characteristic behaviors of criticality, but if critical points are unstable, they should not be common in nature. This is the puzzle Bak, Tang and Wiesenfeld address.
Their solution is called self-organized criticality (SOC), where ``self-organized'' means that from any initial condition, the system moves toward a critical state, and stays there, without external control.

\index{SOC}

\section{Sand Piles}

The sand pile model was proposed by Bak, Tang and Wiesenfeld in 1987. It is not meant to be a realistic model of a sand pile, but rather an abstraction that models physical systems with a large number of elements that interact with their neighbors.

\index{sand pile model}
\index{abstract model}
\index{grid}

The sand pile model is a 2-D cellular automaton where the state of each cell represents the slope of a part of a sand pile. During each time step, each cell is checked to see whether it exceeds a critical value, $K$, which is usually 3. If so, it ``topples'' and transfers sand to four neighboring cells; that is, the slope of the cell is decreased by 4, and each of the neighbors is increased by 1. At the perimeter of the grid, all cells are kept at slope 0, so the excess spills over the edge.

\index{cellular automaton}
\index{state}

Bak, Tang and Wiesenfeld initialize all cells at a level greater than $K$ and run the model until it stabilizes. Then they observe the effect of small perturbations: they choose a cell at random, increment its value by 1, and run the model again until it stabilizes.

For each perturbation, they measure \py{T}, the number of time steps the pile takes to stabilize, and \py{S}, the total number of cells that topple\footnote{The original paper uses a different definition of \py{S}, but most later work uses this definition.}.

%TODO: The typesetting of this footnote is broken.

\index{perturbation}

Most of the time, dropping a single grain causes no cells to topple, so \py{T=1} and \py{S=0}.
But occasionally a single grain can cause an {\bf avalanche} that affects a substantial fraction of the grid. The distributions of \py{T} and \py{S} turn out to be heavy-tailed, which supports the claim that the system is in a critical state.

\index{avalanche}

They conclude that the sand pile model exhibits ``self-organized criticality'', which means that it evolves toward a critical state without the need for external control or what they call ``fine tuning'' of any parameters. And the model stays in a critical state as more grains are added.

In the next few sections I replicate their experiments and interpret the results.


\section{Implementing the Sand Pile}

To implement the sand pile model, I define a class called \py{SandPile} that inherits from \py{Cell2D}, which is defined in \py{Cell2D.py} in the repository for this book.

\begin{code}
class SandPile(Cell2D):

    def __init__(self, n, m=None, level=9):
        m = n if m is None else m
        self.array = np.ones((n, m)) * level
\end{code}

All values in the array are initialized to \py{level}, which is generally greater than the toppling threshold, \py{K}. If \py{m} is not provided, the array is square.

\index{Cell2D}
\index{SandPile}

Here's the \py{step} method that finds all cells above \py{K} and topples them:

\begin{code}
kernel = np.array([[0, 1, 0],
                   [1,-4, 1],
                   [0, 1, 0]])

def step(self, K=3):
    toppling = self.array > K
    num_toppled = np.sum(toppling)
    c = correlate2d(toppling, self.kernel, mode='same')
    self.array += c
    return num_toppled
\end{code}

To show how \py{step} works, I'll start with a small pile that has two cells ready to topple:

\begin{code}
pile = SandPile(n=3, m=5, level=0)
pile.array[1, 1] = 4
pile.array[1, 3] = 4
\end{code}

Initially, \py{pile.array} looks like this:

\begin{code}
[[0 0 0 0 0]
 [0 4 0 4 0]
 [0 0 0 0 0]]
\end{code}

Now we can select the cells that are above
the toppling threshold:

\index{toppling threshold}

\begin{code}
toppling = pile.array > K
\end{code}

The result is a boolean array, but we can use it as if it were an array of integers like this:

\begin{code}
[[0 0 0 0 0]
 [0 1 0 1 0]
 [0 0 0 0 0]]
\end{code}

If we correlate this array with the kernel, it makes copies of the kernel at each location where \py{toppling} is 1.

\index{boolean array}

\begin{code}
c = correlate2d(toppling, kernel, mode='same')
\end{code}

And here's the result:

\begin{code}
[[ 0  1  0  1  0]
 [ 1 -4  2 -4  1]
 [ 0  1  0  1  0]]
\end{code}

Notice that where the copies of the kernel overlap, they add up.

This array contains the change for each cell, which we use to update the original array:

\begin{code}
pile.array += c
\end{code}

And here's the result.

\begin{code}
[[0 1 0 1 0]
 [1 0 2 0 1]
 [0 1 0 1 0]]
\end{code}

So that's how \py{step} works.

With \py{mode='same'}, \py{correlate2d} considers the boundary of the array to be fixed at zero, so any grains of sand that go over the edge disappear.

\py{SandPile} also provides \py{run}, which calls \py{step} until no more cells topple:

\begin{code}
def run(self):
    total = 0
    for i in itertools.count(1):
        num_toppled = self.step()
        total += num_toppled
        if num_toppled == 0:
            return i, total
\end{code}

The return value is a tuple that contains the number of time steps and the total number of cells that toppled.

\index{itertools module}

If you are not familiar with \py{itertools.count}, it is an infinite generator that counts up from the given initial value, so the \py{for} loop runs until \py{step} returns 0. You can read about the \py{itertools} module at \url{https://thinkcomplex.com/iter}.

Finally, the \py{drop} method chooses a random cell and adds
a grain of sand:

\begin{code}
def drop(self):
    a = self.array
    n, m = a.shape
    index = np.random.randint(n), np.random.randint(m)
    a[index] += 1
\end{code}

\index{NumPy}
\index{random}
\index{randint}

Let's look at a bigger example, with \py{n=20}:

\begin{code}
pile = SandPile(n=20, level=10)
pile.run()
\end{code}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap08-1.pdf}}
\caption{Sand pile model initial state (left), after 200 steps (middle), and 400 steps (right).}
\label{chap08-1}
\end{figure}

With an initial level of \py{10}, this sand pile takes 332 time steps to reach equilibrium, with a total of 53,336 topplings. Figure~\ref{chap08-1} (left) shows the configuration after this initial run. Notice that it has the repeating elements that are characteristic of fractals. We'll come back to that soon.

Figure~\ref{chap08-1} (middle) shows the configuration of the sand pile after dropping 200 grains onto random cells, each time running until the pile reaches equilibrium. The symmetry of the initial configuration has been broken; the configuration looks random.

Finally, Figure~\ref{chap08-1} (right) shows the configuration after 400 drops. It looks similar to the configuration after 200 drops. In fact, the pile is now in a steady state where its statistical properties don't change over time. I'll explain some of those statistical properties in the next section.


\section{Heavy-tailed distributions}
\label{heavysand}

If the sand pile model is in a critical state, we expect to find heavy-tailed distributions for quantities like the duration and size of avalanches.
So let's take a look.

\index{heavy-tailed distribution}

I'll make a larger sand pile, with \py{n=50} and an initial level of \py{30}, and run until equilibrium:

\begin{code}
pile2 = SandPile(n=50, level=30)
pile2.run()
\end{code}

Next, I'll run 100,000 random drops:

\begin{code}
iters = 100000
res = [pile2.drop_and_run() for _ in range(iters)]
\end{code}

As the name suggests, \py{drop_and_run} calls \py{drop} and \py{run} and returns the duration of the avalanche and the total number of cells that toppled.

\index{NumPy}
\index{transpose}

So \py{res} is a list of \py{(T, S)} tuples, where \py{T} is duration, in time steps, and \py{S} is cells toppled. We can use \py{np.transpose} to unpack \py{res} into two NumPy arrays:

\begin{code}
T, S = np.transpose(res)
\end{code}

A large majority of drops have duration 1 and no toppled cells; if we filter them out before plotting, we get a clearer view of the rest of the distribution.

\begin{code}
T = T[T>1]
S = S[S>0]
\end{code}

The distributions of \py{T} and \py{S} have many small values and a few very large ones.
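For reference, the pieces of \py{SandPile} shown in the previous sections can be assembled into a stand-alone sketch like the one below. Here \py{drop_and_run} is written from its description above, the boolean toppling array is cast to int before calling \py{correlate2d} as a precaution, and the drawing methods the book's class inherits from \py{Cell2D} are omitted:

```python
import itertools
import numpy as np
from scipy.signal import correlate2d

class SandPile:
    """Condensed, stand-alone version of the sand pile model."""

    kernel = np.array([[0, 1, 0],
                       [1, -4, 1],
                       [0, 1, 0]])

    def __init__(self, n, m=None, level=9):
        m = n if m is None else m
        self.array = np.ones((n, m)) * level

    def step(self, K=3):
        # find cells above the threshold and topple them all at once
        toppling = self.array > K
        num_toppled = np.sum(toppling)
        c = correlate2d(toppling.astype(int), self.kernel, mode='same')
        self.array += c
        return num_toppled

    def run(self):
        # step until no cells topple; return (time steps, total topplings)
        total = 0
        for i in itertools.count(1):
            num_toppled = self.step()
            total += num_toppled
            if num_toppled == 0:
                return i, total

    def drop(self):
        # add a grain of sand to a random cell
        a = self.array
        n, m = a.shape
        index = np.random.randint(n), np.random.randint(m)
        a[index] += 1

    def drop_and_run(self):
        self.drop()
        return self.run()

pile = SandPile(n=20, level=10)
steps, total = pile.run()     # run to equilibrium
T, S = pile.drop_and_run()    # one perturbation
```

After \py{run} returns, every cell should be at or below the threshold, which is an easy property to check when experimenting with variations.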
I'll use the \py{Pmf} class from \py{thinkstats2} to make a PMF of the values, that is, a map from each value to its probability of occurring (see Section~\ref{degree}).

\index{thinkstats2 module}
\index{Pmf}

\begin{code}
pmfT = Pmf(T)
pmfS = Pmf(S)
\end{code}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap08-2.pdf}}
\caption{Distribution of avalanche duration (left) and size (right), linear scale.}
\label{chap08-2}
\end{figure}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap08-3.pdf}}
\caption{Distribution of avalanche duration (left) and size (right), log-log scale.}
\label{chap08-3}
\end{figure}

Figure~\ref{chap08-2} shows the results for values less than 50.

As we saw in Section~\ref{heavytail}, we can get a clearer picture of these distributions by plotting them on a log-log scale, as shown in Figure~\ref{chap08-3}.

For values between 1 and 100, the distributions are nearly straight on a log-log scale, which is characteristic of a heavy tail. The gray lines in the figure have slopes near -1, which suggests that these distributions follow a power law with parameters near $\alpha=1$.

\index{power law}

For values greater than 100, the distributions fall away more quickly than the power law model, which means there are fewer very large values than the model predicts. One possibility is that this effect is due to the finite size of the sand pile; if so, we might expect larger piles to fit the power law better.

Another possibility, which you can explore in one of the exercises at the end of this chapter, is that these distributions do not strictly obey a power law. But even if they are not power-law distributions, they might still be heavy-tailed.

%TODO: Explain more?


\section{Fractals}
\label{sandfrac}

Another property of critical systems is fractal geometry.
The
initial configuration in Figure~\ref{chap08-1} (left) resembles a fractal,
but you can't always tell by looking.
A more reliable way to identify a fractal is to estimate its
fractal dimension, as we saw in Section~\ref{fractals}
and Section~\ref{fracperc}.

\index{fractal}

I'll start by making a bigger sand pile, with \py{n=131} and initial
level \py{22}.

\begin{code}
pile3 = SandPile(n=131, level=22)
pile3.run()
\end{code}

It takes 28,379 steps for this pile to reach equilibrium,
with more than 200 million cells toppled.

To see the resulting pattern more clearly, I select cells with
levels 0, 1, 2, and 3, and plot them separately:

\begin{code}
def draw_four(viewer, levels=range(4)):
    thinkplot.preplot(rows=2, cols=2)
    a = viewer.viewee.array

    for i, level in enumerate(levels):
        thinkplot.subplot(i+1)
        viewer.draw_array(a==level, vmax=1)
\end{code}

\py{draw_four} takes a \py{SandPileViewer} object, which is defined
in \py{Sand.py} in the repository for this book. The parameter
\py{levels} is the list of levels we want to plot; the default is
the range 0 through 3. If the sand pile has run until equilibrium,
these are the only levels that should exist.

Inside the loop, it uses \py{a==level} to make a boolean array that's
\py{True} where the array is \py{level} and \py{False} otherwise.
\py{draw_array} treats these booleans as 1s and 0s.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap08-4.pdf}}
\caption{Sand pile model in equilibrium, selecting cells with levels 0, 1, 2, and 3, left to right, top to bottom.}
\label{chap08-4}
\end{figure}

\index{box-counting algorithm}

Figure~\ref{chap08-4} shows the results for \py{pile3}. Visually,
these patterns resemble fractals, but looks can be deceiving.
To be more confident, we can estimate the fractal dimension for each
pattern using {\bf box-counting}, as we saw in Section~\ref{fractals}.

We'll count the number of
cells in a small box at the center of the pile, then see
how the number of cells increases as the box gets bigger.
Here's my implementation:

\begin{code}
def count_cells(a):
    n, m = a.shape
    end = min(n, m)

    res = []
    for i in range(1, end, 2):
        top = (n-i) // 2
        left = (m-i) // 2
        box = a[top:top+i, left:left+i]
        total = np.sum(box)
        res.append((i, i**2, total))

    return np.transpose(res)
\end{code}

The parameter, \py{a}, is a boolean array.
The size of the box is initially 1. Each time through the
loop, it increases by 2 until it reaches \py{end}, which is the
smaller of \py{n} and \py{m}.

\index{boolean array}

Each time through the loop, \py{box} is a set of cells with width
and height \py{i}, centered in the array. \py{total} is the number
of ``on'' cells in the box.

\index{NumPy}
\index{transpose}

The result is a list of tuples, where each tuple contains
\py{i}, \py{i**2}, and the number of cells in the box.
When we pass this result to \py{transpose}, NumPy converts it to
an array with three columns, and then {\bf transposes} it; that is,
it makes the columns into rows and the rows into columns.
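As a sanity check on \py{count_cells} (a sketch assuming only NumPy), we can apply it to an array that is entirely ``on'': every centered box is full, so the count in each box should equal $i^2$ exactly, the signature of dimension 2.

```python
import numpy as np

def count_cells(a):
    """Count 'on' cells in centered boxes of increasing size."""
    n, m = a.shape
    end = min(n, m)

    res = []
    for i in range(1, end, 2):
        top = (n-i) // 2
        left = (m-i) // 2
        box = a[top:top+i, left:left+i]
        total = np.sum(box)
        res.append((i, i**2, total))

    return np.transpose(res)

# For a fully "on" array, every box is full, so total == i**2.
a = np.ones((11, 11), dtype=bool)
steps, steps2, cells = count_cells(a)
print(np.array_equal(cells, steps2))  # True
```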
The result6600is an array with 3 rows: \py{i}, \py{i**2}, and \py{total}.66016602\index{fractal dimension}66036604Here's how we use \py{count_cells}:66056606\begin{code}6607res = count_cells(pile.array==level)6608steps, steps2, cells = res6609\end{code}66106611The first line creates a boolean array that contains \py{True}6612where the array equals \py{level}, calls6613\py{count_cells}, and gets an array with three rows.66146615The second line unpacks the rows and assigns them to \py{steps},6616\py{steps2}, and \py{cells}, which we can plot like this:66176618\begin{code}6619thinkplot.plot(steps, steps2, linestyle='dashed')6620thinkplot.plot(steps, cells)6621thinkplot.plot(steps, steps, linestyle='dashed')6622\end{code}66236624\begin{figure}6625\centerline{\includegraphics[height=4in]{figs/chap08-5.pdf}}6626\caption{Box counts for cells with levels 0, 1, 2, and 3, compared to dashed lines with slopes 1 and 2.}6627\label{chap08-5}6628\end{figure}66296630Figure~\ref{chap08-5} shows the results.6631On a log-log scale, the cell counts form nearly straight lines,6632which indicates that we are measuring fractal dimension over6633a valid range of box sizes.66346635\index{log-log scale}66366637To estimate the slopes of these lines, we can use the SciPy function6638\py{linregress}, which fits a line to the data by linear regression6639(see \url{https://thinkcomplex.com/regress}).66406641\index{SciPy}6642\index{linregress}66436644\begin{code}6645from scipy.stats import linregress66466647params = linregress(np.log(steps), np.log(cells))6648slope = params[0]6649\end{code}66506651The estimated fractal dimensions are:66526653\begin{code}66540 1.87166551 3.50266562 1.78166573 2.0846658\end{code}66596660The fractal dimension for levels 0, 1, and 2 seems to be clearly6661non-integer, which indicates that the image is fractal.66626663The estimate for level 3 is indistinguishable from66642, but given the results for the other values, the apparent curvature6665of the line, and the 
appearance of the pattern, it seems likely that
it is also fractal.

\index{fractal dimension}

One of the exercises in the notebook for this chapter asks you to run
this analysis again with different values of \py{n} and the initial
\py{level} to see if the estimated dimensions are consistent.


\section{Pink noise}

The title of the original paper that presented the sand pile model
is ``Self-Organized Criticality: An Explanation of $1/f$ Noise''.
You can read it at \url{https://thinkcomplex.com/bak}.

As the subtitle suggests, Bak, Tang and Wiesenfeld were trying to
explain why many natural and engineered systems exhibit $1/f$
noise, which is also known as ``flicker noise'' and ``pink noise''.

\index{pink noise}
\index{flicker noise}
\index{$1/f$ noise}
\index{signal}
\index{power spectrum}
\index{noise}

To understand pink noise, we have to take a detour to understand
signals, power spectrums, and noise.

\begin{description}

\item[Signal:] A {\bf signal} is any quantity that varies in time.
One example is sound, which is variation in air density. In the sand
pile model, the signals we'll consider are avalanche durations and
sizes as they vary over time.

\item[Power spectrum:] Any signal can be decomposed into a set of
frequency components with different levels of {\bf power}, which is
related to amplitude or volume. The {\bf power spectrum} of a signal
is a function that shows the power of each frequency component.

\item[Noise:] In common use, {\bf noise} is usually an unwanted sound,
but in the context of signal processing, it is a signal that
contains many frequency components.

\end{description}

There are many kinds of noise.
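Before looking at specific kinds, it may help to make ``power spectrum'' concrete. Here is a minimal sketch (plain NumPy, not part of the book's code; the sampling rate and frequency are made up) showing that the spectrum of a pure sine wave is concentrated at the sine's frequency:

```python
import numpy as np

fs = 1000                       # samples per second (arbitrary)
t = np.arange(0, 1.0, 1/fs)     # one second of samples
f0 = 50                         # frequency of the sine, in Hz
signal = np.sin(2 * np.pi * f0 * t)

# The power of each frequency component is the squared magnitude
# of the corresponding Fourier coefficient.
spectrum = np.abs(np.fft.rfft(signal))**2
freqs = np.fft.rfftfreq(len(signal), d=1/fs)

# The spectrum peaks at the bin corresponding to 50 Hz.
peak_freq = freqs[np.argmax(spectrum)]
print(peak_freq)
```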
For example, ``white noise'' is a
signal that has components with equal power over a wide range of
frequencies.

\index{white noise}
\index{frequency}
\index{power}
\index{red noise}

Other kinds of noise have different relationships between frequency
and power. In ``red noise'', the power at frequency $f$ is
$1/f^2$, which we can write like this:
%
\[ P(f) = 1/f^2 \]
%
We can generalize this equation by replacing the exponent $2$ with
a parameter $\beta$:
%
\[ P(f) = 1/f^\beta \]
%
When $\beta=0$, this equation describes white noise; when $\beta=2$ it
describes red noise. When the parameter is near 1, the result is called
$1/f$ noise. More generally, noise with any value between 0 and 2
is called ``pink'', because it's between white and red.

\index{logarithm}

We can use this relationship to derive a test for pink noise.
Taking the log of both sides yields
%
\[ \log P(f) = -\beta \log f \]
%
So if we plot $P(f)$ versus $f$ on a log-log scale, we
expect a straight line with slope $-\beta$.

What does this have to do with the sand pile model? Suppose that every
time a cell topples, it makes a sound. If we record a sand pile model
while it's running, what would it sound like? In the next section,
we'll simulate the sound of the sand pile model and see if it is pink
noise.

\index{sand pile model}

\section{The sound of sand}

As my implementation of \py{SandPile} runs, it records the number of
cells that topple during each time step, accumulating the results in
a list called \py{toppled_seq}.
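Before analyzing the sand pile's signal, we can try the slope test on synthetic data. As a sketch (not part of the book's code), Brownian or ``red'' noise can be generated by cumulatively summing white noise; a line fit to its power spectrum on a log-log scale should have a slope near $-2$:

```python
import numpy as np
from scipy.signal import welch
from scipy.stats import linregress

rng = np.random.default_rng(17)
white = rng.normal(size=2**16)
red = np.cumsum(white)        # Brownian (red) noise: P(f) ~ 1/f^2

freqs, powers = welch(red, nperseg=1024)

# Fit log P(f) vs log f at low frequencies; the slope estimates -beta.
mask = (freqs > 0) & (freqs < 0.1)
slope = linregress(np.log(freqs[mask]), np.log(powers[mask]))[0]
print(slope)                  # expected to be near -2
```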
After running the model
in Section~\ref{heavysand}, we can extract the resulting signal:

\begin{code}
signal = pile2.toppled_seq
\end{code}

To compute the power spectrum of this signal, we can use the SciPy
function \py{welch}:

\begin{code}
from scipy.signal import welch

nperseg = 2048
freqs, spectrum = welch(signal, nperseg=nperseg, fs=nperseg)
\end{code}

\index{SciPy}
\index{Welch's method}

This function uses Welch's method, which splits the signal into
segments and computes the power spectrum of each segment. The result
is typically noisy, so Welch's method averages across segments to
estimate the average power at each frequency. For more about Welch's
method, see \url{https://thinkcomplex.com/welch}.

The parameter \py{nperseg} specifies the number of time steps per
segment. With longer segments, we can estimate the power for more
frequencies. With shorter segments, we get better estimates for each
frequency. The value I chose, 2048, balances these tradeoffs.

\index{sampling frequency}

The parameter \py{fs} is the ``sampling frequency'', which is the
number of data points in the signal per unit of time. By setting
\py{fs=nperseg}, we get a range of frequencies from 0 to
\py{nperseg/2}. This range is convenient, but because the units of
time in the model are arbitrary, it doesn't mean much.

The return values, \py{freqs} and \py{spectrum}, are NumPy arrays
containing the frequencies of the components and their corresponding
powers, which we can plot.
Figure~\ref{chap08-6} shows the result.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap08-6.pdf}}
\caption{Power spectrum of the number of toppled cells over time, log-log scale.}
\label{chap08-6}
\end{figure}

For frequencies between
10 and 1000 (in arbitrary units), the spectrum falls on a straight
line, which is what we expect for pink or red noise.

The gray line in the figure has slope $-1.58$, which indicates that
%
\[ \log P(f) \sim -\beta \log f \]
%
with parameter $\beta=1.58$, which is the same parameter reported by Bak,
Tang, and Wiesenfeld. This result confirms that the sand pile model
generates pink noise.


\section{Reductionism and Holism}
\label{model2}

The original paper by Bak, Tang and Wiesenfeld is one of
the most frequently cited papers of the last few decades.
Some subsequent papers have reported other systems that are apparently
self-organized critical (SOC). Others have studied the sand pile model
in more detail.

\index{sand pile model}

As it turns out, the sand pile model is not a good model of a
sand pile. Sand is dense and not very sticky, so momentum has a
non-negligible effect on the behavior of avalanches. As a result,
there are fewer very large and very small avalanches than the model
predicts, and the distribution might not be heavy-tailed.

Bak has suggested that this observation misses the point.
The sand pile model is not meant to be a realistic model of a sand
pile; it is meant to be a simple example of a broad category of
models.

To understand this point, it is useful to think about two
kinds of models, {\bf reductionist} and {\bf holistic}. A
reductionist model describes a system by describing its parts
and their interactions.
When a reductionist model is used
as an explanation, it depends on an analogy between the
components of the model and the components of the system.

\index{reductionism}
\index{holism}
\index{ideal gas law}

For example, to explain why the ideal gas law holds, we can model the
molecules that make up a gas as point masses and model their
interactions as elastic collisions. If you simulate or analyze this
model, you find that it obeys the ideal gas law. This model is
satisfactory to the degree that molecules in a gas behave like
molecules in the model. The analogy is between the parts of the
system and the parts of the model.

\index{analogy}

\begin{figure}
\centerline{\includegraphics[width=5in]{figs/model2.pdf}}
\caption{The logical structure of a holistic model.\label{fig.model2}}
\end{figure}

Holistic models are more focused on similarities between systems and
less interested in analogous parts. A holistic approach to modeling
consists of these steps:

\index{holistic model}

\begin{itemize}

\item Observe a behavior that appears in a variety of systems.

\item Find a simple model that demonstrates that behavior.

\item Identify the elements of the model that are necessary and
sufficient to produce the behavior.

\end{itemize}

For example, in {\em The Selfish Gene}, Richard Dawkins suggests that
genetic evolution is just one example of an evolutionary system.
He
identifies the essential elements of the category --- discrete
replicators, variability, and differential reproduction --- and proposes
that any system with these elements will show evidence of evolution.

\index{Selfish Gene@{\em The Selfish Gene}}
\index{Dawkins, Richard}
\index{evolution}

As another example of an evolutionary system, he proposes ``memes'',
which are thoughts or behaviors that are replicated by transmission
from person to person\footnote{This use of ``meme'' is original to
Dawkins, and predates the distantly-related use of the word on the
Internet by about 20 years.}. As memes compete for the resource of
human attention, they evolve in ways that are similar to genetic
evolution.

\index{meme}
\index{replicator}

Critics of the meme model have pointed out that
memes are a poor analogy for genes; they differ from genes in many
obvious ways. Dawkins has argued that these differences are
beside the point because memes are not {\em supposed} to be analogous
to genes. Rather, memes and genes are examples of the same
category: evolutionary systems. The differences between them
emphasize the real point, which is that evolution is a general model
that applies to many seemingly disparate systems. The logical
structure of this argument is shown in Figure~\ref{fig.model2}.

\index{gene}
\index{genetics}
\index{self-organized criticality}

Bak has made a similar argument that self-organized criticality is a
general model for a broad category of systems:

\begin{quote}
Since these phenomena appear everywhere, they cannot depend on any
specific detail whatsoever...
If the physics of a large class of
problems is the same, this gives [the theorist] the option of selecting
the {\em simplest} possible [model] belonging to that class for detailed
study.\footnote{Bak, {\em How Nature Works}, Springer-Verlag 1996, page 43.}
\end{quote}

\index{Bak, Per}

Many natural systems demonstrate behaviors characteristic of critical
systems. Bak's explanation for this prevalence is that these systems
are examples of the broad category of self-organized criticality.
There are two ways to support this argument. One is to build
a realistic model of a particular system and show that the model
exhibits SOC. The second is to show that SOC is a feature of many
diverse models, and to identify the essential characteristics
those models have in common.

\index{SOC}

The first approach, which I characterize as reductionist, can explain
the behavior of a particular system. The second approach, which I am
calling holistic, can explain the prevalence of criticality in
natural systems. They are different models with different purposes.

\index{prevalence}

For reductionist models, realism is the primary virtue, and simplicity
is secondary. For holistic models, it is the other way around.


\section{SOC, causation, and prediction}

If a stock market index drops by a fraction of a percent in a
day, there is no need for an explanation. But if it drops 10\%,
people want to know why.
Pundits
on television are willing to offer explanations, but the real
answer may be that there is no explanation.

\index{stock market}

Day-to-day variability in the stock market shows evidence of
criticality: the distribution of value changes is heavy-tailed
and the time series exhibits pink noise.
If the stock market is a critical system, we
should expect occasional large changes as part of the ordinary
behavior of the market.

The distribution of earthquake sizes is also heavy-tailed,
and there are simple models of the dynamics of geological faults
that might explain this behavior. If these models are right,
they imply that large earthquakes are not exceptional; that is,
they do not require explanation any more than small earthquakes do.

\index{earthquake}
\index{prediction}
\index{causation}

Similarly, Charles Perrow has suggested that failures in large
engineered systems, like nuclear power plants, are like avalanches
in the sand pile model. Most failures are small, isolated, and
harmless, but occasionally a coincidence of bad fortune yields a
catastrophe. When big accidents occur, investigators go looking for
the cause, but if Perrow's ``normal accident theory'' is correct,
there may be no special cause of large failures.

\index{normal accident theory}
\index{Perrow, Charles}

These conclusions are not comforting. Among other things, they
imply that large earthquakes and some kinds of accidents are
fundamentally unpredictable. It is impossible to look at the
state of a critical system and say whether a large avalanche
is ``due''. If the system is in a critical state, then a large
avalanche is always possible.
It just depends on the
next grain of sand.

In a sand pile model, what is the cause of a large avalanche?
Philosophers sometimes distinguish the {\bf proximate} cause, which is
most immediately responsible, from the {\bf ultimate} cause, which is
considered a deeper kind of explanation (see
\url{https://thinkcomplex.com/cause}).

\index{proximate cause}
\index{ultimate cause}

In the sand pile model, the proximate cause of an avalanche is
a grain of sand, but the grain that causes a large avalanche
is identical to every other grain, so it offers no special explanation.
The ultimate cause of a large avalanche is the structure and
dynamics of the system as a whole: large avalanches occur because
they are a property of the system.

\index{avalanche}

Many social phenomena, including wars, revolutions, epidemics,
inventions, and terrorist attacks, are characterized by heavy-tailed
distributions. If these distributions are prevalent because social
systems are SOC, major historical events may be fundamentally
unpredictable and unexplainable.

\index{heavy-tailed distribution}


\section{Exercises}

The code for this chapter is in the Jupyter notebook {\tt chap08.ipynb}
in the repository for this book. Open this notebook, read the code,
and run the cells. You can use this notebook to work on the
following exercises. My solutions are in {\tt chap08soln.ipynb}.

\begin{exercise}

To test whether the distributions of \py{T} and \py{S} are
heavy-tailed, we plotted their PMFs on a log-log scale, which is
what Bak, Tang and Wiesenfeld show in their paper. But as we saw in
Section~\ref{cdf}, this visualization can obscure the shape of the
distribution. Using the same data, make a plot that shows the
cumulative distributions (CDFs) of \py{S} and \py{T}. What can you
say about their shape? Do they follow a power law?
Are they heavy-tailed?

\index{CDF}
\index{cumulative distribution function}
\index{log-log scale}

You might find it helpful to plot the CDFs on a log-x scale and on a
log-log scale.

\end{exercise}

\begin{exercise}

In Section~\ref{sandfrac} we showed that the initial configuration of
the sand pile model produces fractal patterns. But after we drop a
large number of random grains, the patterns look more random.

\index{fractal}

Starting with the example in Section~\ref{sandfrac}, run the sand pile
model for a while and then compute fractal dimensions for each of the
4 levels. Is the sand pile model fractal in steady state?

\end{exercise}

\begin{exercise}

Another version of the sand pile model, called the ``single source''
model, starts from a different initial condition: instead of all cells
at the same level, all cells are set to 0 except the center cell,
which is set to a large value. Write a function that creates a
\py{SandPile} object, sets up the single source initial condition, and
runs until the pile reaches equilibrium. Does the result appear to be
fractal?

\index{single source sand pile}

You can read more about this version of the sand pile model at
\url{https://thinkcomplex.com/sand}.

\end{exercise}


\begin{exercise}

In their 1989 paper, Bak, Chen and Creutz suggest that the Game of
Life is a self-organized critical system (see
\url{https://thinkcomplex.com/bak89}).

\index{Game of Life}
\index{self-organized criticality}
\index{SOC}

To replicate their tests, start with a random configuration and run
the GoL CA until it stabilizes. Then choose a random cell and flip it.
Run the CA until
it stabilizes again, keeping track of \py{T}, the number
of time steps it takes, and \py{S}, the number of cells affected.
Repeat for a large number of trials and plot the distributions
of \py{T} and \py{S}. Also, estimate the power spectrums of \py{T}
and \py{S} as signals in time, and see if they are consistent with
pink noise.

%TODO: Do this exercise (project idea).

\end{exercise}


\begin{exercise}

In {\it The Fractal Geometry of Nature}, Benoit Mandelbrot proposes
what he calls a ``heretical'' explanation for the prevalence of
heavy-tailed distributions in natural systems. It may not
be, as Bak suggests, that many systems can generate this behavior in
isolation. Instead there may be only a few, but interactions between
systems might cause the behavior to propagate.

\index{Mandelbrot, Benoit}
\index{Fractal Geometry of Nature@{\it The Fractal Geometry of Nature}}
\index{heavy-tailed distribution}

To support this argument, Mandelbrot points out:

\begin{itemize}

\item The distribution of observed data is often ``the joint
effect of a fixed underlying {\em true distribution} and a highly
variable {\em filter}''.

\item Heavy-tailed distributions are robust to filtering; that is,
``a wide variety of filters leave their asymptotic behavior
unchanged''.

\end{itemize}

What do you think of this argument? Would you characterize
it as reductionist or holist?

\index{reductionism}
\index{holism}

\end{exercise}


\begin{exercise}

Read about the ``Great Man'' theory of history at
\url{https://thinkcomplex.com/great}.
What implication
does self-organized criticality have for this theory?

\index{Great Man theory}
\end{exercise}



\chapter{Agent-based models}
\label{agent-based}

The models we have seen so far might be characterized as ``rule-based''
in the sense that they involve systems governed by simple rules. In this
and the following chapters, we explore {\bf agent-based models}.

\index{agent-based model}

Agent-based models include {\bf agents} that are
intended to model people and other entities that gather
information about the world, make decisions, and take actions.

The agents are usually situated in space or in a network, and
interact with each other locally. They usually have imperfect or
incomplete information about the world.

Often there are differences among agents, unlike previous models where
all components are identical. And agent-based models often include
randomness, either among the agents or in the world.

Since the 1970s, agent-based modeling has become an important tool in
economics, other social sciences, and some natural sciences.

Agent-based models are useful for modeling the dynamics of systems
that are not in equilibrium (although they are also used to study
equilibrium). And they are particularly useful for understanding
relationships between individual decisions and system behavior.

The code for this chapter is in \py{chap09.ipynb}, which is a
Jupyter notebook in the repository for this book. For more information
about working with this code, see Section~\ref{code}.


\section{Schelling's Model}

In 1969 Thomas Schelling published ``Models of Segregation'',
which proposed a simple model of racial segregation. You can read it
at \url{https://thinkcomplex.com/schell}.

The Schelling model of the world is a grid where each cell represents
a house.
The houses are occupied by two kinds of agents,
labeled red and blue, in roughly equal numbers. About 10\% of the
houses are empty.

\index{grid}
\index{Schelling, Thomas}

At any point in time, an agent might be happy or unhappy, depending
on the other agents in the neighborhood, where the
``neighborhood'' of each house is the set of eight adjacent cells.
In one version of the model, agents are happy if they have at least
two neighbors like themselves, and unhappy if they have one or zero.

\index{agent-based model}
\index{neighborhood}

The simulation proceeds by choosing an agent at random and checking
to see whether they are happy. If so, nothing happens; if not,
the agent chooses one of the unoccupied cells at random and moves.

You will not be surprised to hear that this model leads to some
segregation, but you might be surprised by the degree. From a random
starting point, clusters of similar agents form almost immediately.
The clusters grow and coalesce over time until there are a small
number of large clusters and most agents live in homogeneous
neighborhoods.

\index{cluster}
\index{segregation}

If you did not know the process and only saw the result, you might
assume that the agents were racist, but in fact all of them
would be perfectly happy in a mixed neighborhood. Since they prefer
not to be greatly outnumbered, they might be considered mildly
xenophobic. Of course, these agents are a wild simplification of
real people, so it may not be appropriate to apply these descriptions
at all.

\index{racism}
\index{xenophobia}

Racism is a complex human problem; it is hard to imagine that such a
simple model could shed light on it.
But in fact it provides a strong
argument about the relationship between a system and its parts: if you
observe segregation in a real city, you cannot conclude that
individual racism is the immediate cause, or even that the people in
the city are racists.

\index{causation}

Of course, we have to keep in mind the limitations of this argument:
Schelling's model demonstrates a possible cause of segregation, but
says nothing about actual causes.


\section{Implementation of Schelling's model}

To implement Schelling's model, I wrote yet another class that
inherits from \py{Cell2D}:

\begin{code}
class Schelling(Cell2D):

    def __init__(self, n, p):
        self.p = p
        choices = [0, 1, 2]
        probs = [0.1, 0.45, 0.45]
        self.array = np.random.choice(choices, (n, n), p=probs)
\end{code}

\py{n} is the size of the grid,
and \py{p} is the threshold on the fraction of similar neighbors.
For example, if \py{p=0.3}, an agent will be unhappy if fewer
than 30\% of their neighbors are the same color.

\index{NumPy}
\index{random}
\index{choice}

\py{array} is a NumPy array where each cell is 0 if empty, 1 if
occupied by a red agent, and 2 if occupied by a blue agent.
Initially 10\% of the cells are empty, 45\% red, and 45\% blue.

The \py{step} function for Schelling's model is substantially more
complicated than previous examples.
If you are not
interested in the details, you can skip to the next section.
But if you stick around, you might pick up some NumPy tips.

\index{boolean array}

First, I make boolean arrays that indicate which cells are red, blue,
and empty:

\begin{code}
a = self.array
red = a==1
blue = a==2
empty = a==0
\end{code}

Then I use \py{correlate2d} to count, for each location,
the number of neighboring cells that are red, blue, and non-empty.
We saw \py{correlate2d} in Section~\ref{implife}.

\index{SciPy}
\index{correlate2d}

\begin{code}
options = dict(mode='same', boundary='wrap')

kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=np.int8)

num_red = correlate2d(red, kernel, **options)
num_blue = correlate2d(blue, kernel, **options)
num_neighbors = num_red + num_blue
\end{code}

\py{options} is a dictionary that contains the options we pass to
\py{correlate2d}. With \py{mode='same'}, the result is the same size
as the input. With \py{boundary='wrap'}, the top edge is wrapped to
meet the bottom, and the left edge is wrapped to meet the right.

\py{kernel} indicates that we want to consider the eight neighbors
that surround each cell.

After computing \py{num_red} and \py{num_blue}, we can compute the
fraction of neighbors, for each location, that are red and blue:

\begin{code}
frac_red = num_red / num_neighbors
frac_blue = num_blue / num_neighbors
\end{code}

Then, we can compute the fraction of neighbors, for each agent, that
are the same color as the agent. I use \py{np.where}, which is like
an element-wise \py{if} expression.
The first parameter is a condition that selects elements from
the second or third parameter.

\index{NumPy}
\index{where}

\begin{code}
frac_same = np.where(red, frac_red, frac_blue)
frac_same[empty] = np.nan
\end{code}

In this case, wherever \py{red} is \py{True}, \py{frac_same} gets
the corresponding element of \py{frac_red}. Where \py{red} is
\py{False}, \py{frac_same} gets the corresponding element of
\py{frac_blue}.
Finally, where \py{empty} indicates that a cell is empty,
\py{frac_same} is set to \py{np.nan}, which is a special value that
indicates ``Not a Number''.

\index{NaN}
\index{Not a Number}

Now we can identify the locations of the unhappy agents:

\begin{code}
unhappy = frac_same < self.p
unhappy_locs = locs_where(unhappy)
\end{code}

\py{locs_where} is a wrapper function for \py{np.nonzero}:

\begin{code}
def locs_where(condition):
    return list(zip(*np.nonzero(condition)))
\end{code}

\py{np.nonzero} takes an array and returns the coordinates of
all non-zero cells; the result is a tuple of arrays, one for
each dimension. Then \py{locs_where} uses \py{list} and \py{zip} to
convert this result to a list of coordinate pairs.

\index{nonzero}
\index{zip}

Similarly, \py{empty_locs} is a list that contains the coordinates
of the empty cells:

\begin{code}
empty_locs = locs_where(empty)
\end{code}

Now we get to the core of the simulation.
We loop through the unhappy agents and move them:

\begin{code}
num_empty = np.sum(empty)
for source in unhappy_locs:
    i = np.random.randint(num_empty)
    dest = empty_locs[i]

    a[dest] = a[source]
    a[source] = 0
    empty_locs[i] = source
\end{code}

\index{NumPy}
\index{random}
\index{randint}

\py{i} is the index of a random empty cell; \py{dest} is a tuple containing the coordinates of the empty cell.

\index{tuple}

In order to move an agent, we copy its value (1 or 2) from \py{source} to \py{dest}, and then set the value of \py{source} to 0 (since it is now empty).

Finally, we replace the entry in \py{empty_locs} with \py{source}, so the cell that just became empty can be chosen by the next agent.


\section{Segregation}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap09-1.pdf}}
\caption{Schelling's segregation model with \py{n=100}, initial
condition (left), after 2 steps (middle), and after 10 steps (right).}
\label{chap09-1}
\end{figure}

%TODO: Another caption with \py in it.

Now let's see what happens when we run the model.
I'll start with \py{n=100} and \py{p=0.3}, and run for 10 steps.

\begin{code}
grid = Schelling(n=100, p=0.3)
for i in range(10):
    grid.step()
\end{code}

Figure~\ref{chap09-1} shows the initial configuration (left), the state of the simulation after 2 steps (middle), and the state after 10 steps (right).

\index{segregation}
\index{cluster}

Clusters form almost immediately and grow quickly, until most agents live in highly segregated neighborhoods.

As the simulation runs, we can compute the degree of segregation, which is the average, across agents, of the fraction of neighbors who are the same color as the agent:

\begin{code}
np.nanmean(frac_same)
\end{code}

\index{NumPy}
\index{nanmean}

In Figure~\ref{chap09-1}, the average fraction of similar neighbors is 50\% in the initial configuration, 65\% after two steps, and 76\% after 10 steps!

Remember that when \py{p=0.3} the agents would be happy if 3 of 8 neighbors were their own color, but they end up living in neighborhoods where 6 or 7 of their neighbors are their own color, typically.


\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap09-2.pdf}}
\caption{Degree of segregation in Schelling's model, over time,
for a range of \py{p}.}
\label{chap09-2}
\end{figure}

Figure~\ref{chap09-2} shows how the degree of segregation increases and where it levels off for several values of \py{p}.
When \py{p=0.4}, the degree of segregation in steady state is about 82\%, and a majority of agents have no neighbors with a different color.

These results are surprising to many people, and they make a striking example of the unpredictable relationship between individual decisions and system behavior.


\section{Sugarscape}

In 1996 Joshua Epstein and Robert Axtell proposed Sugarscape, an agent-based model of an ``artificial society'' intended to support experiments related to economics and other social sciences.

\index{Epstein, Joshua}
\index{Axtell, Robert}
\index{Sugarscape}
\index{artificial society}

Sugarscape is a versatile model that has been adapted for a wide variety of topics. As examples, I will replicate the first few experiments from Epstein and Axtell's book, {\it Growing Artificial Societies}.

\index{grid}

In its simplest form, Sugarscape is a model of a simple economy where agents move around on a 2-D grid, harvesting and accumulating ``sugar'', which represents economic wealth.  Some parts of the grid produce more sugar than others, and some agents are better at finding it than others.

\index{wealth}
\index{inequality}

This version of Sugarscape is often used to explore and explain the distribution of wealth, in particular the tendency toward inequality.

In the Sugarscape grid, each cell has a capacity, which is the maximum amount of sugar it can hold.
In the original configuration, there are two high-sugar regions, with capacity 4, surrounded by concentric rings with capacities 3, 2, and 1.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap09-3.pdf}}
\caption{Replication of the original Sugarscape model: initial
configuration (left), after 2 steps (middle) and after 100 steps (right).}
\label{chap09-3}
\end{figure}

Figure~\ref{chap09-3} (left) shows the initial configuration, with the darker areas indicating cells with higher capacity, and small dots representing the agents.

\index{agent}

Initially there are 400 agents placed at random locations.  Each agent has three randomly-chosen attributes:

\begin{description}

\item[Sugar:] Each agent starts with an endowment of sugar chosen from a uniform distribution between 5 and 25 units.

\item[Metabolism:] Each agent has some amount of sugar they must consume per time step, chosen uniformly between 1 and 4.

\item[Vision:] Each agent can ``see'' the amount of sugar in nearby cells and move to the cell with the most, but some agents can see and move farther than others.  The distance agents see is chosen uniformly between 1 and 6.

\end{description}

During each time step, agents move one at a time in a random order. Each agent follows these rules:

\begin{itemize}

\item The agent surveys \py{k} cells in each of the 4 compass directions, where \py{k} is the range of the agent's vision.

\item It chooses the unoccupied cell with the most sugar.  In case of a tie, it chooses the closer cell; among cells at the same distance, it chooses randomly.

\item The agent moves to the selected cell and harvests the sugar, adding the harvest to its accumulated wealth and leaving the cell empty.

\item The agent consumes some part of its wealth, depending on its metabolism.
If the resulting total is negative, the agent ``starves'' and is removed.

\end{itemize}

After all agents have executed these steps, the cells grow back some sugar, typically 1 unit, but the total sugar in each cell is bounded by its capacity.

Figure~\ref{chap09-3} (middle) shows the state of the model after two steps.  Most agents are moving toward the areas with the most sugar. Agents with high vision move the fastest; agents with low vision tend to get stuck on the plateaus, wandering randomly until they get close enough to see the next level.

Agents born in the areas with the least sugar are likely to starve unless they have a high initial endowment and high vision.

Within the high-sugar areas, agents compete with each other to find and harvest sugar as it grows back.  Agents with high metabolism or low vision are the most likely to starve.

When sugar grows back at 1 unit per time step, there is not enough sugar to sustain the 400 agents we started with.  The population drops quickly at first, then more slowly, and levels off around 250.

Figure~\ref{chap09-3} (right) shows the state of the model after 100 time steps, with about 250 agents.  The agents who survive tend to be the lucky ones, born with high vision and/or low metabolism. Having survived to this point, they are likely to survive forever, accumulating unbounded stockpiles of sugar.


\section{Wealth inequality}

In its current form, Sugarscape models a simple ecology, and could be used to explore the relationship between the parameters of the model, like the growth rate and the attributes of the agents, and the carrying capacity of the system (the number of agents that survive in steady state).
And it models a form of natural selection, where agents with higher ``fitness'' are more likely to survive.

\index{fitness}

The model also demonstrates a kind of wealth inequality, with some agents accumulating sugar faster than others.  But it would be hard to say anything specific about the distribution of wealth because it is not ``stationary''; that is, the distribution changes over time and does not reach a steady state.

\index{stationary}

However, if we give the agents finite lifespans, the model produces a stationary distribution of wealth.  Then we can run experiments to see what effect the parameters and rules have on this distribution.

\index{distribution}

In this version of the model, agents have an age that gets incremented each time step, and a random lifespan chosen from a uniform distribution between 60 and 100.  If an agent's age exceeds its lifespan, it dies.

When an agent dies, from starvation or old age, it is replaced by a new agent with random attributes, so the number of agents is constant.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap09-4.pdf}}
\caption{Distribution of sugar (wealth) after 100, 200, 300, and
400 steps (gray lines) and 500 steps (dark line).  Linear scale (left)
and log-x scale (right).}
\label{chap09-4}
\end{figure}

Starting with 250 agents (which is close to carrying capacity) I run the model for 500 steps.  After each 100 steps, I plot the cumulative distribution function (CDF) of sugar accumulated by the agents.  We saw CDFs in Section~\ref{cdf}.  Figure~\ref{chap09-4} shows the results on a linear scale (left) and a log-x scale (right).

\index{CDF}
\index{cumulative distribution function}

After about 200 steps (which is twice the longest lifespan) the distribution doesn't change much.
And it is skewed to the right.

Most agents have little accumulated wealth: the 25th percentile is about 10 and the median is about 20.  But a few agents have accumulated much more: the 75th percentile is about 40, and the highest value is more than 150.

On a log scale the shape of the distribution resembles a Gaussian or normal distribution, although the right tail is truncated.  If it were actually normal on a log scale, the distribution would be lognormal, which is a heavy-tailed distribution.  And in fact, the distribution of wealth in practically every country, and in the world, is a heavy-tailed distribution.

\index{Gaussian distribution}
\index{normal distribution}
\index{lognormal distribution}
\index{heavy-tailed distribution}

It would be too much to claim that Sugarscape explains why wealth distributions are heavy-tailed, but the prevalence of inequality in variations of Sugarscape suggests that inequality is characteristic of many economies, even very simple ones.  And experiments with rules that model taxation and other income transfers suggest that it is not easy to avoid or mitigate.


\section{Implementing Sugarscape}

Sugarscape is more complicated than the previous models, so I won't present the entire implementation here.  I will outline the structure of the code and you can see the details in the Jupyter notebook for this chapter, {\tt chap09.ipynb}, which is in the repository for this book.
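As a taste of those details, here is one way the randomly-chosen attributes from the previous section might be initialized (a sketch under my own assumptions; the notebook's actual code may differ, for example in whether draws are integer or continuous):

\begin{code}
import numpy as np

def random_attrs(rng=np.random):
    """Draw one agent's attributes from the ranges described above."""
    sugar = rng.uniform(5, 25)        # initial endowment
    metabolism = rng.uniform(1, 4)    # sugar consumed per time step
    vision = rng.randint(1, 7)        # how far the agent can see (1..6)
    return sugar, metabolism, vision

attrs = [random_attrs() for _ in range(400)]
\end{code}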
If you are not interested in the details, you can skip this section.

During each step, the agent moves, harvests sugar, and ages. Here is the \py{Agent} class and its \py{step} method:

\begin{code}
class Agent:

    def step(self, env):
        self.loc = env.look_and_move(self.loc, self.vision)
        self.sugar += env.harvest(self.loc) - self.metabolism
        self.age += 1
\end{code}

\index{Sugarscape}

The parameter \py{env} is a reference to the environment, which is a \py{Sugarscape} object.  It provides methods \py{look_and_move} and \py{harvest}:

\begin{itemize}

\item \py{look_and_move} takes the location of the agent, which is a tuple of coordinates, and the range of the agent's vision, which is an integer.  It returns the agent's new location, which is the visible cell with the most sugar.

\item \py{harvest} takes the (new) location of the agent, and removes and returns the sugar at that location.

\end{itemize}

\py{Sugarscape} inherits from \py{Cell2D}, so it is similar to the other grid-based models we've seen.

The attributes include \py{agents}, which is a list of \py{Agent} objects, and \py{occupied}, which is a set of tuples, where each tuple contains the coordinates of a cell occupied by an agent.

\index{Cell2D}

Here is the \py{Sugarscape} class and its \py{step} method:

\begin{code}
class Sugarscape(Cell2D):

    def step(self):

        # loop through the agents in random order
        random_order = np.random.permutation(self.agents)
        for agent in random_order:

            # mark the current cell unoccupied
            self.occupied.remove(agent.loc)

            # execute one step
            agent.step(self)

            # if the agent is dead, remove from the list
            if agent.is_starving():
                self.agents.remove(agent)
            else:
                # otherwise mark its cell occupied
                self.occupied.add(agent.loc)

        # grow back some sugar
        self.grow()
        return len(self.agents)
\end{code}

\index{NumPy}
\index{random}
\index{permutation}

During each step, the \py{Sugarscape} uses the NumPy function \py{permutation} to loop through the agents in random order.  It invokes \py{step} on each agent and then checks whether it is dead.  After all agents have moved, some of the sugar grows back. The return value is the number of agents still alive.

I won't show more details here; you can see them in the notebook for this chapter.  If you want to learn more about NumPy, you might want to look at these functions in particular:

\begin{itemize}

\item \py{make_visible_locs}, which builds the array of locations an agent can see, depending on its vision.  The locations are sorted by distance, with locations at the same distance appearing in random order.  This function uses \py{np.random.shuffle} and \py{np.vstack}.

\index{NumPy}
\index{random}
\index{shuffle}

\item \py{make_capacity}, which initializes the capacity of the cells using the NumPy functions \py{indices}, \py{hypot}, \py{minimum}, and \py{digitize}.

\item \py{look_and_move}, which uses \py{argmax}.

\end{itemize}

\index{shuffle}
\index{vstack}
\index{indices}
\index{hypot}
\index{digitize}
\index{minimum}
\index{argmax}


\section{Migration and Wave Behavior}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap09-5.pdf}}
\caption{Wave behavior in Sugarscape: initial configuration (left),
after 6 steps (middle) and after 12 steps (right).}
\label{chap09-5}
\end{figure}

Although the purpose of Sugarscape is not primarily to explore the movement of agents in space, Epstein and Axtell observed some interesting patterns when agents migrate.

\index{migration}

If we start with all agents in the lower-left corner, they quickly move toward the closest ``peak'' of high-capacity cells.
But if there are more agents than a single peak can support, they quickly exhaust the sugar and agents are forced to move into lower-capacity areas.

The ones with the longest vision cross the valley between the peaks and propagate toward the northeast in a pattern that resembles a wave front.  Because they leave a stripe of empty cells behind them, other agents don't follow until the sugar grows back.

\index{wave}
\index{spaceship}
\index{Rule 110}

The result is a series of discrete waves of migration, where each wave resembles a coherent object, like the spaceships we saw in the Rule 110 CA and Game of Life (see Section~\ref{spaceships} and Section~\ref{lifepatterns}).

Figure~\ref{chap09-5} shows the initial condition (left) and the state of the model after 6 steps (middle) and 12 steps (right). You can see the first two waves reaching and moving through the second peak, leaving a stripe of empty cells behind.  You can see an animated version of this model, where the wave patterns are more clearly visible, in the notebook for this chapter.

These waves move diagonally, which is surprising because the agents themselves only move north or east, never northeast.  Outcomes like this --- groups or ``aggregates'' with properties and behaviors that the agents don't have --- are common in agent-based models. We will see more examples in the next chapter.


\section{Emergence}

The examples in this chapter demonstrate one of the most important ideas in complexity science: emergence.  An {\bf emergent property} is a characteristic of a system that results from the interaction of its components, not from their properties.

\index{emergence}
\index{emergent property}

To clarify what emergence is, it helps to consider what it isn't.  For example, a brick wall is hard because bricks and mortar are hard, so that's not an emergent property.
As another example, some rigid structures are built from flexible components, so that seems like a kind of emergence.  But it is at best a weak kind, because structural properties follow from well-understood laws of mechanics.

\index{brick wall}

In contrast, the segregation we see in Schelling's model is an emergent property because it is not caused by racist agents.  Even when the agents are only mildly xenophobic, the outcome of the system is substantially different from the intent of the agents' decisions.

\index{Schelling's model}

The distribution of wealth in Sugarscape might be an emergent property, but it is a weak example because we could reasonably predict it based on the distributions of vision, metabolism, and lifespan.  The wave behavior we saw in the last example might be a stronger example, since the wave displays a capability --- diagonal movement --- that the agents do not have.

Emergent properties are surprising: it is hard to predict the behavior of the system even if we know all the rules.  That difficulty is not an accident; in fact, it may be the defining characteristic of emergence.

As Wolfram discusses in {\em A New Kind of Science}, conventional science is based on the axiom that if you know the rules that govern a system, you can predict its behavior.  What we call ``laws'' are often computational shortcuts that allow us to predict the outcome of a system without building or observing it.

\index{New Kind of Science@{\it A New Kind of Science}}
\index{Wolfram, Stephen}
\index{natural law}

But many cellular automatons are {\bf computationally irreducible}, which means that there are no shortcuts.  The only way to get the outcome is to implement the system.

\index{computationally irreducible}
\index{shortcut}

The same may be true of complex systems in general.
For physical systems with more than a few components, there is usually no model that yields an analytic solution.  Numerical methods provide a kind of computational shortcut, but there is still a qualitative difference.

\index{constant-time algorithm}

Analytic solutions often provide a constant-time algorithm for prediction; that is, the run time of the computation does not depend on $t$, the time scale of prediction.  But numerical methods, simulation, analog computation, and similar methods take time proportional to $t$.  And for many systems, there is a bound on $t$ beyond which we can't compute reliable predictions at all.

These observations suggest that emergent properties are fundamentally unpredictable, and that for complex systems we should not expect to find natural laws in the form of computational shortcuts.

To some people, ``emergence'' is another name for ignorance; by this reckoning, a property is emergent if we don't have a reductionist explanation for it, but if we come to understand it better in the future, it would no longer be emergent.

The status of emergent properties is a topic of debate, so it is appropriate to be skeptical.  When we see an apparently emergent property, we should not assume that there can never be a reductionist explanation.  But neither should we assume that there has to be one.

The examples in this book and the principle of computational equivalence give good reasons to believe that at least some emergent properties can never be ``explained'' by a classical reductionist model.

You can read more about emergence at \url{https://thinkcomplex.com/emerge}.


\section{Exercises}

The code for this chapter is in the Jupyter notebook {\tt chap09.ipynb} in the repository for this book.  Open this notebook, read the code, and run the cells.  You can use this notebook to work on the following exercises.
My solutions are in {\tt chap09soln.ipynb}.

\begin{exercise}

Bill Bishop, author of {\em The Big Sort}, argues that American society is increasingly segregated by political opinion, as people choose to live among like-minded neighbors.

\index{Big Sort@{\em The Big Sort}}
\index{Bishop, Bill}

The mechanism Bishop hypothesizes is not that people, like the agents in Schelling's model, are more likely to move if they are isolated, but that when they move for any reason, they are likely to choose a neighborhood with people like themselves.

\index{segregation}

Modify your implementation of Schelling's model to simulate this kind of behavior and see if it yields similar degrees of segregation.

There are several ways you can model Bishop's hypothesis.  In my implementation, a random selection of agents moves during each step. Each agent considers \py{k} randomly-chosen empty locations and chooses the one with the highest fraction of similar neighbors. How does the degree of segregation depend on \py{k}?

\end{exercise}


\begin{exercise}

In the first version of Sugarscape, we never add agents, so once the population falls, it never recovers.  In the second version, we only replace agents when they die, so the population is constant.  Now let's see what happens if we add some ``population pressure''.

\index{population pressure}

Write a version of Sugarscape that adds a new agent at the end of every step.  Add code to compute the average vision and the average metabolism of the agents at the end of each step.
Run the model for a few hundred steps and plot the population over time, as well as the average vision and average metabolism.

You should be able to implement this model by inheriting from \py{Sugarscape} and overriding \py{__init__} and \py{step}.

\end{exercise}


\begin{exercise}

Among people who study philosophy of mind, ``Strong AI'' is the theory that an appropriately-programmed computer could have a mind in the same sense that humans have minds.

\index{strong AI}
\index{Searle, John}
\index{Chinese Room argument}
\index{system reply}

John Searle presented a thought experiment called ``The Chinese Room'', intended to show that Strong AI is false.  You can read about it at \url{https://thinkcomplex.com/searle}.

What is the {\bf system reply} to the Chinese Room argument? How does what you have learned about emergence influence your reaction to the system reply?

\end{exercise}


\chapter{Herds, Flocks, and Traffic Jams}

The agent-based models in the previous chapter are based on grids: the agents occupy discrete locations in two-dimensional space.  In this chapter we consider agents that move in continuous space, including simulated cars on a one-dimensional highway and simulated birds in three-dimensional space.

\index{grid}
\index{space}
\index{dimension}

The code for this chapter is in \py{chap10.ipynb}, which is a Jupyter notebook in the repository for this book.  For more information about working with this code, see Section~\ref{code}.


\section{Traffic jams}

What causes traffic jams?  Sometimes there is an obvious cause, like an accident, a speed trap, or something else that disturbs the flow of traffic.
But other times traffic jams appear for no apparent reason.

\index{traffic jam}

Agent-based models can help explain spontaneous traffic jams. As an example, I implement a highway simulation based on a model in Mitchell Resnick's book, {\em Turtles, Termites and Traffic Jams}.

\index{Turtles, Termites and Traffic Jams@{\em Turtles, Termites and Traffic Jams}}
\index{Resnick, Mitchell}
\index{highway}

Here's the class that represents the ``highway'':

\begin{code}
class Highway:

    def __init__(self, n=10, length=1000, eps=0):
        self.length = length
        self.eps = eps

        # create the drivers
        locs = np.linspace(0, length, n, endpoint=False)
        self.drivers = [Driver(loc) for loc in locs]

        # and link them up
        for i in range(n):
            j = (i+1) % n
            self.drivers[i].next = self.drivers[j]
\end{code}

\py{n} is the number of cars, \py{length} is the length of the highway, and \py{eps} is the amount of random noise we'll add to the system.

\index{Highway}
\index{Driver}

\index{NumPy}
\index{linspace}

\py{locs} contains the locations of the drivers; the NumPy function \py{linspace} creates an array of \py{n} locations equally spaced between \py{0} and \py{length}.

The \py{drivers} attribute is a list of \py{Driver} objects.  The \py{for} loop links them so each \py{Driver} contains a reference to the next.  The highway is circular, so the last \py{Driver} contains a reference to the first.

During each time step, the \py{Highway} moves each of the drivers:

\begin{code}
# Highway

def step(self):
    for driver in self.drivers:
        self.move(driver)
\end{code}

The \py{move} method lets the \py{Driver} choose its acceleration.  Then \py{move} computes the updated speed and position.
Here's the implementation:

\begin{code}
# Highway

def move(self, driver):
    dist = self.distance(driver)

    # let the driver choose acceleration
    acc = driver.choose_acceleration(dist)
    acc = min(acc, self.max_acc)
    acc = max(acc, self.min_acc)
    speed = driver.speed + acc

    # add random noise to speed
    speed *= np.random.uniform(1-self.eps, 1+self.eps)

    # keep it nonnegative and under the speed limit
    speed = max(speed, 0)
    speed = min(speed, self.speed_limit)

    # if current speed would collide, stop
    if speed > dist:
        speed = 0

    # update speed and loc
    driver.speed = speed
    driver.loc += speed
\end{code}

\index{NumPy}
\index{random}
\index{uniform}

\py{dist} is the distance between \py{driver} and the next driver ahead. This distance is passed to \py{choose_acceleration}, which specifies the behavior of the driver.  This is the only decision the driver gets to make; everything else is determined by the ``physics'' of the simulation.

\begin{itemize}

\item \py{acc} is acceleration, which is bounded by \py{min_acc} and \py{max_acc}.  In my implementation, cars can accelerate with \py{max_acc=1} and brake with \py{min_acc=-10}.

\item \py{speed} is the old speed plus the requested acceleration, but then we make some adjustments.  First, we add random noise to \py{speed}, because the world is not perfect.
\py{eps} determines the magnitude of the relative error; for example, if \py{eps} is 0.02, \py{speed} is multiplied by a random factor between 0.98 and 1.02.

\item Speed is bounded between 0 and \py{speed_limit}, which is 40 in my implementation, so cars are not allowed to go backward or speed.

\item If the requested speed would cause a collision with the next car, \py{speed} is set to 0.

\item Finally, we update the \py{speed} and \py{loc} attributes of \py{driver}.

\end{itemize}

\index{collision}

Here's the definition for the \py{Driver} class:

\begin{code}
class Driver:

    def __init__(self, loc, speed=0):
        self.loc = loc
        self.speed = speed

    def choose_acceleration(self, dist):
        return 1
\end{code}

The attributes \py{loc} and \py{speed} are the location and speed of the driver.

This implementation of \py{choose_acceleration} is simple: it always accelerates at the maximum rate.

Since the cars start out equally spaced, we expect them all to accelerate until they reach the speed limit, or until their speed exceeds the space between them.  At that point, at least one ``collision'' will occur, causing some cars to stop.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap10-1.pdf}}
\caption{Simulation of drivers on a circular highway at three points in time.  Squares indicate the positions of the drivers; triangles indicate places where one driver has to brake to avoid another.}
\label{chap10-1}
\end{figure}

Figure~\ref{chap10-1} shows a few steps in this process, starting with 30 cars and \py{eps=0.02}.
On the left is the configuration after 16 time steps, with the highway mapped to a circle. Because of random noise, some cars are going faster than others, and the spacing has become uneven.

During the next time step (middle) there are two collisions, indicated by the triangles.

During the next time step (right) two cars collide with the stopped cars, and we can see the initial formation of a traffic jam.  Once a jam forms, it tends to persist, with additional cars approaching from behind and colliding, and with cars in the front accelerating away.

Under some conditions, the jam itself propagates backwards, as you can see if you watch the animations in the notebook for this chapter.


\section{Random perturbation}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap10-2.pdf}}
\caption{Average speed as a function of the number of cars, for three magnitudes of added random noise.}
\label{chap10-2}
\end{figure}

As the number of cars increases, traffic jams become more severe. Figure~\ref{chap10-2} shows the average speed cars are able to achieve, as a function of the number of cars.

\index{noise}

The top line shows results with \py{eps=0}, that is, with no random variation in speed.  With 25 or fewer cars, the spacing between cars is at least 40, which allows cars to reach and maintain the maximum speed, which is 40.  With more than 25 cars, traffic jams form and the average speed drops quickly.

This effect is a direct result of the physics of the simulation, so it should not be surprising.  If the length of the road is 1000, the spacing between \py{n} cars is \py{1000/n}.  And since cars can't move faster than the space in front of them, the highest average speed we expect is \py{1000/n} or 40, whichever is less.

But that's the best case scenario.
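This best-case bound is easy to compute directly; here is a one-line sketch of the reasoning above (my own helper, not part of the book's code):

\begin{code}
def max_average_speed(n, length=1000, speed_limit=40):
    # Cars can't go faster than the spacing ahead of them (length/n)
    # or the speed limit, whichever is less.
    return min(length / n, speed_limit)
\end{code}

With 25 cars the bound is exactly the speed limit; with 50 cars it falls to 20.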
With just a small amount of randomness, things get much worse.

Figure~\ref{chap10-2} also shows results with \py{eps=0.001} and \py{eps=0.01}, which correspond to errors in speed of 0.1\% and 1\%.

With 0.1\% errors, the capacity of the highway drops from 25 to 20 (by ``capacity'' I mean the maximum number of cars that can reach and sustain the speed limit).  And with 1\% errors, the capacity drops to 10.  Ugh.

As one of the exercises at the end of this chapter, you'll have a chance to design a better driver; that is, you will experiment with different strategies in \py{choose_acceleration} and see if you can find driver behaviors that improve average speed.


\section{Boids}

In 1987 Craig Reynolds published ``Flocks, herds and schools: A distributed behavioral model'', which describes an agent-based model of herd behavior.  You can download his paper from \url{https://thinkcomplex.com/boid}.

\index{Reynolds, Craig}

Agents in this model are called ``Boids'', which is both a contraction of ``bird-oid'' and an accented pronunciation of ``bird'' (although Boids are also used to model fish and herding land animals).

\index{Boid}

Each agent simulates three behaviors:

\begin{description}

\item[Flock centering:] Move toward the center of the flock.

\item[Collision avoidance:] Avoid obstacles, including other Boids.

\item[Velocity matching:] Align velocity (speed and direction) with neighboring Boids.

\end{description}

Boids make decisions based on local information only; each Boid only sees (or pays attention to) other Boids in its field of vision.

\index{local information}

In the repository for this book, you will find {\tt Boids7.py}, which contains my implementation of Boids, based in part on the description in Gary William Flake's book, {\it The Computational Beauty of Nature}.

\index{Flake, Gary William}
\index{Computational Beauty of Nature@{\it The Computational Beauty of Nature}}
\index{VPython}
\index{Anaconda}

My implementation uses VPython, which is a library that provides 3-D graphics. VPython provides a \py{Vector} object, which I use to represent the position and velocity of Boids in three dimensions.  You can read about them at \url{https://thinkcomplex.com/vector}.

\index{Vector}

\section{The Boid algorithm}

\py{Boids7.py} defines two classes: \py{Boid}, which implements the Boid behaviors, and \py{World}, which contains a list of Boids and a ``carrot'' the Boids are attracted to.

\index{carrot}

The \py{Boid} class defines the following methods:

\begin{itemize}

\item \py{center}: Finds other Boids within range and computes a vector toward their centroid.

\index{centroid}

\item \py{avoid}: Finds objects, including other Boids, within a given range, and computes a vector that points away from their centroid.

\item \py{align}: Finds other Boids within range and computes the average of their headings.

\item \py{love}: Computes a vector that points toward the carrot.

\end{itemize}

Here's the implementation of \py{center}:

\begin{code}
def center(self, boids, radius=1, angle=1):
    neighbors = self.get_neighbors(boids, radius, angle)
    vecs = [boid.pos for boid in neighbors]
    return self.vector_toward_center(vecs)
\end{code}

The parameters \py{radius} and \py{angle} are the radius and angle of the field of view, which determine which other Boids are taken into consideration.  \py{radius} is in arbitrary units of length; \py{angle} is in radians.

\py{center} uses \py{get_neighbors} to get a list of \py{Boid} objects that are in the field of view.
\py{vecs} is a list of \py{Vector} objects that represent their positions.

Finally, \py{vector_toward_center} computes a \py{Vector} that points from \py{self} to the centroid of \py{neighbors}.

\index{centroid}

Here's how \py{get_neighbors} works:

\begin{code}
def get_neighbors(self, boids, radius, angle):
    neighbors = []
    for boid in boids:
        if boid is self:
            continue

        # if not in range, skip it
        offset = boid.pos - self.pos
        if offset.mag > radius:
            continue

        # if not within viewing angle, skip it
        if self.vel.diff_angle(offset) > angle:
            continue

        # otherwise add it to the list
        neighbors.append(boid)

    return neighbors
\end{code}

For each other Boid, \py{get_neighbors} uses vector subtraction to compute the
vector from \py{self} to \py{boid}. The magnitude of this vector is the distance between them; if this magnitude exceeds \py{radius}, we ignore \py{boid}.

\index{magnitude}
\index{angle}

\py{diff_angle} computes the angle between the velocity of \py{self}, which points in the direction the Boid is heading, and \py{offset}, which points from \py{self} toward \py{boid}.
If this angle exceeds \py{angle}, we ignore \py{boid}.

Otherwise \py{boid} is in view, so we add it to \py{neighbors}.

Now here's the implementation of \py{vector_toward_center}, which computes a vector from \py{self} to the centroid of its neighbors.

\begin{code}
def vector_toward_center(self, vecs):
    if vecs:
        center = np.mean(vecs)
        toward = vector(center - self.pos)
        return limit_vector(toward)
    else:
        return null_vector
\end{code}

VPython vectors work with NumPy, so \py{np.mean} computes the mean of \py{vecs}, which is a sequence of vectors.
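As a cross-check of the distance-and-angle test in \py{get_neighbors}, here is the same logic written with plain NumPy arrays standing in for VPython vectors; the angle is recovered from the dot product, and the function name and signature are illustrative, not from {\tt Boids7.py}:

```python
import numpy as np

def in_view(pos, vel, other_pos, radius, angle):
    # vector from self to the other Boid
    offset = other_pos - pos
    dist = np.linalg.norm(offset)
    if dist > radius:
        return False                  # too far away
    # angle between the heading (vel) and the offset
    cos_theta = np.dot(vel, offset) / (np.linalg.norm(vel) * dist)
    return np.arccos(np.clip(cos_theta, -1, 1)) <= angle
```

A Boid at exactly the same position would make \py{dist} zero; in \py{get_neighbors} that case never arises because \py{self} is skipped.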
\py{limit_vector} limits the magnitude of the result to 1; \py{null_vector} has magnitude 0.

\index{NumPy}
\index{mean}
\index{magnitude}

We use the same helper methods to implement \py{avoid}:

\begin{code}
def avoid(self, boids, carrot, radius=0.3, angle=np.pi):
    objects = boids + [carrot]
    neighbors = self.get_neighbors(objects, radius, angle)
    vecs = [boid.pos for boid in neighbors]
    return -self.vector_toward_center(vecs)
\end{code}

\py{avoid} is similar to \py{center}, but it takes into account the carrot as well as the other Boids. Also, the parameters are different: \py{radius} is smaller, so Boids only avoid objects that are too close, and \py{angle} is wider, so Boids avoid objects from all directions. Finally, the result from \py{vector_toward_center} is negated, so it points {\em away} from the centroid of any objects that are too close.

Here's the implementation of \py{align}:

\begin{code}
def align(self, boids, radius=0.5, angle=1):
    neighbors = self.get_neighbors(boids, radius, angle)
    vecs = [boid.vel for boid in neighbors]
    return self.vector_toward_center(vecs)
\end{code}

\py{align} is also similar to \py{center}; the big difference is that it computes the average of the neighbors' velocities, rather than their positions.
If the neighbors point in a particular direction, the Boid tends to steer toward that direction.

\index{velocity}

Finally, \py{love} computes a vector that points in the direction of the carrot.

\begin{code}
def love(self, carrot):
    toward = carrot.pos - self.pos
    return limit_vector(toward)
\end{code}

The results from \py{center}, \py{avoid}, \py{align}, and \py{love} are what Reynolds calls ``acceleration requests'', where each request is intended to achieve a different goal.

\index{acceleration}
\index{arbitration}

\section{Arbitration}

To arbitrate among these possibly conflicting goals, we compute a weighted sum of the four requests:

\index{weighted sum}

\begin{code}
def set_goal(self, boids, carrot):
    w_avoid = 10
    w_center = 3
    w_align = 1
    w_love = 10

    self.goal = (w_center * self.center(boids) +
                 w_avoid * self.avoid(boids, carrot) +
                 w_align * self.align(boids) +
                 w_love * self.love(carrot))
    self.goal.mag = 1
\end{code}

\py{w_center}, \py{w_avoid}, and the other weights determine the importance of the acceleration requests. Usually \py{w_avoid} is relatively high and \py{w_align} is relatively low.

After computing a goal for each Boid, we update their velocity and position:

\begin{code}
def move(self, mu=0.1, dt=0.1):
    self.vel = (1-mu) * self.vel + mu * self.goal
    self.vel.mag = 1
    self.pos += dt * self.vel
    self.axis = self.length * self.vel
\end{code}

The new velocity is the weighted sum of the old velocity
and the goal. The parameter \py{mu} determines how quickly
the birds can change speed and direction. Then we normalize velocity so its magnitude is 1, and orient the axis of the Boid to point in the direction it is moving.

\index{maneuverability}

To update position, we multiply velocity by the time step, \py{dt}, to get the change in position.
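The velocity update at the heart of \py{move} can be sketched in isolation, again with NumPy arrays standing in for VPython vectors (the helper name is illustrative):

```python
import numpy as np

def blend_velocity(vel, goal, mu=0.1):
    # weighted blend of the old velocity and the goal direction
    new_vel = (1 - mu) * vel + mu * goal
    # normalize so the magnitude is 1
    return new_vel / np.linalg.norm(new_vel)
```

With \py{mu=0.1}, a Boid heading along the x-axis with a goal along the y-axis turns only slightly per step; with \py{mu=1} it snaps to the goal direction immediately.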
Finally, we update \py{axis} so the orientation of the Boid when it is drawn is aligned with its velocity.

\index{flock behavior}

Many parameters influence flock behavior, including the radius, angle
and weight for each behavior, as well as maneuverability, \py{mu}.
These parameters determine the ability of the Boids to form and
maintain a flock, and the patterns of motion and organization within the
flock. For some settings, the Boids resemble a flock of birds; other
settings resemble a school of fish or a cloud of flying insects.

As one of the exercises at the end of this chapter, you can modify these parameters and see how they affect Boid behavior.


\section{Emergence and free will}
\label{freewill}

Many complex systems have properties, as a whole, that their components do not:

\begin{itemize}

\item The Rule 30 cellular automaton is deterministic, and the rules
that govern its evolution are completely known. Nevertheless, it
generates a sequence that is statistically indistinguishable from
random.

\item The agents in Schelling's model are not racist, but the outcome
of their interactions is a high degree of segregation.

\item Agents in Sugarscape form waves that move diagonally even
though the agents cannot.

\item Traffic jams move backward even though the cars in them are
moving forward.

\item Flocks and herds behave as if they are centrally organized even though the animals in them are making individual decisions based on local information.

\end{itemize}

These examples suggest an approach to several old and challenging questions, including the problems of consciousness and free will.

\index{emergence}
\index{free will}
\index{determinism}
\index{James, William}
\index{Hume, David}

Free will is the ability to make choices, but if our bodies and brains
are governed by deterministic physical laws, our
choices are completely determined.

Philosophers and scientists have proposed many possible resolutions to this apparent conflict; for example:

\begin{itemize}

\item William James proposed a two-stage model in which
possible actions are generated by a random process and then selected
by a deterministic process. In that case our actions are
fundamentally unpredictable because the process that generates them
includes a random element.

\item David Hume suggested that our perception of making choices
is an illusion; in that case, our actions are deterministic because
the system that produces them is deterministic.

\end{itemize}

These arguments reconcile the conflict in opposite ways, but they agree that there is a conflict: the system cannot have free will if the parts are deterministic.

The complex systems in this book suggest the alternative that free will, at the level of options and decisions, is compatible with determinism at the level of neurons (or some lower level). In the same way that a traffic jam moves backward while the cars move forward, a person can have free will even though neurons don't.


\section{Exercises}

The code for the traffic jam simulation is in the Jupyter notebook {\tt chap09.ipynb} in the repository for this book. Open this notebook, read the code, and run the cells. You can use this notebook to work on the following
exercise. My solutions are in {\tt chap09soln.ipynb}.

\begin{exercise}

In the traffic jam simulation, define a class, \py{BetterDriver}, that inherits from \py{Driver} and overrides \py{choose_acceleration}. See if you can define driving rules that do better than the basic implementation in \py{Driver}. You might try to achieve higher average speed, or a lower number of collisions.

% TODO: Add a crash counter.
% Maybe pass the Highway object to choose_acceleration

\end{exercise}


\begin{exercise}

The code for my Boid implementation is in {\tt Boids7.py} in the repository for this book. To run it, you will need VPython, a library for 3-D graphics and animation. If you use Anaconda (as I recommend in Section~\ref{code}), you can run the following in a terminal or Command Window:

\begin{code}
conda install -c vpython vpython
\end{code}

Then run \py{Boids7.py}. It should either launch a browser or create a window in a running browser, and create an animated display showing Boids, as white cones, circling a red sphere, which is the carrot. If you click and move the mouse, you can move the carrot and see how the Boids react.

Read the code to see how the parameters control Boid behaviors. Experiment with different parameters. What happens if you ``turn off'' one
of the behaviors by setting its weight to 0?

\index{parameter}

To generate more bird-like behavior, Flake suggests adding a behavior to maintain a clear line of sight; in other words, if there is another bird directly ahead, the Boid should move away laterally. What effect do you expect this rule to have on the behavior of the flock? Implement it and see.

\end{exercise}


\begin{exercise}

Read more about free will at \url{https://thinkcomplex.com/will}. The view that free will is compatible with determinism is called {\bf compatibilism}. One of the strongest challenges to compatibilism is the ``consequence
argument''. What is the consequence argument?
What response can you
give to the consequence argument based on what you have read in this
book?

\index{free will}
\index{determinism}
\index{compatibilism}
\index{consequence argument}

\end{exercise}




\chapter{Evolution}

The most important idea in biology, and possibly all of science, is the {\bf theory of evolution by natural selection}, which claims that {\em new species are created and existing species change due to natural selection}. Natural selection is a process in which inherited variations between individuals cause differences in survival and reproduction.

\index{evolution}
\index{survival}
\index{reproduction}
\index{theory of evolution}

Among people who know something about biology, the theory of evolution is widely regarded as a fact, which is to say that it is consistent with all current observations; it is highly unlikely to be contradicted by future observations; and, if it is revised in the future, the changes will almost certainly leave the central ideas substantially intact.

\index{Pew Research Center}

Nevertheless, many people do not believe in evolution. In a survey run by the Pew Research Center, survey respondents were asked which of the following claims is closer to their view:

\begin{enumerate}

\item Humans and other living things have evolved over time.

\item Humans and other living things have existed in their present form since the beginning of time.

\end{enumerate}

About 34\% of Americans chose the second (see \url{https://thinkcomplex.com/arda}).

Even among the ones who believe that living things have evolved, barely more than half believe that the cause of evolution is natural selection. In other words, only a third of Americans believe that the theory of evolution is true.

How is this possible?
In my opinion, contributing factors include:

\begin{itemize}

\item Some people think that there is a conflict between evolution and their religious beliefs. Feeling like they have to reject one, they reject evolution.

\item Others have been actively misinformed, often by members of the first group, so that much of what they know about evolution is misleading or false. For example, many people think that evolution means humans evolved from monkeys. It doesn't, and we didn't.

\item And many people simply don't know anything about evolution.

\end{itemize}

There's probably not much I can do about the first group, but I think I can help the others. Empirically, the theory of evolution is hard for people to understand. At the same time, it is profoundly simple: for many people, once they understand it, it seems both obvious and irrefutable.

To help people make this transition from confusion to clarity, the most powerful tool I have found is computation. Ideas that are hard to understand in theory can be easy to understand when we see them happening in simulation.
That is the goal of this chapter.

\index{simulation}

The code for this chapter is in \py{chap11.ipynb}, which is a
Jupyter notebook in the repository for this book. For more information
about working with this code, see Section~\ref{code}.


\section{Simulating evolution}

% manifest
% show evidence of
% evoke
% instantiate
% induce

I start with a simple model that demonstrates a basic form of evolution. According to the theory, the following features are sufficient to produce evolution:

\begin{itemize}

\item Replicators: We need a population of agents that can reproduce
in some way. We'll start with replicators that make perfect copies
of themselves.
Later we'll add imperfect copying, that is,
mutation.

\item Variation: We need variability in the population, that
is, differences between individuals.

\item Differential survival or reproduction: The differences between
individuals have to affect their ability to survive or reproduce.

\end{itemize}

To simulate these features, we'll define a population of agents
that represent individual organisms. Each agent has genetic
information, called its {\bf genotype}, which is the information that
gets copied when the agent replicates. In our model\footnote{This
model is a variant of the NK model developed primarily by Stuart
Kauffman (see \url{https://thinkcomplex.com/nk}).}, a
genotype is represented by a sequence of \py{N} binary digits (zeros
and ones), where \py{N} is a parameter we choose.

\index{genotype}
\index{Kauffman, Stuart}
\index{agent}
\index{survival}
\index{reproduction}
\index{fitness}

To generate variation, we create a population with a variety of genotypes; later we will explore mechanisms that create or increase variation.

Finally, to generate differential survival and reproduction, we define a function that maps from each genotype to a {\bf fitness}, where fitness is a quantity related to the ability of an agent to survive or reproduce.

\section{Fitness landscape}

The function that maps from genotype to fitness is called a {\bf fitness landscape}. In the landscape metaphor, each genotype corresponds to a location in an \py{N}-dimensional space, and fitness corresponds to the ``height'' of the landscape at that location.
For visualizations that might clarify this metaphor, see \url{https://thinkcomplex.com/fit}.

\index{fitness landscape}
\index{phenotype}

In biological terms, the fitness landscape represents information about how the genotype of an organism is related to its physical form and capabilities, called its {\bf phenotype}, and how the phenotype interacts with its {\bf environment}.

In the real world, fitness landscapes are complicated, but we don't need to build a realistic model. To induce evolution, we need {\em some} relationship between genotype and fitness, but it turns out that it can be {\em any} relationship. To demonstrate this point, we'll use a totally random fitness landscape.

Here is the definition for a class that represents a fitness landscape:

\begin{code}
class FitnessLandscape:

    def __init__(self, N):
        self.N = N
        self.one_values = np.random.random(N)
        self.zero_values = np.random.random(N)

    def fitness(self, loc):
        fs = np.where(loc, self.one_values,
                           self.zero_values)
        return fs.mean()
\end{code}

The genotype of an agent, which corresponds to its location in the fitness landscape, is represented by a NumPy array of zeros and ones called \py{loc}.
The fitness of a given genotype is the mean of \py{N} {\bf fitness contributions}, one for each element of \py{loc}.

\index{NumPy}
\index{random}
\index{where}

To compute the fitness of a genotype, \py{FitnessLandscape} uses two arrays:
\py{one_values}, which contains the fitness contributions of having a \py{1} in each element of \py{loc}, and \py{zero_values}, which contains the fitness contributions of having a \py{0}.

The \py{fitness} method uses \py{np.where} to select a value from \py{one_values} where \py{loc} has a \py{1}, and a value from \py{zero_values} where \py{loc} has a \py{0}.

As an example, suppose \py{N=3} and

\begin{code}
one_values =  [0.1, 0.2, 0.3]
zero_values =
[0.4, 0.7, 0.9]
\end{code}

In that case, the fitness of \py{loc = [0, 1, 0]} would be the mean of \py{[0.4, 0.2, 0.9]}, which is \py{0.5}.


\section{Agents}

Next we need agents. Here's the class definition:

\begin{code}
class Agent:

    def __init__(self, loc, fit_land):
        self.loc = loc
        self.fit_land = fit_land
        self.fitness = fit_land.fitness(self.loc)

    def copy(self):
        return Agent(self.loc, self.fit_land)
\end{code}

The attributes of an \py{Agent} are:

\begin{itemize}

\item \py{loc}: The location of the \py{Agent} in the fitness landscape.

\item \py{fit_land}: A reference to a \py{FitnessLandscape} object.

\item \py{fitness}: The fitness of this \py{Agent} in the \py{FitnessLandscape}, represented as a number between 0 and 1.

\end{itemize}

\py{Agent} provides \py{copy}, which copies the genotype exactly. Later, we will see a version that copies with mutation, but mutation is not necessary for evolution.


\section{Simulation}
\label{evosim}

Now that we have agents and a fitness landscape, I'll define a class called \py{Simulation} that simulates the creation, reproduction, and death of the agents.
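Before getting into \py{Simulation}, the fitness arithmetic from the example above can be checked directly; this sketch reuses the example values:

```python
import numpy as np

one_values = np.array([0.1, 0.2, 0.3])
zero_values = np.array([0.4, 0.7, 0.9])

loc = np.array([0, 1, 0])
# where loc is 1 take one_values, where it is 0 take zero_values
fs = np.where(loc, one_values, zero_values)
fitness = fs.mean()      # the mean of [0.4, 0.2, 0.9], which is 0.5
```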
To avoid getting bogged down, I'll present a simplified version of the code here; you can see the details in the notebook for this chapter.

\index{simulation}

Here's the definition of \py{Simulation}:

\begin{code}
class Simulation:

    def __init__(self, fit_land, agents):
        self.fit_land = fit_land
        self.agents = agents
\end{code}

The attributes of a \py{Simulation} are:

\begin{itemize}

\item \py{fit_land}: A reference to a \py{FitnessLandscape} object.

\item \py{agents}: An array of \py{Agent} objects.

\end{itemize}

The most important function in \py{Simulation} is \py{step}, which simulates one time step:

\begin{code}
# class Simulation:

    def step(self):
        n = len(self.agents)
        fits = self.get_fitnesses()

        # see who dies
        index_dead = self.choose_dead(fits)
        num_dead = len(index_dead)

        # replace the dead with copies of the living
        replacements = self.choose_replacements(num_dead, fits)
        self.agents[index_dead] = replacements
\end{code}

\py{step} uses three other methods:

\begin{itemize}

\item \py{get_fitnesses} returns an array containing the fitness of each agent.

\item \py{choose_dead} decides which agents die during this time step, and returns an array that contains the indices of the dead agents.

\item \py{choose_replacements} decides which agents reproduce during this time step, invokes \py{copy} on each one, and returns an array of new \py{Agent} objects.

\end{itemize}

In this version of the simulation, the number of new agents during each time step equals the number of dead agents, so the number of live agents is constant.


\section{No differentiation}

Before we run the simulation, we have to specify the behavior of \py{choose_dead} and \py{choose_replacements}.
We'll start with simple versions of these functions that don't depend on fitness:

\begin{code}
# class Simulation

    def choose_dead(self, fits):
        n = len(self.agents)
        is_dead = np.random.random(n) < 0.1
        index_dead = np.nonzero(is_dead)[0]
        return index_dead
\end{code}

\index{NumPy}
\index{random}
\index{choice}
\index{nonzero}
\index{boolean array}

In \py{choose_dead}, \py{n} is the number of agents and \py{is_dead} is a boolean array that contains \py{True} for the agents who die during this time step. In this version, every agent has the same probability of dying: 0.1.
\py{choose_dead} uses \py{np.nonzero} to find the indices of the non-zero elements of \py{is_dead} (\py{True} is considered non-zero).

\begin{code}
# class Simulation

    def choose_replacements(self, n, fits):
        agents = np.random.choice(self.agents, size=n, replace=True)
        replacements = [agent.copy() for agent in agents]
        return replacements
\end{code}

In \py{choose_replacements}, \py{n} is the number of agents who reproduce during this time step. It uses \py{np.random.choice} to choose \py{n} agents with replacement. Then it invokes \py{copy} on each one and returns a list of new \py{Agent} objects.

These methods don't depend on fitness, so this simulation does not have differential survival or reproduction. As a result, we should not expect to see evolution. But how can we tell?


\section{Evidence of evolution}
\label{instrument}

The most inclusive definition of evolution is a change in the distribution of genotypes in a population. Evolution is an aggregate effect: in other words, individuals don't evolve; populations do.

In this simulation, genotypes are locations in a high-dimensional space, so it is hard to visualize changes in their distribution. However, if the genotypes change, we expect their fitness to change as well.
So we will use {\em changes in the distribution of fitness} as evidence of evolution. In particular, we'll look at the mean and standard deviation of fitness over time.

Before we run the simulation, we have to add an \py{Instrument}, which is an object that gets updated after each time step, computes a statistic of interest, or ``metric'', and stores the result in a sequence we can plot later.

Here is the parent class for all instruments:

\begin{code}
class Instrument:
    def __init__(self):
        self.metrics = []
\end{code}

And here's the definition for \py{MeanFitness}, an instrument that computes the mean fitness of the population at each time step:

\begin{code}
class MeanFitness(Instrument):
    def update(self, sim):
        mean = np.nanmean(sim.get_fitnesses())
        self.metrics.append(mean)
\end{code}

\index{NumPy}
\index{nanmean}

Now we're ready to run the simulation. To avoid the effect of random changes in the starting population, we start every simulation with the same set of agents. And to make sure we explore the entire fitness landscape, we start with one agent at every location. Here's the code that creates the \py{Simulation}:

\begin{code}
N = 8
fit_land = FitnessLandscape(N)
agents = make_all_agents(fit_land, Agent)
sim = Simulation(fit_land, agents)
\end{code}

\py{make_all_agents} creates one \py{Agent} for every location; the implementation is in the notebook for this chapter.

Now we can create and add a \py{MeanFitness} instrument, run the simulation, and plot the results:

\begin{code}
instrument = MeanFitness()
sim.add_instrument(instrument)
sim.run()
\end{code}

\py{Simulation} keeps a list of \py{Instrument} objects.
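Based on this description, the instrument bookkeeping might look like the following sketch; this is a simplified \py{Simulation} showing only the instrument machinery, and \py{update_instruments} and \py{StepCounter} are illustrative names, not from the notebook:

```python
class Instrument:
    def __init__(self):
        self.metrics = []

class Simulation:
    def __init__(self):
        self.instruments = []

    def add_instrument(self, instrument):
        self.instruments.append(instrument)

    def update_instruments(self):
        # called once after each time step
        for instrument in self.instruments:
            instrument.update(self)

class StepCounter(Instrument):
    # a toy instrument: records one metric per time step
    def update(self, sim):
        self.metrics.append(1)
```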
After each time step it invokes \py{update} on each \py{Instrument} in the list.

\index{Instrument}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap11-1.pdf}}
\caption{Mean fitness over time for 10 simulations with no differential survival or reproduction.}
\label{chap11-1}
\end{figure}

Figure~\ref{chap11-1} shows the result of running this simulation 10 times. The mean fitness of the population drifts up or down at random. Since the distribution of fitness changes over time, we infer that the distribution of genotypes is also changing. By the most inclusive definition, this {\bf random walk} is a kind of evolution. But it is not a particularly interesting kind.

\index{adaptation}
\index{diversity}

In particular, this kind of evolution does not explain how biological species change over time, or how new species appear. The theory of evolution is powerful because it explains phenomena we see in the natural world that seem inexplicable:

\begin{itemize}

\item Adaptation: Species interact with their environments in ways that seem too complex, too intricate, and too clever to happen by chance. Many features of natural systems seem as if they were designed.

\item Increasing diversity: Over time the number of species on earth has generally increased (despite several periods of mass extinction).

\item Increasing complexity: The history of life on earth starts with relatively simple life forms, with more complex organisms appearing later in the geological record.

\end{itemize}

These are the phenomena we want to explain. So far, our model doesn't do the job.


\section{Differential survival}
\label{diffsurv}

Let's add one more ingredient, differential survival.
Here's a class that extends \py{Simulation} and overrides \py{choose_dead}:

\begin{code}
class SimWithDiffSurvival(Simulation):

    def choose_dead(self, fits):
        n = len(self.agents)
        is_dead = np.random.random(n) > fits
        index_dead = np.nonzero(is_dead)[0]
        return index_dead
\end{code}

\index{NumPy}
\index{random}

Now the probability of survival depends on fitness; in fact, in this version, the probability that an agent survives each time step {\em is} its fitness.

\index{differential survival}

Since agents with low fitness are more likely to die, agents with high fitness are more likely to survive long enough to reproduce. Over time we expect the number of low-fitness agents to decrease, and the number of high-fitness agents to increase.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap11-2.pdf}}
\caption{Mean fitness over time for 10 simulations with differential survival.}
\label{chap11-2}
\end{figure}

Figure~\ref{chap11-2} shows mean fitness over time for 10 simulations with differential survival. Mean fitness increases quickly at first, but then levels off.

You can probably figure out why it levels off: if there is only one agent at a particular location and it dies, it leaves that location unoccupied. Without mutation, there is no way for it to be occupied again.

With \py{N=8}, this simulation starts with 256 agents occupying all possible locations. Over time, the number of occupied locations decreases; if the simulation runs long enough, eventually all agents will occupy the same location.

So this simulation starts to explain adaptation: increasing fitness means that the species is getting better at surviving in its environment.
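The survival rule can be checked numerically: an agent with fitness $f$ should die with probability roughly $1-f$ on each time step. Here is a sketch; the population size and random seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
fits = np.full(100_000, 0.9)        # 100,000 agents, all with fitness 0.9
is_dead = rng.random(len(fits)) > fits
death_rate = is_dead.mean()         # should be close to 1 - 0.9 = 0.1
```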
But the number of occupied locations decreases over time, so this model does not explain increasing diversity at all.

\index{adaptation}

In the notebook for this chapter, you will see the effect of differential reproduction. As you might expect, differential reproduction also increases mean fitness. But without mutation, we still don't see increasing diversity.


\section{Mutation}

In the simulations so far, we start with the maximum possible diversity --- one agent at every location in the landscape --- and end with the minimum possible diversity, all agents at one location.

\index{mutation}
\index{diversity}

That's almost the opposite of what happened in the natural world, which apparently began with a single species that branched, over time, into the millions, or possibly billions, of species on Earth today (see \url{https://thinkcomplex.com/bio}).

With perfect copying in our model, we never see increasing diversity. But if we add mutation, along with differential survival and reproduction, we get a step closer to understanding evolution in nature.

Here is a class definition that extends \py{Agent} and overrides \py{copy}:

\begin{code}
class Mutant(Agent):

    def copy(self, prob_mutate=0.05):
        if np.random.random() > prob_mutate:
            loc = self.loc.copy()
        else:
            direction = np.random.randint(self.fit_land.N)
            loc = self.mutate(direction)
        return Mutant(loc, self.fit_land)
\end{code}

\index{NumPy}
\index{random}

In this model of mutation, every time we call \py{copy}, there is a 5\% chance of mutation. In case of mutation, we choose a random direction from the current location --- that is, a random bit in the genotype --- and flip it.
Here's \py{mutate}:

\begin{code}
def mutate(self, direction):
    new_loc = self.loc.copy()
    new_loc[direction] ^= 1
    return new_loc
\end{code}

The operator \py{^=} computes ``exclusive OR''; with the operand 1, it
has the effect of flipping a bit (see
\url{https://thinkcomplex.com/xor}).

\index{exclusive OR}
\index{XOR}

Now that we have mutation, we don't have to start with an agent at every location. Instead, we can start with the minimum variability: all agents at the same location.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap11-3.pdf}}
\caption{Mean fitness over time for 10 simulations with mutation and differential survival and reproduction.}
\label{chap11-3}
\end{figure}

Figure~\ref{chap11-3} shows the results of 10 simulations with mutation and differential survival and reproduction. In every case, the population evolves toward the location with maximum fitness.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap11-4.pdf}}
\caption{Number of occupied locations over time for 10 simulations with mutation and differential survival and reproduction.}
\label{chap11-4}
\end{figure}

To measure diversity in the population, we can plot the number of occupied locations after each time step. Figure~\ref{chap11-4} shows the results. We start with 100 agents at the same location. As mutations occur, the number of occupied locations increases quickly.

When an agent discovers a high-fitness location, it is more likely to survive and reproduce. Agents at lower-fitness locations eventually die out.
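The occupied-location count is simple to compute; here's a sketch of one way to do it (a hypothetical helper, not the book's exact code; it assumes each agent stores its location as an array of bits in \py{loc}, as in this chapter's classes):

```python
import numpy as np

def count_occupied(agents):
    # Each distinct location (bit array) counts once; tuples are
    # hashable, so we can collect them in a set.
    return len({tuple(agent.loc) for agent in agents})
```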
Over time, the population migrates through the landscape until most agents are at the location with the highest fitness.

\index{equilibrium}

At that point, the system reaches an equilibrium where mutation occupies new locations at the same rate that differential survival causes lower-fitness locations to be left empty.

The number of occupied locations in equilibrium depends on the mutation rate and the degree of differential survival. In these simulations the number of unique occupied locations at any point is typically 5--15.

It is important to remember that the agents in this model don't move, just as the genotype of an organism doesn't change. When an agent dies, it can leave a location unoccupied. And when a mutation occurs, it can occupy a new location. As agents disappear from some locations and appear in others, the population migrates across the landscape, like a glider in Game of Life. But organisms don't evolve; populations do.

\index{glider}
\index{Game of Life}

\section{Speciation}
\label{speciation}

The theory of evolution says that natural selection changes existing species and creates new ones. In our model, we have seen changes, but we have not seen a new species. It's not even clear, in the model, what a new species would look like.

\index{speciation}

Among species that reproduce sexually, two organisms are considered the same species if they can breed and produce fertile offspring. But the agents in the model don't reproduce sexually, so this definition doesn't apply.

Among organisms that reproduce asexually, like bacteria, the definition of species is not as clear-cut.
Generally, a population is considered a species if its genotypes form a cluster, that is, if the genetic differences within the population are small compared to the differences between populations.

Before we can model new species, we need the ability to identify clusters of agents in the landscape, which means we need a definition of {\bf distance} between locations. Since locations are represented with arrays of bits, we'll define distance as the number of bits that differ between locations. \py{FitnessLandscape} provides a \py{distance} method:

% Q: Why didn't I use the ^ operator in distance()?
% A: Because ^ is bitwise XOR. In this case the result would be
% the same, but logical_xor is more precisely what is called for.

\begin{code}
# class FitnessLandscape

def distance(self, loc1, loc2):
    return np.sum(np.logical_xor(loc1, loc2))
\end{code}

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap11-5.pdf}}
\caption{Mean distance between agents over time.}
\label{chap11-5}
\end{figure}

The \py{logical_xor} function computes ``exclusive OR'', which is \py{True} for bits that differ, and \py{False} for the bits that are the same.

\index{exclusive OR}
\index{XOR}

To quantify the dispersion of a population, we can compute the mean of the distances between pairs of agents. In the notebook for this chapter, you'll see the \py{MeanDistance} instrument, which computes this metric after each time step.

\index{distance}

Figure~\ref{chap11-5} shows mean distance between agents over time. Because we start with identical mutants, the initial distances are 0.
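The metric that \py{MeanDistance} computes can be sketched in a few lines (a standalone version, not the book's exact code; locations are bit arrays, as above):

```python
import numpy as np
from itertools import combinations

def mean_distance(locs):
    # Hamming distance, averaged over all pairs of locations.
    dists = [np.sum(np.logical_xor(a, b))
             for a, b in combinations(locs, 2)]
    return np.mean(dists)
```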
As mutations occur, mean distance increases, reaching a maximum while the population migrates across the landscape.

Once the agents discover the optimal location, mean distance decreases until the population reaches an equilibrium where increasing distance due to mutation is balanced by decreasing distance as agents far from the optimal location disappear. In these simulations, the mean distance in equilibrium is near 1; that is, most agents are only one mutation away from optimal.

Now we are ready to look for new species. To model a simple kind of speciation, suppose a population evolves in an unchanging environment until it reaches steady state (like some species we find in nature that seem to have changed very little over long periods of time).

\index{speciation}

Now suppose we either change the environment or transport the population to a new environment. Some features that increased fitness in the old environment might decrease it in the new environment, and vice versa.

We can model these scenarios by running a simulation until the population reaches steady state, then changing the fitness landscape, and then resuming the simulation until the population reaches steady state again.

\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap11-6.pdf}}
\caption{Mean fitness over time. After 500 time steps, we change the fitness landscape.}
\label{chap11-6}
\end{figure}

Figure~\ref{chap11-6} shows results from a simulation like that. We start with 100 identical mutants at a random location, and run the simulation for 500 time steps. At that point, many agents are at the optimal location, which has fitness near 0.65 in this example. The genotypes of the agents form a cluster, with the mean distance between agents near 1.

After 500 steps, we run \py{FitnessLandscape.set_values}, which changes the fitness landscape; then we resume the simulation.
Mean fitness is lower, as we expect, because the optimal location in the old landscape is no better than a random location in the new landscape.

After the change, mean fitness increases again as the population migrates across the new landscape, eventually finding the new optimum, which has fitness near 0.75 (which happens to be higher in this example, but needn't be).

Once the population reaches steady state, it forms a new cluster, with mean distance between agents near 1 again.

Now if we compute the distance between the agents' locations before and after the change, they differ by more than 6, on average. The distances between clusters are much bigger than the distances between agents in each cluster, so we can interpret these clusters as distinct species.


\section{Summary}

We have seen that mutation, along with differential survival and reproduction, is sufficient to cause increasing fitness, increasing diversity, and a simple form of speciation. This model is not meant to be realistic; evolution in natural systems is much more complicated than this. Rather, it is meant to be a ``sufficiency theorem''; that is, a demonstration that the features of the model are sufficient to produce the behavior we are trying to explain (see \url{https://thinkcomplex.com/suff}).

\index{sufficiency theorem}

Logically, this ``theorem'' doesn't prove that evolution in nature is caused by these mechanisms alone. But since these mechanisms do appear, in many forms, in biological systems, it is reasonable to think that they at least contribute to natural evolution.

Likewise, the model does not prove that these mechanisms always cause evolution.
But the results we see here turn out to be robust: in almost any model that includes these features --- imperfect replicators, variability, and differential reproduction --- evolution happens.

\index{replicator}
\index{variability}
\index{differential reproduction}

I hope this observation helps to demystify evolution. When we look at natural systems, evolution seems complicated. And because we primarily see the results of evolution, with only glimpses of the process, it can be hard to imagine and hard to believe.

But in simulation, we see the whole process, not just the results. And by including the minimal set of features to produce evolution --- temporarily ignoring the vast complexity of biological life --- we can see evolution as the surprisingly simple, inevitable idea that it is.

%TODO: A reviewer suggests that I have made the case for "simple" but
% not "inevitable".


\section{Exercises}

The code for this chapter is in the Jupyter notebook {\tt chap11.ipynb}
in the repository for this book. Open the notebook, read the code,
and run the cells. You can use the notebook to work on the following
exercises. My solutions are in {\tt chap11soln.ipynb}.

\begin{exercise}

The notebook shows the effects of differential reproduction and survival separately. What if you have both? Write a class called \py{SimWithBoth} that uses the version of \py{choose_dead} from \py{SimWithDiffSurvival}
and the version of \py{choose_replacements} from \py{SimWithDiffReproduction}. Does mean fitness increase more quickly?

As a Python challenge, can you write this class without copying code?

\end{exercise}


\begin{exercise}

When we change the landscape as in Section~\ref{speciation}, the number of occupied locations and the mean distance usually increase, but the effect is not always big enough to be obvious.
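Making a run repeatable is a one-liner; a sketch, assuming NumPy's legacy global random state, which this book's code uses:

```python
import numpy as np

# Fixing the seed makes a run reproducible; changing it gives a
# different (but still repeatable) run of the same simulation.
np.random.seed(17)
```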
Try out some different random seeds to see how general the effect is.

\end{exercise}



\chapter{Evolution of cooperation}

In this final chapter, I take on two questions, one from biology and one from philosophy:

\begin{itemize}

\item In biology, the ``problem of altruism'' is the apparent conflict between natural selection, which suggests that animals live in a state of constant competition, and altruism, which is the tendency of many animals to help other animals, even to their own detriment. See \url{https://thinkcomplex.com/altruism}.

\item In moral philosophy, the question of human nature asks whether humans are fundamentally good, or evil, or blank slates shaped by their environment. See \url{https://thinkcomplex.com/nature}.

\end{itemize}

The tools I use to address these questions are agent-based simulation (again) and game theory, which is a set of abstract models meant to describe ways agents interact. Specifically, the game we will consider is the Prisoner's Dilemma.

\index{altruism}
\index{human nature}

The code for this chapter is in \py{chap12.ipynb}, which is a
Jupyter notebook in the repository for this book. For more information
about working with this code, see Section~\ref{code}.


\section{Prisoner's Dilemma}
\label{prisoners}

The Prisoner's Dilemma is a topic in game theory, but it's not the fun kind of game. Instead, it is the kind of game that sheds light on human motivation and behavior. Here is the presentation of the dilemma from Wikipedia (\url{https://thinkcomplex.com/pd}):

\index{Prisoner's Dilemma}
\index{game theory}

\begin{quote}
Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge, but they have enough to convict both on a lesser charge.
Simultaneously, the prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity to either: (1) betray the other by testifying that the other committed the crime, or (2) cooperate with the other by remaining silent. The offer is:

\begin{itemize}

\item If A and B each betray the other, each of them serves 2 years in prison.

\item If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa).

\item If A and B both remain silent, both of them will only serve 1 year in prison (on the lesser charge).

\end{itemize}

\end{quote}

Obviously, this scenario is contrived, but it is meant to represent a variety of interactions where agents have to choose whether to ``cooperate'' with each other or ``defect'', and where the reward (or punishment) for each agent depends on what the other chooses.

\index{cooperate}
\index{defect}

With this set of punishments, it is tempting to say that the players should cooperate, that is, that both should remain silent. But neither agent knows what the other will do, so each has to consider two possible outcomes. First, looking at it from A's point of view:

\begin{itemize}

\item If B remains silent, A is better off defecting; she would go free rather than serve 1 year.

\item If B defects, A is still better off defecting; she would serve only 2 years rather than 3.

\end{itemize}

No matter what B does, A is better off defecting.
And because the game is symmetric, this analysis is the same from B's point of view: no matter what A does, B is better off defecting.

In the simplest version of this game, we assume that A and B have no other considerations to take into account. They can't communicate with each other, so they can't negotiate, make promises, or threaten each other.
And they consider only the immediate goal of minimizing their sentences; they don't take into account any other factors.

\index{rational}

Under those assumptions, the rational choice for both agents is to defect. That might be a good thing, at least for purposes of criminal justice. But for the prisoners, it is frustrating because there is, apparently, nothing they can do to achieve the outcome they both want. And this model applies to other scenarios in real life where cooperation would be better for the greater good as well as for the players.

Studying these scenarios, and ways to escape from the dilemma, is the focus of people who study game theory, but it is not the focus of this chapter. We are headed in a different direction.


\section{The problem of nice}

Since the Prisoner's Dilemma was first discussed in the 1950s, it has been a popular topic of study in social psychology. Based on the analysis in the previous section, we can say what a perfectly rational agent {\em should} do; it is harder to predict what real people actually do. Fortunately, the experiment has been done\footnote{Here's a recent report with references to previous experiments:
Barreda-Tarrazona, Jaramillo-Guti\'{e}rrez, Pavan, and Sabater-Grande,
``Individual Characteristics vs. Experience: An Experimental Study on Cooperation in Prisoner's Dilemma'', Frontiers in Psychology, 2017; 8: 596.
\url{https://thinkcomplex.com/pdexp}.}.

\index{social psychology}

If we assume that people are smart enough to do the analysis (or understand it when explained), and that they generally act in their own interest, we would expect them to defect pretty much all the time. But they don't.
In most experiments, subjects cooperate much more than the rational agent model predicts\footnote{For an excellent video summarizing what we have discussed so far, see \url{https://thinkcomplex.com/pdvid1}.}.

\index{rational agent model}

The most obvious explanation of this result is that people are not rational agents, which should not be a surprise to anyone. But why not? Is it because they are not smart enough to understand the scenario or because they are knowingly acting contrary to their own interest?

Based on experimental results, it seems that at least part of the explanation is plain altruism: many people are willing to incur a cost to themselves in order to benefit another person. Now, before you nominate that conclusion for publication in the {\it Journal of Obvious Results}, let's keep asking why:

\index{altruism}

\begin{itemize}

\item Why do people help other people, even at a cost to themselves? At least part of the reason is that they want to; it makes them feel good about themselves and the world.

\item And why does being nice make people feel good? It might be tempting to say that they were raised right, or more generally trained by society to want to do good things. But there is little doubt that some part of altruism is innate; a proclivity for altruism is the result of normal brain development.

\item Well, why is that? The innate parts of brain development, and the personal characteristics that follow, are the result of genetic information. Of course, the relationship between genes and altruism is complicated; there are probably many genes that interact with each other and with environmental factors to cause people to be more or less altruistic in different circumstances. Nevertheless, there are almost certainly genes that tend to make people altruistic.

\item Finally, why is that?
If, under natural selection, animals are in constant competition with each other to survive and reproduce, it seems obvious that altruism would be counterproductive. In a population where some people help others, even to their own detriment, and others are purely selfish, it seems like the selfish ones would benefit, the altruistic ones would suffer, and the genes for altruism would be driven to extinction.

\end{itemize}

This apparent contradiction is the ``problem of altruism'': {\em why haven't the genes for altruism died out}?

\index{problem of altruism}

Among biologists, there are many possible explanations, including reciprocal altruism, sexual selection, kin selection, and group selection. Among non-scientists, there are even more explanations. I leave it to you to explore the alternatives; for now I want to focus on just one explanation, arguably the simplest one: maybe altruism is adaptive. In other words, maybe genes for altruism make people more likely to survive and reproduce.

It turns out that the Prisoner's Dilemma, which raises the problem of altruism, might also help resolve it.


\section{Prisoner's dilemma tournaments}

In the late 1970s Robert Axelrod, a political scientist at the University of Michigan, organized a tournament to compare strategies for playing Prisoner's Dilemma (PD).

\index{Axelrod, Robert}
\index{tournament}

He invited participants to submit strategies in the form of computer programs, then played the programs against each other and kept score. Specifically, they played the iterated version of PD, in which the agents play multiple rounds against the same opponent, so their decisions can be based on history.

In Axelrod's tournaments, a simple strategy that did surprisingly well was called ``tit for tat'', or TFT. TFT always cooperates during the first round of an iterated match; after that, it copies whatever the opponent did during the previous round.
If the opponent keeps cooperating, TFT keeps cooperating. If the opponent defects at any point, TFT defects in the next round. But if the opponent goes back to cooperating, so does TFT.

\index{TFT}
\index{tit for tat}

For more information about these tournaments, and an explanation of why TFT does so well, see this video: \url{https://thinkcomplex.com/pdvid2}.

Looking at the strategies that did well in these tournaments, Axelrod identified the characteristics they tended to share:

\begin{itemize}

\item Nice: The strategies that do well cooperate during the first round, and generally cooperate as often as they defect in subsequent rounds.

\item Retaliating: Strategies that cooperate all the time did not do as well as strategies that retaliate if the opponent defects.

\item Forgiving: But strategies that were too vindictive tended to punish themselves as well as their opponents.

\item Non-envious: Some of the most successful strategies seldom outscore their opponents; they are successful because they do {\em well enough} against a wide variety of opponents.

\end{itemize}

TFT has all of these properties.

Axelrod's tournaments offer a partial, possible answer to the problem of altruism: maybe the genes for altruism are prevalent because they are adaptive. To the degree that many social interactions can be modeled as variations on the Prisoner's Dilemma, a brain that is wired to be nice, tempered by a balance of retaliation and forgiveness, will tend to do well in a wide variety of circumstances.

But the strategies in Axelrod's tournaments were designed by people; they didn't evolve.
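For reference, the behavior described above is only a few lines of code; here is a minimal TFT sketch (a hypothetical standalone function, not part of this book's classes):

```python
def tit_for_tat(opponent_history):
    # Cooperate in the first round; afterwards, copy the opponent's
    # most recent move ('C' or 'D').
    if not opponent_history:
        return 'C'
    return opponent_history[-1]
```

With this formulation it is easy to see why TFT is nice, retaliating, and forgiving: it opens with \py{'C'}, answers a defection with a defection, and returns to cooperating as soon as the opponent does.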
We need to consider whether it is credible that genes for niceness, retribution, and forgiveness could appear by mutation, successfully invade a population of other strategies, and resist being invaded by subsequent mutations.


\section{Simulating evolution of cooperation}

{\em Evolution of Cooperation} is the title of the first book where Axelrod presented results from Prisoner's Dilemma tournaments and discussed the implications for the problem of altruism. Since then, he and other researchers have explored the evolutionary dynamics of PD tournaments, that is, how the distribution of strategies changes over time in a population of PD contestants. In the rest of this chapter, I run a version of those experiments and present the results.

\index{Evolution of Cooperation@{\it Evolution of Cooperation}}

First, we'll need a way to encode a PD strategy as a genotype. For this experiment, I consider strategies where the agent's choice in each round depends only on the opponent's choice in the previous two rounds.
I represent a strategy using a dictionary that maps from the opponent's previous two choices to the agent's next choice.

\index{strategy}

Here is the class definition for these agents:

\begin{code}
class Agent:

    keys = [(None, None),
            (None, 'C'),
            (None, 'D'),
            ('C', 'C'),
            ('C', 'D'),
            ('D', 'C'),
            ('D', 'D')]

    def __init__(self, values, fitness=np.nan):
        self.values = values
        self.responses = dict(zip(self.keys, values))
        self.fitness = fitness
\end{code}

\py{keys} is the sequence of keys in each agent's dictionary, where the tuple \py{('C', 'C')} means that the opponent cooperated in the previous two rounds; \py{(None, 'C')} means that only one round has been played and the opponent cooperated; and \py{(None, None)} means that no rounds have been played.

\index{zip}

In the \py{__init__} method, \py{values} is a sequence of choices, either \py{'C'} or \py{'D'}, that correspond to \py{keys}. So if the first element of \py{values} is \py{'C'}, that means that this agent will cooperate in the first round.
If the last element of \py{values} is \py{'D'}, this agent will defect if the opponent defected in the previous two rounds.

In this implementation, the genotype of an agent who always defects is \py{'DDDDDDD'}; the genotype of an agent who always cooperates is \py{'CCCCCCC'}, and the genotype for TFT is \py{'CCDCDCD'}.

\index{TFT}

The \py{Agent} class provides \py{copy}, which makes another agent with the same genotype, but with some probability of mutation:

\begin{code}
def copy(self, prob_mutate=0.05):
    if np.random.random() > prob_mutate:
        values = self.values
    else:
        values = self.mutate()
    return Agent(values, self.fitness)
\end{code}

\index{NumPy}
\index{random}
\index{choice}

Mutation works by choosing a random value in the genotype and flipping from \py{'C'} to \py{'D'}, or vice versa:

\begin{code}
def mutate(self):
    values = list(self.values)
    index = np.random.choice(len(values))
    values[index] = 'C' if values[index] == 'D' else 'D'
    return values
\end{code}

Now that we have agents, we need a tournament.


\section{The Tournament}

The \py{Tournament} class encapsulates the details of the PD competition:

\begin{code}
payoffs = {('C', 'C'): (3, 3),
           ('C', 'D'): (0, 5),
           ('D', 'C'): (5, 0),
           ('D', 'D'): (1, 1)}

num_rounds = 6

def play(self, agent1, agent2):
    agent1.reset()
    agent2.reset()

    for i in range(self.num_rounds):
        resp1 = agent1.respond(agent2)
        resp2 = agent2.respond(agent1)

        pay1, pay2 = self.payoffs[resp1, resp2]

        agent1.append(resp1, pay1)
        agent2.append(resp2, pay2)

    return agent1.score, agent2.score
\end{code}

\py{payoffs} is a dictionary that maps from the agents' choices to their rewards. For example, if both agents cooperate, they each get 3 points. If one defects and the other cooperates, the defector gets 5 and the cooperator gets 0.
If they both defect, each gets 1. These are the payoffs Axelrod used in his tournaments.

\index{tournament}

The \py{play} method runs several rounds of the PD game. It uses the following methods from the \py{Agent} class:

\begin{itemize}

\item \py{reset}: Initializes the agents before the first round, resetting their scores and the history of their responses.

\item \py{respond}: Asks each agent for their response, given the opponent's previous responses.

\item \py{append}: Updates each agent by storing the choices and adding up the scores from successive rounds.

\end{itemize}

After the given number of rounds, \py{play} returns the total score for each agent. I chose \py{num_rounds=6} so that each element of the genotype is accessed with roughly the same frequency. The first element is only accessed during the first round, or one sixth of the time. The next two elements are only accessed during the second round, or one twelfth each.
The last four elements are accessed four of six times, or one sixth each, on average.

\index{melee}

\py{Tournament} provides a second method, \py{melee}, that determines which agents compete against each other:

\begin{code}
def melee(self, agents, randomize=True):
    if randomize:
        agents = np.random.permutation(agents)

    n = len(agents)
    i_row = np.arange(n)
    j_row = (i_row + 1) % n

    totals = np.zeros(n)

    for i, j in zip(i_row, j_row):
        agent1, agent2 = agents[i], agents[j]
        score1, score2 = self.play(agent1, agent2)
        totals[i] += score1
        totals[j] += score2

    for i in i_row:
        agents[i].fitness = totals[i] / self.num_rounds / 2
\end{code}

\py{melee} takes a list of agents and a boolean, \py{randomize}, that determines whether each agent fights the same neighbors every time, or whether the pairings are randomized.

\index{NumPy}
\index{arange}

\py{i_row} and \py{j_row} contain the indices of the pairings. \py{totals} contains the total score of each agent.

Inside the loop, we select two agents, invoke \py{play}, and update \py{totals}.
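The pairing scheme defined by \py{i_row} and \py{j_row} arranges the agents in a ring, which is easy to check in a small standalone example:

```python
import numpy as np

n = 5
i_row = np.arange(n)
j_row = (i_row + 1) % n   # each agent plays its right-hand neighbor

# The modulus wraps the last pairing around to the start of the array,
# so every agent appears exactly twice: once as agent1, once as agent2.
pairs = [(int(i), int(j)) for i, j in zip(i_row, j_row)]
print(pairs)   # -> [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```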
At the end, we compute the average number of points each agent got, per round and per opponent, and store the results in the \py{fitness} attribute of each agent.


\section{The Simulation}

The \py{Simulation} class for this chapter is based on the one in Section~\ref{evosim}; the only differences are in \py{__init__} and \py{step}.

\index{simulation}

Here's the \py{__init__} method:

\begin{code}
class PDSimulation(Simulation):

    def __init__(self, tournament, agents):
        self.tournament = tournament
        self.agents = np.asarray(agents)
        self.instruments = []
\end{code}

A \py{Simulation} object contains a \py{Tournament} object, a sequence of agents, and a sequence of \py{Instrument} objects (as in Section~\ref{instrument}).

Here's the \py{step} method:

\begin{code}
def step(self):
    self.tournament.melee(self.agents)
    Simulation.step(self)
\end{code}

This version of \py{step} uses \py{Tournament.melee}, which sets the \py{fitness} attribute for each agent; then it calls the \py{step} method from the \py{Simulation} class, reproduced here:

\begin{code}
# class Simulation

def step(self):
    n = len(self.agents)
    fits = self.get_fitnesses()

    # see who dies
    index_dead = self.choose_dead(fits)
    num_dead = len(index_dead)

    # replace the dead with copies of the living
    replacements = self.choose_replacements(num_dead, fits)
    self.agents[index_dead] = replacements

    # update any instruments
    self.update_instruments()
\end{code}

\py{Simulation.step} collects the agents' fitnesses in an array; then it calls \py{choose_dead} to decide which agents die, and \py{choose_replacements} to decide which agents reproduce.

\index{differential survival}

My simulation includes differential survival, as in Section~\ref{diffsurv}, but not differential reproduction. You can see the details in the notebook for this chapter.
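For comparison, differential reproduction means choosing replacements with probability proportional to fitness; here is a minimal sketch of that idea (not the book's exact code; it assumes each agent provides the \py{copy} method shown earlier):

```python
import numpy as np

def choose_replacements(agents, num_dead, fits):
    # Pick parents with probability proportional to fitness,
    # then copy each one (copies may mutate).
    ps = fits / np.sum(fits)
    parents = np.random.choice(agents, size=num_dead, p=ps)
    return [parent.copy() for parent in parents]
```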
As one of the exercises, you will have a chance to explore the effect of differential reproduction.


\section{Results}

Suppose we start with a population of three agents: one always cooperates, one always defects, and one plays the TFT strategy. If we run \py{Tournament.melee} with this population, the cooperator gets 1.5 points per round, the TFT agent gets 1.9, and the defector gets 3.33. This result suggests that ``always defect'' should quickly become the dominant strategy.

But ``always defect'' contains the seeds of its own destruction. If nicer strategies are driven to extinction, the defectors have no one to take advantage of. Their fitness drops, and they become vulnerable to invasion by cooperators.

\index{equilibrium}

Based on this analysis, it is not easy to predict how the system will behave: will it find a stable equilibrium, or oscillate between various points in the genotype landscape? Let's run the simulation and find out!

I start with 100 identical agents who always defect, and run the simulation for 5000 steps:

\begin{code}
tour = Tournament()
agents = make_identical_agents(100, list('DDDDDDD'))
sim = PDSimulation(tour, agents)
\end{code}


\begin{figure}
\centerline{\includegraphics[height=3in]{figs/chap12-1.pdf}}
\caption{Average fitness (points scored per round of Prisoner's Dilemma). }
\label{chap12-1}
\end{figure}

Figure~\ref{chap12-1} shows mean fitness over time (using the \py{MeanFitness} instrument from Section~\ref{instrument}). Initially mean fitness is 1, because when defectors face each other, they get only 1 point each per round.

After about 500 time steps, mean fitness increases to nearly 3, which
is what cooperators get when they face each other. However, as we
suspected, this situation is unstable.
Over the next 500 steps, mean fitness drops below 2, climbs back toward 3, and continues to oscillate.

The rest of the simulation is highly variable, but with the exception of one big drop, mean fitness is usually between 2 and 3, with the long-term mean close to 2.5.

And that's not bad! It's not quite a utopia of cooperation, which would average 3 points per round, but it's a long way from the dystopia of perpetual defection. And it's a lot better than what we might expect from the natural selection of self-interested agents.

\index{cooperation}

To get some insight into this level of fitness, let's look at a few more instruments. \py{Niceness} measures the fraction of cooperation in the genotypes of the agents after each time step:

\begin{code}
class Niceness(Instrument):

    def update(self, sim):
        responses = np.array([agent.values
                              for agent in sim.agents])
        metric = np.mean(responses == 'C')
        self.metrics.append(metric)
\end{code}

\py{responses} is an array with one row for each agent and one column for each element of the genome. \py{metric} is the fraction of elements that are \py{'C'}, averaged across agents.

\index{NumPy}
\index{mean}

\begin{figure}
\centerline{\includegraphics[height=3in,width=7in]{figs/chap12-2.pdf}}
\caption{Average niceness across all genomes in the population (left), and fraction of population that cooperates in the first round (right).}
\label{chap12-2}
\end{figure}

Figure~\ref{chap12-2} (left) shows the results: starting from 0, average niceness increases quickly to 0.75, then oscillates between 0.4 and 0.85, with a long-term mean near 0.65. Again, that's a lot of niceness!

Looking specifically at the opening move, we can track the fraction of agents that cooperate in the first round.
Here's the instrument:

\begin{code}
class Opening(Instrument):

    def update(self, sim):
        responses = np.array([agent.values[0]
                              for agent in sim.agents])
        metric = np.mean(responses == 'C')
        self.metrics.append(metric)
\end{code}

Figure~\ref{chap12-2} (right) shows the results, which are highly variable. The fraction of agents who cooperate in the first round is often near 1, and occasionally near 0. The long-term average is close to 0.65, similar to overall niceness. These results are consistent with Axelrod's tournaments; in general, nice strategies do well.

The other characteristics Axelrod identifies in successful strategies are retaliation and forgiveness. To measure retaliation, I define this instrument:

\begin{code}
class Retaliating(Instrument):

    def update(self, sim):
        after_d = np.array([agent.values[2::2]
                            for agent in sim.agents])
        after_c = np.array([agent.values[1::2]
                            for agent in sim.agents])
        metric = np.mean(after_d == 'D') - np.mean(after_c == 'D')
        self.metrics.append(metric)
\end{code}

\py{Retaliating} compares the number of elements in all genomes where an agent defects after the opponent defects (elements 2, 4, and 6) with the number of places where an agent defects after the opponent cooperates. As you might expect by now, the results vary substantially (you can see the graph in the notebook). On average the difference between these fractions is less than 0.1, so if agents defect 30\% of the time after the opponent cooperates, they might defect 40\% of the time after a defection.

This result provides weak support for the claim that successful strategies retaliate.
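As a sanity check on this metric, here is a standalone sketch (hypothetical genomes, not the book's \py{Agent} class) using the seven-gene layout in which genes 2, 4, and 6 are the responses to an opponent's defection: a population of pure TFT genomes scores 1, the maximum, and a population of unconditional defectors scores 0, because its defection is not contingent on history.

```python
import numpy as np

def retaliation(genomes):
    # Fraction of 'D' responses after an opponent defection (genes
    # 2, 4, 6) minus the fraction of 'D' responses after an opponent
    # cooperation (genes 1, 3, 5), averaged over the population.
    a = np.array(genomes)
    return np.mean(a[:, 2::2] == 'D') - np.mean(a[:, 1::2] == 'D')

tft_pop = [list('CCDCDCD')] * 100   # pure tit-for-tat genomes
all_d = [list('DDDDDDD')] * 100     # unconditional defectors
print(retaliation(tft_pop))  # 1.0: always and only retaliates
print(retaliation(all_d))    # 0.0: defects regardless of history
```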
But maybe it's not necessary for all agents, or even many, to be retaliatory; if there is at least some tendency toward retaliation in the population as a whole, that might be enough to prevent high-defection strategies from gaining ground.

\index{retaliation}
\index{forgiveness}

To measure forgiveness, I define one more instrument to see whether agents are more likely to cooperate after D-C in the previous two rounds than after C-D. In my simulations, there is no evidence for this particular kind of forgiveness. On the other hand, the strategies in these simulations are necessarily forgiving because they consider only the previous two rounds of history. In this context, forgetting is a kind of forgiving.

\section{Conclusions}

Axelrod's tournaments suggest a possible resolution to the problem of altruism: maybe being nice, but not {\em too} nice, is adaptive. But the strategies in the original tournaments were designed by people, not evolution, and the distribution of strategies did not change over the course of the tournaments.

That raises a question: strategies like TFT might do well in a fixed population of human-designed strategies, but can they evolve?
In other words, can they appear in a population through mutation, compete successfully with their ancestors, and resist invasion by their descendants?

The simulations in this chapter suggest:

\begin{itemize}

\item Populations of defectors are vulnerable to invasion by nicer strategies.

\item Populations that are too nice are vulnerable to invasion by defectors.

\item As a result, the average level of niceness oscillates, but it is generally high, and the average level of fitness is generally closer to a utopia of cooperation than to a dystopia of defection.

\item TFT, which was a successful strategy in Axelrod's tournaments, does not seem to be an especially optimal strategy in an evolving population. In fact, there is probably no stable optimal strategy.

\item Some degree of retaliation may be adaptive, but it might not be necessary for all agents to retaliate. If there is enough retaliation in the population as a whole, that might be enough to prevent invasion by defectors\footnote{And that introduces a whole new topic in game theory, the free-rider problem; see \url{https://thinkcomplex.com/rider}.}.

\end{itemize}

Obviously, the agents in these simulations are simple, and the Prisoner's Dilemma is a highly abstract model of a limited range of social interactions. Nevertheless, the results in this chapter provide some insight into human nature. Maybe our inclinations toward cooperation, retaliation, and forgiveness are innate, at least in part. These characteristics are a result of how our brains are wired, which is controlled by our genes, at least in part.
And maybe our genes build our brains that way because over the history of human evolution, genes for less altruistic brains were less likely to propagate.

\index{altruism}

Maybe that's why selfish genes build altruistic brains.

\section{Exercises}

The code for this chapter is in the Jupyter notebook {\tt chap12.ipynb} in the repository for this book. Open the notebook, read the code, and run the cells. You can use this notebook to work on the following exercises. My solutions are in {\tt chap12soln.ipynb}.

\begin{exercise}

The simulations in this chapter depend on conditions and parameters I chose arbitrarily. As an exercise, I encourage you to explore other conditions to see what effect they have on the results. Here are some suggestions:

\begin{enumerate}

\item Vary the initial conditions: instead of starting with all defectors, see what happens if you start with all cooperators, all TFT, or random agents.

\item In \py{Tournament.melee}, I shuffle the agents at the beginning of each time step, so each agent plays against two randomly chosen agents. What happens if you don't shuffle? In that case, each agent plays against the same neighbors repeatedly, which might make it easier for a minority strategy to invade a majority by taking advantage of locality.

\item Since each agent only plays against two other agents, the outcome of each round is highly variable: an agent that would do well against most other agents might get unlucky during any given round, or the other way around. What happens if you increase the number of opponents each agent plays against during each round?
Or what if an agent's fitness at the end of each step is the average of its current score and its fitness at the end of the previous round?

\item The function I chose for \py{prob_survival} varies from 0.7 to 0.9, so the least fit agent, with \py{p=0.7}, lives for $1/(1-p) = 3.33$ time steps on average, and the most fit agent lives for 10 time steps. What happens if you make the degree of differential survival more or less ``aggressive''?

\item I chose \py{num_rounds=6} so that each element of the genome has roughly the same impact on the outcome of a match. But that is substantially shorter than what Axelrod used in his tournaments. What happens if you increase \py{num_rounds}? Note: if you explore the effect of this parameter, you might want to modify \py{Niceness} to measure the niceness of the last 4 elements of the genome, which will be under more selective pressure as \py{num_rounds} increases.

\item My implementation has differential survival but not differential reproduction. What happens if you add differential reproduction?

\end{enumerate}

\end{exercise}

\begin{exercise}

In my simulations, the population never converges to a state where a majority share the same, presumably optimal, genotype. There are two possible explanations for this outcome: one is that there is no optimal strategy, because whenever the population is dominated by a majority genotype, that condition creates an opportunity for a minority to invade; the other possibility is that the mutation rate is high enough to maintain a diversity of genotypes.

To distinguish between these explanations, try lowering the mutation rate to see what happens. Alternatively, start with a random population and run without mutation until only one genotype survives. Or run with mutation until the system reaches something like a steady state; then turn off mutation and run until there is only one surviving genotype.
What are the characteristics of the genotypes that prevail in these conditions?

\end{exercise}

\begin{exercise}

The agents in my experiment are ``reactive'' in the sense that their choice during each round depends only on what the opponent did during previous rounds. Explore strategies that also take into account the agent's past choices. These strategies can distinguish an opponent who retaliates from an opponent who defects without provocation.

\index{reactive}

\end{exercise}

\appendix

\chapter{Reading list}
\label{reading}

The following are selected books related to topics in this book. Most are written for a non-technical audience.

\begin{itemize}

\item Axelrod, Robert, {\it Complexity of Cooperation}, Princeton University Press, 1997.

\item Axelrod, Robert, {\it The Evolution of Cooperation}, Basic Books, 2006.

\item Bak, Per, {\it How Nature Works}, Copernicus (Springer), 1996.

\item Barab\'{a}si, Albert-L\'{a}szl\'{o}, {\it Linked}, Perseus Books Group, 2002.

\item Buchanan, Mark, {\it Nexus}, W.~W.~Norton \& Company, 2002.

\item Dawkins, Richard, {\it The Selfish Gene}, Oxford University Press, 2016.

\item Epstein, Joshua and Axtell, Robert, {\it Growing Artificial Societies}, Brookings Institution Press \& MIT Press, 1996.

\item Fisher, Len, {\it The Perfect Swarm}, Basic Books, 2009.

\item Flake, Gary William, {\it The Computational Beauty of Nature}, MIT Press, 2000.

\item Goldstein, Rebecca, {\it Incompleteness}, W.~W.~Norton \& Company, 2005.

\item Goodwin, Brian, {\it How the Leopard Changed Its Spots}, Princeton University Press, 2001.

\item Holland, John, {\it Hidden Order}, Basic Books, 1995.

\item Johnson, Steven, {\it Emergence}, Scribner, 2001.

\item Kelly, Kevin, {\it Out of Control}, Basic Books, 2002.

\item
Kosko, Bart, {\it Fuzzy Thinking}, Hyperion, 1993.

\item Levy, Steven, {\it Artificial Life}, Pantheon, 1992.

\item Mandelbrot, Benoit, {\it Fractal Geometry of Nature}, Times Books, 1982.

\item McGrayne, Sharon Bertsch, {\it The Theory That Would Not Die}, Yale University Press, 2011.

\item Mitchell, Melanie, {\it Complexity: A Guided Tour}, Oxford University Press, 2009.

\item Resnick, Mitchell, {\it Turtles, Termites, and Traffic Jams}, Bradford, 1997.

\item Rucker, Rudy, {\it The Lifebox, the Seashell, and the Soul}, Thunder's Mouth Press, 2005.

\item Sawyer, R. Keith, {\it Social Emergence: Societies as Complex Systems}, Cambridge University Press, 2005.

\item Schelling, Thomas, {\it Micromotives and Macrobehavior}, W.~W.~Norton \& Company, 2006.

\item Strogatz, Steven, {\it Sync}, Hachette Books, 2003.

\item Waldrop, M. Mitchell, {\it Complexity: The Emerging Science at the Edge of Order and Chaos}, Simon \& Schuster, 1992.

\item Watts, Duncan, {\it Six Degrees}, W.~W.~Norton \& Company, 2003.

\item Wolfram, Stephen, {\it A New Kind of Science}, Wolfram Media, 2002.

\end{itemize}

\backmatter
\printindex

\afterpage{\blankpage}

\end{document}

\section{Pareto distributions}

The Pareto distribution is named after the economist Vilfredo Pareto, who used it to describe the distribution of wealth; see \url{https://thinkcomplex.com/pareto}.
Since then, people have used it to describe phenomena in the natural and social sciences including sizes of cities and towns, sand particles and meteorites, forest fires and earthquakes.
\index{Pareto distribution}
\index{Pareto, Vilfredo}

The Pareto distribution is characterized by a CDF with the following form:
%
\[ CDF(x) = 1 - \left( \frac{x}{x_m} \right) ^{-\alpha} \]
%
The parameters $x_m$ and $\alpha$ determine the location and shape of the distribution. $x_m$ is the minimum possible quantity.
\index{parameter}

Values from a Pareto distribution often have these properties:

\begin{description}

\item[Long tail:] Pareto distributions contain many small values and a few very large ones.
\index{long tail}

\item[80/20 rule:] The large values in a Pareto distribution are so large that they make up a disproportionate share of the total. In the context of wealth, the 80/20 rule says that 20\% of the people own 80\% of the wealth.
\index{80/20 rule}

\item[Scale free:] Short-tailed distributions are centered around a typical size, which is called a ``scale''. For example, the great majority of adult humans are between 100 and 200 cm in height, so we could say that the scale of human height is a few hundred centimeters. But for heavy-tailed distributions, there is no similar range (bounded by a factor of two) that contains the bulk of the distribution.
So we say that these distributions are ``scale-free''.
\index{scale-free}

\end{description}

To get a sense of the difference between the Pareto and Gaussian distributions, imagine what the world would be like if the distribution of human height were Pareto.

In Pareto World, the shortest person is 100 cm, and the median is 150 cm, so that part of the distribution is not very different from ours.

\index{Pareto World}

But if you generate 6 billion values from this distribution, the tallest person might be 100 km --- that's what it means to be scale-free!

There is a simple visual test that indicates whether an empirical distribution is well characterized by a Pareto distribution: on a log-log scale, the CCDF looks like a straight line. The derivation is similar to what we saw in the previous section.

The equation for the CCDF is:
%
\[ y = 1 - CDF(x) \sim \left( \frac{x}{x_m} \right) ^{-\alpha} \]
%
Taking the log of both sides yields:
%
\[ \log y \sim -\alpha (\log x - \log x_m ) \]
%
So if you plot $\log y$ versus $\log x$, it should look like a straight line with slope $-\alpha$ and intercept $\alpha \log x_m$.
\index{log-log plot}

\begin{exercise}

Write a version of \py{plot_ccdf} that plots the CCDF on a log-log scale.

To test your function, use \py{paretovariate} from the \py{random} module to generate 100 values from a Pareto distribution.
Plot the CCDF on a log-log scale and see if it falls on a straight line. What happens to the curve as you increase the number of values?
\index{random module@\py{random} module}

\end{exercise}

\section{Pareto and the power law}

Starting with a power-law distribution, we have:
%
\[ P(k) \sim k^{- \gamma} \]
%
If we choose a random node in a scale-free network, $P(k)$ is the probability that its degree equals $k$.
\index{degree}

The cumulative distribution function, $CDF(k)$, is the probability that the degree is less than or equal to $k$, so we can get that by summation:
%
\[ CDF(k) = \sum_{i=0}^k P(i) \]
%
For large values of $k$ we can approximate the summation with an integral:
%
\[ \sum_{i=1}^k i^{- \gamma} \sim \int_{1}^k i^{- \gamma} \,di =
\frac{1}{\gamma -1} (1 - k^{-\gamma + 1}) \]
%
To make this a proper CDF we could normalize it so that it goes to 1 as $k$ goes to infinity, but that's not necessary, because all we need to know is:
%
\[ CDF(k) \sim 1 - k^{-\gamma + 1} \]
%
which shows that the distribution of $k$ is asymptotic to a Pareto distribution with $\alpha = \gamma - 1$.

\section{Continuous distributions}

The distributions we have seen so far are sometimes called {\bf empirical distributions} because they are based on a dataset that comes from some kind of empirical observation.
\index{empirical distribution}

An alternative is a {\bf continuous distribution}, which is characterized by a CDF that is a continuous function. Some of these distributions, like the Gaussian or normal distribution, are well known, at least to people who have studied statistics.
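Before moving on, the integral approximation from the previous section can be checked numerically. This is a standalone sketch ($\gamma = 2.5$ is arbitrary, and the infinite sum is truncated at 200{,}000 terms): the tail sum $\sum_{i>k} i^{-\gamma}$, which corresponds to $1 - CDF(k)$ up to normalization, should behave like $k^{-\gamma+1}/(\gamma-1)$.

```python
# Numerical check (gamma = 2.5 chosen arbitrarily): the tail sum
# beyond k should approach k**(-gamma + 1) / (gamma - 1).
gamma = 2.5
terms = [i ** -gamma for i in range(1, 200001)]
total = sum(terms)

def tail(k):
    """Approximate sum of i**-gamma for i > k (truncated at 200000)."""
    return total - sum(terms[:k])

for k in (10, 100, 1000):
    # Ratio should approach 1 / (gamma - 1) ~= 0.667 as k grows.
    print(k, round(tail(k) * k ** (gamma - 1), 3))
```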
Many real-world phenomena can be approximated by continuous distributions, which is why they are useful.
\index{continuous distribution}
\index{normal distribution}
\index{Gaussian distribution}

For example, if you observe a mass of radioactive material with an instrument that can detect decay events, the distribution of times between events will most likely fit an exponential distribution. The same is true for any series of events where an event is equally likely at any time.
\index{exponential distribution}

The CDF of the exponential distribution is:

\[ CDF(x) = 1 - e^{-\lambda x} \]

The parameter, $\lambda$, determines the mean and variance of the distribution. This equation can be used to derive a simple visual test for whether a dataset can be well approximated by an exponential distribution. All you have to do is plot the {\bf complementary distribution} on a log-$y$ scale.
\index{parameter}
\index{complementary distribution}

The complementary distribution (CCDF) is just $1 - CDF(x)$; if you plot the complementary distribution of a dataset that you think is exponential, you expect to see a function like:

\[ y = 1 - CDF(x) \sim e^{-\lambda x} \]

If you take the log of both sides of this equation, you get:
%
\[ \log y \sim -\lambda x \]
%
So on a log-$y$ scale the CCDF should look like a straight line with slope $-\lambda$.

\begin{exercise}

Write a function called \py{plot_ccdf} that takes a list of values and the corresponding list of probabilities and plots the CCDF on a log-$y$ scale.

To test your function, use \py{expovariate} from the \py{random} module to generate 100 values from an exponential distribution.
Plot the CCDF on a log-$y$ scale and see if it falls on a straight line.
\index{random module@\py{random} module}

\end{exercise}

Leftover exercises:

Connecting grids and networks:

\begin{exercise}
Implement percolation on a network.
\end{exercise}

\begin{exercise}
Check whether the clusters in Schelling's model are connected.
\end{exercise}

%\begin{figure}
%\centerline{\includegraphics[height=1.75in]{figs/array.pdf}}
%\caption{A list of lists (left) and a NumPy array (right).\label{fig.array}}
%\end{figure}

Figure~\ref{fig.array} shows why. The diagram on the left shows a list of lists of integers; each dot represents a reference, which takes up 4--8 bytes. To access one of the integers, you have to follow two references.

\index{NumPy}
\index{array}
\index{nested list}
\index{reference}

The diagram on the right shows an array of the same integers. Because the elements are all the same size, they can be stored contiguously in memory. This arrangement saves space because it doesn't use references, and it saves time because the location of an element can be computed directly from the indices; there is no need to follow a series of references.
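The space difference is easy to observe. Here is a small standalone sketch (not from the book) that compares the memory footprint of a nested list with the equivalent NumPy array; the comparison understates the gap, because it counts only the lists' reference storage and ignores the integer objects the references point to:

```python
import sys
import numpy as np

nested = [[1, 2, 3, 4] for _ in range(1000)]  # list of lists
arr = np.array(nested)                        # contiguous array

# Bytes for the outer list plus each row's reference storage; the
# int objects themselves would add even more.
list_bytes = sys.getsizeof(nested) + sum(sys.getsizeof(row) for row in nested)

print(list_bytes, arr.nbytes)  # the array buffer is several times smaller
```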